PhotoGuard: A Tool to Prevent AI Image Manipulation

Category Artificial Intelligence

tldr #

MIT researchers have created a tool called PhotoGuard which prevents AI image manipulation. It does so by encoding images with imperceptible secret signals that alter how the AI model interprets them, or by disrupting the way the models generate images. This tool can be used to protect against malicious AI image manipulation, such as nonconsensual deepfake pornography.

content #

Remember that selfie you posted last week? There’s currently nothing stopping someone from taking it and editing it using powerful generative AI systems. Even worse, thanks to the sophistication of these systems, it might be impossible to prove that the resulting image is fake.

The good news is that a new tool, created by researchers at MIT, could prevent this. The tool, called PhotoGuard, works like a protective shield by altering photos in tiny ways that are invisible to the human eye but prevent them from being manipulated. If someone tries to use an editing app based on a generative AI model such as Stable Diffusion to manipulate an image that has been "immunized" by PhotoGuard, the result will look unrealistic or warped.

Right now, "anyone can take our image, modify it however they want, put us in very bad-looking situations, and blackmail us," says Hadi Salman, a PhD researcher at MIT who contributed to the research. It was presented at the International Conference on Machine Learning this week.

PhotoGuard is "an attempt to solve the problem of our images being manipulated maliciously by these models," says Salman. The tool could, for example, help prevent women’s selfies from being made into nonconsensual deepfake pornography.

The MIT team used two different techniques to stop images from being edited using the open-source image generation model Stable Diffusion.

PhotoGuard works by encoding an image with secret signals that alter how it is processed by an AI model

The first technique is called an encoder attack. PhotoGuard adds imperceptible signals to the image so that the AI model interprets it as something else. For example, these signals could cause the AI to categorize an image of, say, Trevor Noah as a block of pure gray. As a result, any attempt to use Stable Diffusion to edit Noah into other situations would look unconvincing.
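The idea behind the encoder attack can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the "encoder" here is a toy linear map with random weights rather than Stable Diffusion's real encoder, and the names `encode`, `latent_gap`, `immunize`, `eps`, and `step` are hypothetical, not part of PhotoGuard. The sketch searches for a small per-pixel change (at most `eps`) that pulls the photo's internal representation toward that of a plain gray image.

```python
import random

def encode(weights, pixels):
    """Toy linear stand-in for an image encoder: pixels -> latent vector."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

def latent_gap(weights, a, b):
    """Squared distance between the latent vectors of two images."""
    return sum((u - v) ** 2 for u, v in zip(encode(weights, a), encode(weights, b)))

def immunize(weights, image, target, eps=0.05, step=0.01, iters=100):
    """Search for a perturbation with |delta_i| <= eps that moves the image's
    latent toward the target's latent (signed-gradient steps with projection).
    Returns the best perturbed image seen during the search."""
    z_target = encode(weights, target)
    delta = [0.0] * len(image)
    best, best_gap = list(image), latent_gap(weights, image, target)
    for _ in range(iters):
        adv = [p + d for p, d in zip(image, delta)]
        residual = [u - v for u, v in zip(encode(weights, adv), z_target)]
        for i in range(len(image)):
            # Gradient of ||W(x + delta) - W t||^2 with respect to delta_i
            grad = 2.0 * sum(row[i] * r for row, r in zip(weights, residual))
            d = delta[i] - step * ((grad > 0) - (grad < 0))
            d = max(-eps, min(eps, d))                    # stay within the budget
            d = max(-image[i], min(1.0 - image[i], d))    # keep the pixel in [0, 1]
            delta[i] = d
        candidate = [p + d for p, d in zip(image, delta)]
        gap = latent_gap(weights, candidate, target)
        if gap < best_gap:
            best, best_gap = candidate, gap
    return best

# Hypothetical demo: a 16-pixel "photo", a 4-dimensional latent, a gray target.
rng = random.Random(0)
weights = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(4)]
photo = [rng.random() for _ in range(16)]
gray = [0.5] * 16
protected = immunize(weights, photo, gray)
print("latent gap before:", latent_gap(weights, photo, gray))
print("latent gap after: ", latent_gap(weights, protected, gray))
print("largest pixel change:", max(abs(a - b) for a, b in zip(photo, protected)))
```

The pixel budget `eps` is what keeps the change hard to see, while the shrinking latent gap is what makes the model "read" the photo as something closer to gray. A real attack would compute gradients through the actual encoder network instead of a fixed matrix.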

The second, more effective technique is called a diffusion attack. It disrupts the way the AI models generate images, essentially by encoding them with secret signals that alter how they’re processed by the model. By adding these signals to an image of Trevor Noah, the team managed to manipulate the diffusion model to ignore its prompt and generate the image the researchers wanted. As a result, any AI-edited images of Noah would just look gray.
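One way to see the difference between the two techniques is to write each as an optimization problem. The notation below is assumed for illustration and does not come from the article: x is the original photo, δ the added signal, ε the limit that keeps δ imperceptible, E the model's image encoder, f the model's full editing process, and x_target the gray target image.

```latex
\delta_{\text{encoder}}
  = \arg\min_{\|\delta\|_\infty \le \epsilon}
    \left\| E(x + \delta) - E(x_{\text{target}}) \right\|_2^2
\qquad
\delta_{\text{diffusion}}
  = \arg\min_{\|\delta\|_\infty \le \epsilon}
    \left\| f(x + \delta) - x_{\text{target}} \right\|_2^2
```

In this reading, the encoder attack only targets the model's first step, while the diffusion attack targets the final edited output, which is consistent with the article's description of it as the more effective of the two.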

The tool is able to successfully prevent AI manipulation of an image even when the stimulation structure changes

Tools like PhotoGuard change the economics and incentives for attackers by making it more difficult to use AI in malicious ways, says Emily Wenger, a research scientist at Meta, who also worked on Glaze and has developed methods to prevent facial recognition.

"The higher the bar is, the fewer the people willing or able to overcome it," Wenger says.

A challenge will be to see how this technique transfers to other models out there, Zhao says. The researchers have published a demo online that allows people to immunize their own photos, but for now it works reliably only on Stable Diffusion.

The research team has also created a demo online that allows users to immunize their own photos with PhotoGuard

In theory, people could apply this protective shield to their images before they upload them online, says Aleksander Madry, a professor at MIT who contributed to the research. But a more effective approach would be for tech companies to automatically add it to images that people upload to their platforms, he adds.

It’s an arms race, however. While they won’t go into details, the team is working on refining the technique so that it’s harder for attackers to break it.

Tech companies could also apply PhotoGuard automatically to images that people upload to their platforms
