Study Finds Flaws in Safe AI Picture Generation Efforts

Researchers from the NYU Tandon School of Engineering have found significant flaws in recent efforts to make text-to-image AI safer. Their findings will be presented in Vienna at the Twelfth International Conference on Learning Representations (ICLR) in May 2024.

They focused on models such as Stable Diffusion, which can generate highly realistic images from text prompts. The findings are also available on the arXiv preprint server, with examples on GitHub.

Text-to-image AI has become very popular because it can produce almost any picture a user can imagine. However, this has raised concerns about people creating fake, offensive, or illegal images, such as fabricated photos of celebrities or images that infringe copyright.

The NYU researchers examined seven recently proposed methods intended to stop these models from generating such images. They found that all of them could be easily circumvented.

The team discovered that by altering how the model interprets the words in a prompt, they could still get it to produce images it was supposed to have "forgotten," including hate symbols, trademarked products, and famous people's faces.

These methods appeared to block certain inputs rather than truly removing the model's ability to generate the images. As a result, someone could still use these tricks to produce harmful or illegal images with AI models that were supposed to be safe.

The research shows that patching the problem after a model has been built isn't enough. To genuinely stop AI from producing harmful images, changes need to be made to how the model learns in the first place. The study suggests it is very hard, if not impossible, to make an AI forget specific things once it has learned them, such as what a famous person looks like.
