ChatGPT and Other LLMs Can Be Tricked into Giving Dangerous Advice

Researchers at AWS AI Labs have found that many Large Language Models (LLMs) can be easily manipulated into giving users harmful information. In their study, the researchers found that LLMs such as ChatGPT can be coaxed into producing content their developers have tried to prohibit, including instructions for violence or bomb-making. Some users also exploit these LLMs to generate hateful text that is then used to harass people online.

As complaints about these behaviors reached developers, they added safeguards intended to stop LLMs from answering questions that are dangerous, illegal, or harmful. This study, however, found that those protections are not enough to make the models safe for everyone. Some users have turned to audio prompts to manipulate LLMs, and when the AWS researchers questioned the models using audio themselves, they found that audio inputs can slip past the restrictions developers have imposed.

The researchers tested many LLMs by feeding them audio versions of harmful queries, and found that the models ignored the problematic nature of the questions and answered anyway. They concluded that developers should consider adding random noise to audio sent to LLMs, which could disrupt these attacks and keep the models from bypassing their built-in safeguards.
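The noise-injection defense mentioned above can be illustrated with a small sketch. The snippet below is purely illustrative, not the researchers' actual implementation: it assumes a normalized audio waveform as a NumPy array and adds low-amplitude Gaussian noise to it before the audio would be passed to a model, the idea being that random perturbations can break carefully crafted adversarial audio.

```python
import numpy as np

def add_defensive_noise(waveform, noise_level=0.01, seed=None):
    """Return a copy of a normalized waveform with Gaussian noise added.

    Hypothetical helper for illustration: small random noise is mixed
    into the audio so that any adversarial perturbation hidden in it
    is disrupted, while the speech itself stays intelligible.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_level, size=waveform.shape)
    # Keep samples within the valid [-1, 1] range for normalized audio.
    return np.clip(waveform + noise, -1.0, 1.0)

# Example: a 1-second 440 Hz tone sampled at 16 kHz stands in for speech.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = add_defensive_noise(clean, noise_level=0.01, seed=0)
```

The `noise_level` here is an arbitrary choice; in practice it would have to be tuned so the noise defeats adversarial audio without degrading legitimate speech recognition.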

Image: DIW-Aigen
