In a groundbreaking revelation, security researchers have created an AI worm capable of spreading between generative AI agents. This development could potentially lead to data theft and spam email distribution, highlighting the evolving risks associated with connected, autonomous AI ecosystems.
As AI systems like OpenAI’s ChatGPT and Google’s Gemini become more sophisticated, they are increasingly being used to automate tasks such as calendar bookings and product purchases. However, this increased autonomy also exposes them to new forms of cyberattacks.
A team of researchers, including Ben Nassi from Cornell Tech, has developed one of the first generative AI worms, named Morris II, in a nod to the original Morris computer worm that disrupted the internet in 1988. The worm can attack a generative AI email assistant, stealing data from emails and sending spam messages, thereby breaching some security protections in ChatGPT and Gemini.
The research was conducted in test environments and not against any publicly available email assistant. It comes at a time when large language models (LLMs) are becoming multimodal, capable of generating images and video besides text. While generative AI worms have not been spotted in the wild yet, multiple researchers agree that they pose a significant security risk.
Most generative AI systems operate by receiving prompts, which can be manipulated to weaponize the system. For instance, jailbreaks can make a system disregard its safety rules, leading to the production of toxic or hateful content. Similarly, prompt injection attacks can secretly instruct a chatbot to act against its intended purpose.
The researchers used an “adversarial self-replicating prompt” to create the generative AI worm. This prompt triggers the AI model to output another prompt, essentially instructing the AI system to produce a set of further instructions in its replies. This method is akin to traditional SQL injection and buffer overflow attacks.
The researchers demonstrated the worm’s capabilities by creating an email system that could send and receive messages using generative AI, plugging into ChatGPT, Gemini, and open source LLM, LLaVA. They then exploited the system using a text-based self-replicating prompt and by embedding a self-replicating prompt within an image file.
In one instance, the researchers sent an email with an adversarial text prompt, which “poisons” the database of an email assistant. When the email is retrieved and sent to GPT-4 or Gemini Pro to create a response, it “jailbreaks the GenAI service” and steals data from the emails.
In another method, an image with a malicious prompt embedded makes the email assistant forward the message to others. This could potentially lead to the spread of spam, abuse material, or even propaganda.
The researchers reported their findings to Google and OpenAI, highlighting the “bad architecture design” within the wider AI ecosystem. While the research breaks some safety measures of ChatGPT and Gemini, it serves as a warning about potential vulnerabilities in AI systems.
Although the demonstration took place in a controlled environment, security experts believe that the future risk of generative AI worms is a serious concern. This is particularly true when AI applications are given permission to take actions on someone’s behalf and when they are linked to other AI agents to complete tasks.
In a paper covering their findings, the researchers predict that generative AI worms could appear in the wild within the next two to three years. As AI ecosystems continue to evolve, so too will the threats they face. Therefore, developers must be vigilant and proactive in addressing these emerging risks.
Source: ComPromized
Grow your business with AI. Be an AI expert at your company in 5 mins per week with this Free AI Newsletter