As if there weren't enough threats in the form of malware and cybercrime, the explosion of generative artificial intelligence brings new ones. The last one is called Morris-II and it is a First generation AI worm which can steal data and spread malware and spam through applications that connect to artificial intelligence models such as GPT-4 or Gemini Pro to offer a certain service. Its name refers to Morristhe first worm to spread over the Internet in 1988, and was designed by a group of researchers from Cornell Tech, the Israel Institute of Technology and Intuit.
The study carried out alerts the dangers that ecosystems carry from agents of artificial intelligence that are being developed around large language models (LLM) such as those from OpenAI or Google.
These ecosystems consist of interconnected networks consisting of clients (in this context, synonymous with program) powered by generative AI (for example, an AI assistant for writing emails) that interact with generative AI services (such as GPT) to process data and communicate with other AI clients generative in the ecosystem.
Unlike viruses, which need a host or program or file that hosts them and to which they are attached to spread, a worm does not. These exploit weaknesses in operating systems, network protocols or applications to copy themselves and spread from one computer to another autonomously.
Morris-II uses to attack what researchers call self-replicating adversarial prompt. Unlike a standard prompt, the request that the user makes to the AI, which returns data in response, the self-replicating adversarial prompt causes the AI to generate another prompt.
The study shows how attackers can encode these types of prompts in both images and text. When the AI processes it, regenerates the malicious prompt so that it continues replicating. Additionally, these instructions can cause other malicious activities in the AI such as distribute spam and propaganda, leak personal data and generate toxic content.
One type of application vulnerable to Worm-II is the one that uses RAG (Recovery-Augmented Generation) to enrich your generative AI queries with context and update your database with new content. With an application of this type, returning to the example of an AI assistant for writing emails, attackers can create an email with a textual self-replicating adversarial prompt that poison RAG-based wizard database.
In this demo, when the message is delivered to the assistant, added to the query, and sent to ChatGPT either Gemini Procircumvents the safeguards of the generative AI service, forces it to replicate the prompt and filters the user's confidential data provided in the query.
The generated response, which contains the user's sensitive data, is subsequently keeps infecting new hosts when used to reply to email sent to a new client and then stored in each client's database. And so on.
Another type of Morris-II attack that the researchers demonstrate is how attackers can create an email with a self-replicating adversarial prompt. embedded in an image. This hidden malicious prompt causes the AI Mail Assistant to forward the image to new contacts, so that any image, with propaganda or spam, continues to be forwarded once the first one has been sent. Vulnerable to this attack are applications that use AI to determine what the next task is.
Before publishing the study, the researchers reported their findings to Google and OpenAI. A spokesperson for the latter told Wired that “they seem to have found a way to exploit fast injection type vulnerabilities relying on user input that has not been verified or filtered” and assured that the company is working to make its systems more secure.