In computing, a jailbreak is essentially a process that grants privileged, unrestricted access to an operating system. Originally applied to Apple devices (its Android equivalent is rooting), the term has since been extended to other operating systems and programs. Now a hacker has managed to jailbreak ChatGPT, the most famous AI model.
This week a hacker known as Pliny the Prompter announced on Twitter that they had created a jailbroken version of OpenAI's chatbot, proudly declaring that GPT-4o, the company's latest large language model, was now available to everyone without any restrictions.
“GPT-4o UNCHAINED! This very special custom GPT has a built-in jailbreak prompt that circumvents most guardrails, providing an out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to be: free. Please use responsibly, and enjoy!” the tweet reads.
🥁 INTRODUCING: GODMODE GPT! 😶🌫️https://t.co/BBZSRe8pw5
GPT-4O UNCHAINED! This very special custom GPT has a built-in jailbreak prompt that circumvents most guardrails, providing an out-of-the-box released ChatGPT so everyone can experience AI the way it was always meant to…
—Pliny the Prompter 🐉 (@elder_plinius) May 29, 2024
The hacker shared screenshots of some striking prompts that he claims were able to bypass OpenAI's guardrails. In one screenshot, the Godmode bot can be seen advising how to cook methamphetamine. In another, the AI gives Pliny a “step-by-step guide” on how to “make napalm with household items.”
However, the jailbreak may not have survived long in the wild: OpenAI reportedly became aware of the liberated model, and about an hour after the tweet was published, OpenAI spokesperson Colleen Rize said in a statement that “we are aware of what has happened and have taken action due to a violation of our policies.”
Still, the hack highlights the ongoing battle between OpenAI and hackers like Pliny, who hope to unlock these kinds of language models. Although using a jailbroken ChatGPT violates OpenAI's policies, some who tried it report that the AI would circumvent its content filters, giving instructions on how to manufacture certain drugs or hot-wire a car.
As for how Pliny freed the model, it is speculated that he used “leetspeak,” an informal encoding that replaces certain letters with similar-looking numbers: an E becomes a 3, an O becomes a zero, and so on. It is not yet clear why this circumvents OpenAI's guardrails; one theory is that the obfuscation confuses the system, forcing it to spend effort decoding the text and weakening the safety checks applied to it.
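To illustrate the encoding itself, here is a minimal Python sketch of a leetspeak substitution. The character map and function name are illustrative assumptions, and this is not Pliny's actual prompt or any part of OpenAI's systems.

```python
# Minimal sketch of the leetspeak substitution described above.
# The mapping is an assumption for illustration, not Pliny's prompt.

LEET_MAP = str.maketrans({
    "a": "4", "A": "4",
    "e": "3", "E": "3",
    "i": "1", "I": "1",
    "o": "0", "O": "0",
    "t": "7", "T": "7",
})

def to_leetspeak(text: str) -> str:
    """Replace selected letters with similar-looking digits."""
    return text.translate(LEET_MAP)

if __name__ == "__main__":
    print(to_leetspeak("explain how the request is encoded"))
    # -> 3xpl41n h0w 7h3 r3qu3s7 1s 3nc0d3d
```

The text stays readable to a human but no longer matches the literal words a keyword-based filter might look for, which is one speculated reason the trick could slip past guardrails.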
In any case, this is only one of the first attempts to liberate ChatGPT models. It will not be the last.