AI that 'reasons' is here: OpenAI launches a new model that you can now use in ChatGPT

Strawberry, OpenAI’s rumored reasoning-focused artificial intelligence, is now a reality. OpenAI launched it yesterday under the name OpenAI o1, as a new option among the language models available to paying ChatGPT users. Sam Altman’s company has trained this model to solve problems on its own using a technique known as reinforcement learning, which teaches the system through rewards and penalties, and what OpenAI calls ‘chain of thought’, which lets it process queries much as humans process problems, reviewing them step by step. The result is longer waiting times for answers, but in exchange o1 ‘can reason through complex tasks and solve more difficult problems than previous models in science, coding and math’.

As the company explains on its blog, ‘we train these models to spend more time thinking about problems before responding, as a person would. Through training, they learn to hone their thinking process, try different strategies and acknowledge their mistakes’. The plural is used because o1, which is available in preview and will receive regular updates, is accompanied by a second language model that is also already available in ChatGPT Plus: o1 Mini, a lighter, faster and cheaper version of o1.

An AI with fewer hallucinations and that performs like a PhD student

One benefit that users will notice is that OpenAI o1 hallucinates less than previous models such as GPT-4o. ‘We have noticed that this model hallucinates less,’ Jerry Tworek, head of research at OpenAI, told The Verge, but ‘we can’t say we solved the hallucinations.’

According to tests carried out by OpenAI, the new model performs similarly to a PhD student on physics, chemistry and biology tasks. It also stands out in mathematics and programming. GPT-4o, which until yesterday was the company’s most advanced model, solved 13% of the problems in a qualifying exam for the International Mathematical Olympiad, while OpenAI o1 reaches 83%. In programming, it reached the 89th percentile in Codeforces competitions.

An alternative to GPT-4o, not a replacement

However, OpenAI does not present o1, which has been trained on a different dataset, as a substitute for GPT-4o, but rather as an alternative. It is recommended for tasks in the fields mentioned, but not so much for other, more common ones. GPT-4o is better at text tasks and, above all, faster. In addition, the new model lacks capabilities that GPT-4o does have, such as browsing the Internet to search for information or accepting images and files uploaded to the chatbot.

So why all the fuss about its reasoning capabilities? What sets o1 apart is its ability to carry out complex, multi-step processes, and that is what makes it take longer to prepare its answers. Depending on the complexity, this can range from a few seconds to more than a minute for truly demanding questions. During this process, the AI reviews the task several times, discarding erroneous results until it finds one that it considers free of errors. This is what OpenAI means when it says that what is essentially a word-prediction model has ‘advanced reasoning capabilities’ and equates them to those of a human.

Harder to break

The company also claims that this new artificial intelligence model is harder to ‘break’, a practice known as ‘jailbreaking’ that consists of getting it to behave differently than OpenAI intends. In one particular jailbreak test in which GPT-4o scores 22 on a scale of 0 to 100, o1 reaches 84.

This improvement comes from the fact that OpenAI o1 ‘can reason about our security policies in context when responding to potentially insecure requests’ and ‘apply them more effectively’. For OpenAI, ‘training models to incorporate a chain of thought before responding has the potential to unlock substantial benefits, while increasing the potential risks that come with increased intelligence’.

More expensive for developers and with usage limits for users

OpenAI o1 is available to ChatGPT Plus and Team users and will reach Enterprise and Edu users next week. It has a usage limit, set at 30 messages per week for o1 and 50 for o1 Mini. OpenAI plans to expand these limits and is also working on having ChatGPT automatically choose the language model best suited to the task at hand, rather than the user selecting it manually.

The new model is not cheap. Prices for using the API, which allows other developers to use the AI in their products, are noticeably higher than for previous models: $15 per million input tokens (the text fragments analyzed by the model) and $60 per million output tokens. GPT-4o costs $5 and $15 respectively. This is where o1 Mini comes in: not as capable, but cheaper and faster. OpenAI recommends it especially for programming tasks and plans to bring it to free ChatGPT users as well, although there is no date yet.
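
To put those API prices in perspective, here is a minimal sketch in Python that compares what a single request would cost on each model. The prices are the ones quoted above. The request size (2,000 input tokens and 1,000 output tokens), the function name `request_cost` and the dictionary keys are assumptions made for illustration only, not figures or identifiers from OpenAI.

```python
# Prices in USD per million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not official API model names.
PRICES_PER_MILLION = {
    "o1": {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the quoted per-million prices."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request size, chosen only to make the arithmetic concrete.
INPUT_TOKENS = 2_000
OUTPUT_TOKENS = 1_000

o1_cost = request_cost("o1", INPUT_TOKENS, OUTPUT_TOKENS)
gpt4o_cost = request_cost("gpt-4o", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"o1:     ${o1_cost:.4f}")
print(f"GPT-4o: ${gpt4o_cost:.4f}")
print(f"o1 costs {o1_cost / gpt4o_cost:.1f}x as much for this request")
```

For this made-up request, o1 comes to $0.09 against $0.025 for GPT-4o, or 3.6 times as much. The exact ratio depends on the mix of input and output tokens: it tends toward 3x for input-heavy requests and 4x for output-heavy ones.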