The new ChatGPT image generator solves one of the biggest problems in AI imagery: text in the image

OpenAI has updated ChatGPT with a new image generator that improves on DALL-E 3 and is native to the GPT-4o language model, instead of being a separate model that the chatbot calls on to create images. The most striking thing is that it solves, or at least significantly improves, two of the usual limitations of generative AI: the correlation between different objects and the rendering of text.

The market abounds with image models, which generally produce something closer to illustrations than to what we understand as photographs, often with notable results. However, they all have difficulties generating images that include text, logos, and other common elements of everyday life.

OpenAI states that the new image generation with GPT-4o solves these limitations, since it can render text accurately and follow the user's prompts more precisely thanks to the use of its knowledge base and the context of the chat. In addition, the new model can modify images uploaded by the user or create new ones using an uploaded image as initial inspiration.

Image created by ChatGPT. Credit: OpenAI

The other area in which the Create an image function stands out is the correlation, or link, between multiple elements in an image. As OpenAI spokeswoman Taya Christianson told The Verge, most models struggle when asked to create a specific series of objects in an image, mixing up colors and shapes once there are more than 5 to 8 of them. GPT-4o can now maintain the correct attributes for up to 15 to 20 objects without getting confused.

Better text rendering and correlation between elements with the new ChatGPT image generator. Credit: OpenAI

This GPT-4o image generation model is already rolling out to all ChatGPT Plus, Pro, and Team users, and will soon reach free accounts. In the latter case, the usage limit will be the same as with DALL-E, about 3 images a day, also depending on demand.

GPT-4o thus becomes the default image generator in ChatGPT instead of DALL-E 3, and it lets users customize images by specifying the aspect ratio, exact colors via hexadecimal codes, or a transparent background. OpenAI also plans to bring the new model to ChatGPT Enterprise and Edu users in the coming weeks.

The new model is also available in Sora for creating images, while DALL-E remains accessible through a dedicated DALL-E GPT. For developers, image generation through the GPT-4o API will be rolled out in the coming weeks.

ChatGPT’s limitations when creating images

It is not really a limitation, but processing time can stretch to as much as a minute because the model creates more detailed images. The limitations that OpenAI has identified and plans to correct in the coming months are:

  • It can crop long images, such as posters, too tightly, especially at the bottom.
  • Image generation can invent information, especially with low-context prompts.
  • When generating images based on its knowledge base, it can have difficulty representing more than 10 to 20 different concepts at the same time, such as a complete periodic table.
  • It sometimes has problems rendering non-Latin languages, showing incorrect or invented characters, especially in more complex cases.
  • Requests to edit specific parts of an image, such as correcting typos, are not always effective and can modify other areas unintentionally or introduce new errors.
  • The model has difficulty representing detailed information at very small sizes.

All images generated with this new model will include C2PA metadata, and an internal OpenAI tool can verify whether an image was generated using this model.