The Chinese start-up DeepSeek, which this week burst into the club of artificial intelligence companies like a bull in a china shop, has presented a new artificial intelligence model that understands and generates images, called Janus-Pro. According to the company itself, Janus-Pro also measures up to the biggest names and surpasses models such as OpenAI's DALL-E and Stability AI's Stable Diffusion, among others.
Janus-Pro is an update of Janus, launched by DeepSeek at the end of last year. It can be considered a family of models, since it is available for download in two sizes, with 1 billion and 7 billion parameters. As a rule, the more parameters a model has, the more refined its answers, but at the cost of requiring more computing power. For reference, OpenAI has never revealed the parameter counts of DALL-E 2 or DALL-E 3, its latest update, but the original DALL-E had 12 billion.
The Chinese model has an MIT license, so in addition to being open source it can be modified and used commercially without restrictions, provided that the original copyright notice is maintained. Anyone who uses it must also accept a DeepSeek license that prohibits its use for military purposes or for misinformation. Janus-Pro is available on the Hugging Face and GitHub platforms.
The company explained that Janus-Pro is based on an ‘autoregressive framework’ that separates the visual encoding processes for interpretation and generation, while maintaining a unified transformer architecture for processing. This ‘not only alleviates the conflict between the roles of the visual encoder in understanding and generation, but also improves the flexibility of the framework.’
DeepSeek says that in the GenEval and DPG-Bench benchmarks for evaluating artificial intelligence, the largest Janus-Pro model, Janus-Pro-7B, surpasses DALL-E 3, as well as models such as PixArt-alpha, Emu3-Gen and Stable Diffusion XL. Although it has limitations, such as the fact that the images it understands and generates have a maximum resolution of 384 x 384 pixels, the result is remarkable and stands out for the level of detail in the images, as can be seen in those that accompany the Janus-Pro documentation.
‘Janus-Pro surpasses the previous unified model and matches or exceeds the performance of task-specific models’, DeepSeek explains in a post on Hugging Face. ‘The simplicity, high flexibility and effectiveness of Janus-Pro make it a strong candidate for next-generation multimodal models.’
The earthquake DeepSeek has caused stems from the low computing cost it needed to achieve results comparable to those of Silicon Valley companies, which has called into question whether the multibillion-dollar investments they make are the only way to win the artificial intelligence race.
However, in this case DeepSeek has made no specific reference to the economic cost of training. It does state in the documentation that the smallest model was trained for 9 days using 128 Nvidia A100 graphics cards, and the largest model, with 7 billion parameters, for 14 days using 256 A100s.
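To put those figures on a common scale, the reported runs can be converted into GPU-days and GPU-hours. The following is only a back-of-the-envelope sketch in Python: it assumes every card was busy for the full reported duration, the labels and function names are illustrative, and no monetary cost is computed because DeepSeek has not published one.

```python
# Training runs as reported in the Janus-Pro documentation, per the article.
# Assumption: all GPUs were in use for the whole reported period.
RUNS = {
    "smaller model (1 billion parameters)": {"gpus": 128, "days": 9},
    "larger model (7 billion parameters)": {"gpus": 256, "days": 14},
}


def gpu_days(gpus: int, days: int) -> int:
    """Total GPU-days: number of cards times days of training."""
    return gpus * days


def gpu_hours(gpus: int, days: int) -> int:
    """Total GPU-hours, taking 24 hours per day."""
    return gpu_days(gpus, days) * 24


for label, run in RUNS.items():
    total_days = gpu_days(run["gpus"], run["days"])
    total_hours = gpu_hours(run["gpus"], run["days"])
    print(f"{label}: {total_days} GPU-days ({total_hours} GPU-hours)")
```

Under these assumptions the smaller model comes to 1,152 A100 GPU-days and the larger to 3,584, roughly three times as much.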
Nvidia, however, has been barred since August 2022 from exporting its most advanced AI processors, among them the A100 GPU, to China. According to The Verge, DeepSeek, founded in 2023, emerged from a hedge fund founded by engineers from Zhejiang University, and the company's current CEO, Liang Wenfeng, acquired thousands of Nvidia GPUs before the ban took effect.