No, DeepSeek’s Janus-Pro-7B Doesn’t Make Better Images Than DALL-E 3.

DeepSeek is an incredible AI lab, but let's not elevate it to godlike status just yet.

Jan 28, 2025

Hot Takes are occasional timely posts that focus on fast-moving news and releases, in addition to my regular Thursday and Sunday columns.
If Hot Takes aren’t your cup of tea, simply go to your account at www.whytryai.com/account and toggle the “Notifications” settings accordingly:

TL;DR

DeepSeek released an image model called Janus-Pro-7B, which some claim makes better images than DALL-E 3 and Stable Diffusion, but it’s nowhere near in my tests.

What is it?

DeepSeek describes Janus-Pro-7B as “a novel autoregressive framework that unifies multimodal understanding and generation.”

In short, this means you can use the same model to process image inputs and generate new images. This makes Janus-Pro-7B quite flexible, combining the capabilities of task-specific models in a single one.

That—and the fact that it’s yet another open-source model—is worthy of praise.

But DeepSeek also shared a few benchmarks that show Janus-Pro-7B outperforming OpenAI’s DALL-E 3 and Stable Diffusion 3 Medium:

Janus-Pro-7B GenEval and DPG-Bench scores — Source: **DeepSeek** on Hugging Face

Here’s the thing: GenEval and DPG-Bench are benchmarks that measure prompt adherence—how well a model follows directions in the text prompt.

They say absolutely nothing about the aesthetic quality of the model.

Still, this didn’t stop the Internet from instantly proclaiming Janus-Pro-7B the new king of images:

Spaces using deepseek-ai/Janus-Pro-7B 19 🌍 deepseek-ai/Janus-Pro-7B 🌍 AP123/Janus-Pro-7b 🦀 blanchon/JanusPro 🌍 NeuroSenko/Janus-Pro-7b 🌍 mkozak/Janus-Pro-7b 🚀 Bils/DeepseekJanusPro-Image 🚀 LLMhacker/DeepseekJanusPro-Image 💻 shakuur/meme 🌍 unography/Janus-Pro-7b 🌍 zx2323/xxxkk 🌍 LLMhacker/Multimodal_Understanding 🌍 omninexus/deepseek-vision + 7 Spaces — Source: **Business Insider**

DeepSeek-AI Releases Janus-Pro 7B: An Open-Source multimodal AI that Beats DALL-E 3 and Stable Diffusion — Source: **Marktechpost**

Wow. DeepSeek just dropped Janus-Pro-7B, an open-source multimodal AI that beats DALL-E 3 and Stable Diffusion. The 🐋 is on fire. 👀 — Source: **Min Choi on X**

And while better prompt adherence is nothing to sneeze at, you also want your image model to create something that looks good.

Unfortunately for Janus, most of what it generates is hot garbage.

Let me show you.

How do you use it?

To test Janus-Pro-7B for yourself, head on over to its Hugging Face page and select any of the “Spaces” using the model on the right-hand side:

I’ll go with the official deepseek-ai/Janus-Pro-7B. Here’s my 2-minute walkthrough:

If you saw the video, you’ll know that Janus-Pro-7B:

Hallucinates nonexistent details in uploaded images.
Creates new images that look like Lovecraftian horrors.

Here are three sample prompts and side-by-side comparisons against DALL-E 3 and Stable Diffusion 3, the models DeepSeek chose to benchmark against:

Prompt: “photo of a samurai in a traditional outfit holding a sci-fi blaster, futuristic skyscrapers with neon signs in the background”

Samurai photos by Janus, SD3 Medium, DALL-E 3 — Left to right: Janus, SD3 Medium, DALL-E 3 (Click an image to view the full-size version.)

Prompt: “cartoon illustration of a blue cat and a green dog wearing party hats, sitting on a park bench and looking up at Saturn”

Cartoon cat and dog by Janus, SD3 Medium, DALL-E 3 — Left to right: Janus, SD3 Medium, DALL-E 3 (Click an image to view the full-size version.)

Prompt: “a cute purple robot holding up a cardboard sign that reads “I can spell better than you!”

Purple robots with signs by DALL-E 3, Janus, and SD 3 Medium — Left to right: Janus, SD3 Medium, DALL-E 3 (Click an image to view the full-size version.)

Janus—how do I put this mildly—sucks ass.1

If I’m being generous, I’ll put Janus on par with Midjourney V2. (Reminder: We currently have Midjourney V6.1 with Midjourney 7 set to come out soon.)

Curiously, DeepSeek chose to benchmark against the outdated DALL-E 3 and Stable Diffusion 3 Medium instead of the newer, better image models, many of which are also way better at prompt adherence and spelling than DALL-E 3 (see my recent test).

Why should you care?

DeepSeek released the truly impressive DeepSeek-R1 reasoning model just a week ago. Since then, this Chinese AI lab has been the talk of the town, getting much-deserved praise for successfully competing with large, high-profile US labs on a “shoestring budget.”

Not only that but DeepSeek is also seen as more open and transparent:

DeepSeek got so much attention that it skyrocketed to the #1 free app spot on the App Store:

Top Charts iPhone iPad Top Free Apps DeepSeek - AI Assistant 1 DeepSeek - AI Assistant 杭州深度求索人工智能基础技术研究有限公司 ChatGPT 2 ChatGPT OpenAI Threads 3 Threads Instagram, Inc. Google Gemini 4 Google Gemini Google TurboTax: File Your Tax Return 5 TurboTax: File Your Tax Return Intuit Inc. Google 6 Google Google — I just tested this. It’s still #1 at the time of writing.

Everybody loves DeepSeek.

So much so that we’re now doing that classic thing where we put a company on a pedestal and automatically give it a free pass.

Let’s not do that.

With DeepSeek feeding so much discourse, we’ll likely see even more hype around it. Some well-earned, some less so.

As always, take it all with a grain of salt and test things yourself where possible.

I certainly can’t rule out that DeepSeek will eventually release competitive image and even video models. If anything, last week has shown it to be an astoundingly capable AI lab.

But let’s not rush to find a new AI darling we can blindly idolize.

🫵 Over to you…

What’s your take? Am I being unreasonably nitpicky? Have you tested Janus and found its multimodality especially helpful for certain use cases?

Leave a comment or drop me a line at whytryai@substack.com.

Why Try AI

Discussion about this post