GPT Image 2 is OpenAI's state-of-the-art image generation model, with support for arbitrary resolutions up to 4K and strong instruction following.
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
GPT Image 2
#3 of 44 in Text-to-Image
Qwen Image Max
#31 of 44 in Text-to-Image
Where the votes landed
GPT Image 2: 0% win rate
Ties: 0%
Qwen Image Max: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
GPT Image 2
- + Excellent exploded view showing all individual ingredients suspended in mid-air
- + Perfect text rendering with high adherence to the fiery/glowing requested style
- + High photorealistic detail on the patty texture and sauce droplets
- − Composition is slightly crowded on the left side with the text overlays
Qwen Image Max
- + Strong composition with a centralized focus and good depth of field
- + Clean and legible text placement
- + Good motion blur effects on the flying embers
- − Failed the 'exploded burger' requirement, as the buns and patty are mostly touching
- − Text missing the requested fiery glow effect on the price starburst
- − Some ingredients (like the tomato slice) look less realistic and more like 3D renders
Verdict: GPT Image 2 followed the prompt much more accurately, particularly regarding the 'exploded' nature of the burger and the specific fiery styling of the various text elements. Qwen Image Max produced a high-quality advertisement, but it failed to separate the burger components and used standard gradients for some of the text instead of the requested glowing fire effect.
Explore each model
The Max series of Tongyi Qwen’s image generation models excels across a wide range of generation tasks. Compared with the Plus series, it significantly reduces the “AI-like” feel of generated images, making them more realistic. It delivers more lifelike material textures for human subjects, finer natural textures, and more visually appealing text rendering.