GPT Image 2 (OpenAI) vs. Qwen Image 2512 (Alibaba)

Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.

GPT Image 2

28.2 arena score

#3 of 44 in Text-to-Image

Top 3 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Qwen Image 2512

22.4 arena score

#26 of 44 in Text-to-Image

Vote tally

Where the votes landed

GPT Image 2: 100.0% win rate

Ties: 0.0%

Qwen Image 2512: 0.0% win rate

Shared challenges: 4
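The percentages in the tally are simple shares of the total vote. A minimal sketch of that arithmetic follows, assuming ties count toward the total and that shares are rounded to one decimal place. The page does not publish raw vote counts, so the function name `vote_shares` and the counts in the usage line are hypothetical.

```python
def vote_shares(wins_a: int, wins_b: int, ties: int) -> dict[str, float]:
    """Return each outcome's share of all votes, as a percentage.

    Assumption: ties are included in the denominator, so the three
    shares always sum to roughly 100 (up to rounding).
    """
    total = wins_a + wins_b + ties
    if total == 0:
        raise ValueError("no votes recorded")
    return {
        "a": round(100 * wins_a / total, 1),
        "ties": round(100 * ties / total, 1),
        "b": round(100 * wins_b / total, 1),
    }


# Hypothetical counts: every vote goes to model A, none to B, no ties.
print(vote_shares(wins_a=4, wins_b=0, ties=0))
```

Any split in which one model receives every vote yields the same 100.0 / 0.0 / 0.0 pattern, whatever the actual number of votes cast.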

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

AI Judge Analysis

GPT Image 2

+ Excellent text rendering with legible, coherent English describing realistic dishes.
+ Highly professional layout with expert use of white space and hierarchy.
+ Direct adherence to the section requirements (Appetizers, Pizza, Mains) with relevant high-quality imagery.

− Small logo artifact in the 'Nova' text where it slightly overlaps the red lines.

Qwen Image 2512

+ Accurately reflects the 'grid' request for colorful food photos.
+ Good use of vibrant color accents through the icon system.
+ Appropriate minimalist font choices for a modern design.

− Text is entirely nonsensical and full of AI artifacts/gibberish.
− Layout logic is poor, with photo content not corresponding to the text sections (e.g., pizzas mixed everywhere).
− Section headings are misspelled ('Appetiizers', '/Means').

Verdict: GPT Image 2 is the clear winner: it produces a completely usable, professional-grade menu design with perfect English text and logical organization. Qwen Image 2512 follows the visual grid request but fails significantly on text legibility and on placing food items under the matching headers.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

Votes: GPT Image 2 100%, ties 0%, Qwen Image 2512 0%

AI Judge Analysis

GPT Image 2

+ Perfect spelling on all menu items including complex terms.
+ Highly realistic chalk texture with slight smudging and authentic pressure variations.
+ Excellent composition that feels natural for a physical cafe space.

− The slanted handwriting is slightly less decorative than the title request implied.

Qwen Image 2512

+ Clear, legible text with an attractive calligraphic style.
+ Good use of vertical space on a portrait-oriented board.

− 'Risotto' is misspelled as 'Risitto'.
− Handwriting looks slightly more like a digital font than natural chalk strokes.

Verdict: GPT Image 2 is the superior output because it followed all text prompts perfectly, including difficult spelling, while maintaining a very realistic chalk aesthetic. Qwen Image 2512 had a spelling error in one of the primary menu items and the text style felt more calculated and less like authentic handwriting.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

AI Judge Analysis

GPT Image 2

+ Excellent photorealism with cinematic lighting.
+ Natural textures on the capybara's fur and the jacket.
+ Composition feels like an authentic candid photograph from the street.

− The capybara's paws lack clear claws/definition on the steering wheel.
− The 'T' on the cap is a bit generic.

Qwen Image 2512

+ Perfectly captures the 'bored' and 'normal' expression of the passenger.
+ Composition clearly shows the full interior context and the street ahead.
+ Accurate double-paw placement on the steering wheel.

− The passenger's hands and phone interaction look slightly distorted.
− The capybara's paws look more like human hands wearing gloves than animal paws.
− Lighting is a bit flatter and less cinematic than in the GPT Image 2 output.

Verdict: GPT Image 2 wins on sheer visual quality and realism, feeling like a genuine photographic still with beautiful lighting. However, Qwen Image 2512 followed the character expression prompts more accurately, specifically the woman's 'bored' look and the two-paw driving posture. GPT Image 2 is preferred for its superior artistic execution and more believable integration of the capybara into the environment.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

AI Judge Analysis

GPT Image 2

+ Excellent typography including the correct grave accent on 'Caffè'.
+ Perfectly balanced composition with a professional emblem frame.
+ Clean, vector-style execution that aligns with luxury branding.

− The steam effect is very stylized and simple compared to Qwen Image 2512's.

Qwen Image 2512

+ Dynamic and detailed rendering of the cloche and steam.
+ Strong retro vibe with bold script typography.
+ Excellent use of warm brown and cream tones with depth.

− Uses an acute accent ('Caffé') where the name requires a grave accent ('Caffè').
− The composition feels slightly crowded without a containing border.

Verdict: GPT Image 2 is the superior logo as it correctly handles the typography of the name 'Caffè' and provides a more balanced, professional emblem layout. While Qwen Image 2512 has more impressive illustrative detail in the cloche and steam, its incorrect accent and lack of a framing element make it less effective as a functional logo.