Qwen Image 2512 (Alibaba) vs. Wan 2.7 (Alibaba)

Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.

Qwen Image 2512

22.4 arena score

#26 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.
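
As a rough illustration of the rule above, the check could look like the sketch below. It assumes each model's ratings are kept as a mapping from category name to score, and it treats the arena scores shown on this page as Text-to-Image ratings purely for the example; the function and variable names are hypothetical and are not taken from the site.

```python
# Hypothetical sketch of the chart-eligibility rule described above.
MIN_SHARED_CATEGORIES = 3  # threshold stated in the text


def shared_categories(ratings_a: dict[str, float], ratings_b: dict[str, float]) -> set[str]:
    """Return the categories in which both models have a rating."""
    return set(ratings_a.keys() & ratings_b.keys())


def chart_available(ratings_a: dict[str, float], ratings_b: dict[str, float]) -> bool:
    """True once both models are rated in at least three shared categories."""
    return len(shared_categories(ratings_a, ratings_b)) >= MIN_SHARED_CATEGORIES


# Assumption: only one shared category (Text-to-Image) is known from this page.
qwen = {"Text-to-Image": 22.4}
wan = {"Text-to-Image": 19.0}
print(chart_available(qwen, wan))  # False
```

With a single shared category the check fails, which is consistent with the "Not enough comparable category data" message.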

Wan 2.7

19.0 arena score

#34 of 44 in Text-to-Image

Vote tally

Where the votes landed

[Vote breakdown chart: Qwen Image 2512 win rate, ties, Wan 2.7 win rate]

Shared challenges: 2

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

AI Judge Analysis

Qwen Image 2512

+ Excellent chalk texture with realistic smudges and dust on the board.
+ Authentic handwritten variation in cursive style that looks truly manual.
+ Correct interpretation of a framed tabletop chalkboard in a cafe setting.

− Spelling error in 'Risotto' (spelled 'Risitto').

Wan 2.7

+ Perfect spelling for all requested menu items.
+ Very clear and legible text alignment.
+ Good lighting and background environment detail.

− The text looks more like a digital font or vector asset with a drop shadow rather than genuine chalk on a board.
− The 'handwriting' is too uniform, lacking the natural variation requested in the prompt.

Verdict: Qwen Image 2512 produces a much more authentic 'handwritten' chalkboard look with realistic textures and manual flourishes, despite a minor spelling error. Wan 2.7 has perfect spelling but fails the stylistic requirement, as the text appears as a clean digital overlay rather than realistic chalk.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

AI Judge Analysis

Qwen Image 2512

+ Excellent adherence to the 'bored' facial expression on the human passenger.
+ Correct positioning of the human in the back seat as per the prompt.
+ Highly detailed fur texture and realistic reflection on the windshield.

− The paws on the steering wheel look slightly anthropomorphized and claw-like.
− The camera perspective through the windshield is slightly crowded.

Wan 2.7

+ Natural profile lighting on the capybara's face.
+ Good anatomy on the capybara's paws holding the wheel.
+ Clear view of the Manhattan street environment.

− Failed to place the human in the back seat; she is in the front passenger seat.
− The capybara's fur has a slightly more artificial, repetitive texture.
− The passenger is looking at the phone but is sitting next to the driver, changing the requested dynamic.

Verdict: Qwen Image 2512 followed the prompt's spatial instructions much better, correctly placing the human passenger in the back seat with a distinct 'bored' expression. Wan 2.7 generated a high-quality image but failed on the layout by placing the passenger in the front seat, which misses the intended 'taxi ride' narrative.