Wan 2.6 (Alibaba) vs. Wan 2.7 (Alibaba)

Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.

Wan 2.6

23.4 arena score

#23 of 44 in Text-to-Image

Best Image-to-Video right now

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Wan 2.7

19.0 arena score

#34 of 44 in Text-to-Image

Vote tally

Where the votes landed

Wan 2.6

0.0%

win rate

Ties

100.0%

Wan 2.7

0.0%

win rate

Shared challenges 2

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

AI Judge Analysis

Wan 2.6

+ Excellent chalk texture with realistic smudges and dusty residue
+ Truly handwritten appearance with natural variations in letter size and slant
+ Perfect text accuracy including the clipped prompt text completion

− The 'cursive' requirement for the title is only partially met with print/cursive hybrid letters

Wan 2.7

+ Perfect text rendering without any spelling errors
+ Clean and centered composition
+ Attractive cafe-style background lighting

− Text looks like a digital font rather than natural chalk handwriting
− Fails the 'no printed or digital fonts' requirement
− Lacks the specific chalky texture and variations requested in the prompt

Verdict: Wan 2.6 is the clear winner as it successfully captured the 'handwritten' and 'chalk texture' requirements, appearing like an authentic chalkboard. Wan 2.7, despite having perfect legibility, used a clean digital-looking font that ignored the instructions for natural handwriting and chalk variations.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

Votes: Wan 2.6 won 0%, Wan 2.7 won 0%, 100% ties

AI Judge Analysis

Wan 2.6

+ Excellent photorealism with realistic low-light noise and rain textures
+ Captures the 'bored' expression of the passenger perfectly
+ The capybara's pose and anatomy look more integrated with the driver seat

− The capybara's paws are somewhat indistinct and blend into the steering wheel

Wan 2.7

+ Clearer depiction of the capybara's paws on the steering wheel
+ Very sharp image resolution and clean lighting

− The capybara's fur has a slightly artificial, 'rendered' look compared to Wan 2.6
− The passenger is positioned awkwardly close to the driver, making the backseat feel like a front seat

Verdict: Wan 2.6 is the winner because it achieves a much higher level of cinematic photorealism, particularly in the lighting and atmosphere of a New York taxi at night. Wan 2.7 renders the paws more clearly, but its spatial arrangement of the car interior is confusing, whereas Wan 2.6 captures the requested bored expression and realistic depth.

Next steps

Explore each model

Wan 2.6

Alibaba

Alibaba's multimodal generation model from the Wan AI suite, supporting text-to-video, image-to-video, reference-to-video with audio, and text-to-image, in both Chinese and English

Wan 2.7

Alibaba

Alibaba's Wan 2.7 image generation and editing model for text-to-image, reference-guided generation, and instruction-based image edits
