A distilled version of SD 3.5 Large that generates high-quality images in just 4 steps, offering faster inference and lower cost.
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
Stable Diffusion 3.5 Large Turbo
#44 of 44 in Text-to-Image
Not enough comparable category data
The chart appears once both models have ratings across at least three shared arena categories.
Wan 2.6
#23 of 44 in Text-to-Image
Where the votes landed
Stable Diffusion 3.5 Large Turbo: 0% win rate
Ties: 0%
Wan 2.6: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Chalkboard Menu
Text-to-Image
“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Stable Diffusion 3.5 Large Turbo
- + Clean aesthetic and well-lit background composition.
- − Numerous spelling errors including 'risctto', 'apri', and 'ocopas'.
- − Text looks like a digital font rather than natural chalk handwriting.
- − Fails to follow the specific menu item list correctly.
Wan 2.6
- + Perfect adherence to text content with zero spelling errors.
- + Extremely realistic chalk texture with dusty smudges and authentic handwriting variations.
- + Matches the atmospheric 'cozy café' request with depth of field.
- − The perspective is slightly angled rather than straight-on, though it fits the café vibe.
Verdict: Wan 2.6 is the clear winner as it followed every instruction, including the specific date and menu items with perfect spelling and realistic chalk textures. Stable Diffusion 3.5 Large Turbo failed significantly on text rendering, producing gibberish words and a clean digital look that ignored the 'handwritten-style' and 'chalk texture' requirements.
Explore each model
Alibaba's multimodal generation model from the Wan AI suite, supporting text-to-video, image-to-video, reference-to-video with audio, and text-to-image, in both Chinese and English