OpenAI's previous-generation image model, with higher quality than DALL-E 2 and support for larger resolutions
Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.
DALL-E 3
#35 of 44 in Text-to-Image
Not enough comparable category data
The chart appears once both models have ratings across at least three shared arena categories.
Wan 2.7
#34 of 44 in Text-to-Image
Where the votes landed
DALL-E 3
0.0%
win rate
Ties
0.0%
Wan 2.7
100.0%
win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Chalkboard Menu
Text-to-Image
“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 3
- + Excellent chalk texture throughout the image
- + Authentic café atmosphere with warm lighting
- − Numerous spelling errors in every menu item
- − Illegible text and messy layout
- − Price figures significantly deviate from the prompt
Wan 2.7
- + Perfect text accuracy for all requested items
- + Extremely clear and legible layout
- + Includes a thoughtful completion of the truncated prompt text
- − The text looks more like a digital font than genuine chalkboard handwriting
- − Lack of natural variation and texture in the lettering
Verdict: Wan 2.7 produced a highly accurate and legible result that perfectly followed the complex text requirements, including finishing the partial 'Brown But...' prompt correctly. DALL-E 3 succeeded in creating a much more realistic chalk texture and artistic aesthetic, but failed completely on spelling and legibility. Wan 2.7 is the clear winner for its superior prompt adherence and linguistic coherence.
The Reversed Rodeo
Text-to-Image
“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”
AI Judge Analysis
DALL-E 3
- + Excellent cinematic lighting and atmosphere
- + Strong artistic rendering of the nebulae and clouds
- + High level of surrealism consistent with the prompt
- − Failed the specific spatial instruction 'horse on top'
- − Astronaut is riding the horse instead of being ridden by it
Wan 2.7
- + High resolution and clarity on the textures of the space suit
- + Detailed rendering of the horse's anatomy and gear
- + Well-defined terrestrial and orbital elements
- − Completely ignored the prompt's instruction for a 'horse riding astronaut' with the horse on top
- − Composition feels less 'surreal' and more like a standard stock image collage
Verdict: Both models failed the central instruction in the prompt, which called for a reversed scenario with the horse on top of the astronaut. DALL-E 3 captures more of the requested 'cinematic' and 'surreal' mood, while Wan 2.7 renders its individual elements with higher clarity and realism. Since neither achieved the subject reversal, DALL-E 3 is slightly preferred for its superior artistic composition.
The Capybara Taxi Driver
Text-to-Image
“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
DALL-E 3
- + Excellent interior detail with realistic dashboard lighting and textures.
- + Dynamic rim lighting on the capybara's fur creates a cinematic look.
- + Strictly adheres to the professional expression and placement of front paws.
- − Completely misses the human businesswoman sitting in the back seat.
- − The driver's hat is black instead of the requested yellow.
Wan 2.7
- + Includes all prompt elements, including the bored businesswoman looking at her phone.
- + Correctly features a yellow driver cap as requested.
- + The capybara's paws are well-rendered for a complex pose.
- − The businesswoman is sitting in the front passenger seat instead of the back seat.
- − The scene is viewed from outside the window rather than from inside the taxi as requested.
- − The lighting lacks the photorealistic depth seen in DALL-E 3's image.
Verdict: Wan 2.7 is the winner because it successfully includes the human passenger and the yellow cap, which DALL-E 3 failed to generate. While DALL-E 3 has superior lighting and interior textures, it missed a core narrative component of the prompt by omitting the businesswoman entirely.
The Halloween Invitation
Text-to-Image
“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
DALL-E 3
- + Atmospheric cinematic lighting and depth
- + Highly detailed 3D-textured border design
- − Text is largely illegible and garbled
- − Fails to include specific event details like the venue name
Wan 2.7
- + Perfect adherence to all requested text and event details
- + Clean, balanced layout following the parchment poster style
- − Illustrative style is less 'cinematic' and more cartoonish
- − The sky and tree imagery is repetitive
Verdict: Wan 2.7 is the clear winner for this task because it successfully renders all the specific text details including the date, time, and location accurately. While DALL-E 3 captures a more moody and gothic atmosphere with impressive textures, its failure to generate legible text makes it non-functional as an invitation.
Explore each model
Alibaba's Wan 2.7 image generation and editing model for text-to-image, reference-guided generation, and instruction-based image edits