OpenAI's legacy image generation model, supporting text-to-image generation, mask-based edits (inpainting), and variations
Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.
DALL-E 2
#37 of 44 in Text-to-Image
Wan 2.7
#34 of 44 in Text-to-Image
Where the votes landed
DALL-E 2: 0.0% win rate
Ties: 100.0%
Wan 2.7: 0.0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 2
- + Attempts a realistic hand-drawn chalk texture
- + Captures the vibe of a messy, authentic chalkboard
- − Text is illegible gibberish
- − Fails to include specific requested menu items or dates
- − Poor overall image resolution and clarity
Wan 2.7
- + Perfect rendering of the exact text requested in the prompt
- + High-quality visual composition with a cozy café background
- + Consistent, elegant handwriting style throughout the board
- − The text looks a bit too perfect and digital rather than truly handwritten with chalk
- − Letters appear to float slightly above the surface due to drop shadows
Verdict: Wan 2.7 followed every instruction in the prompt with near-perfect accuracy, including specific dates and complex menu items. DALL-E 2 failed to produce legible text or follow the specific content requirements, resulting in an unusable image. While Wan 2.7's text looks more like a digital font than natural chalk, its overall quality and prompt adherence are vastly superior.
The Reversed Rodeo
Text-to-Image: “Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”
AI Judge Analysis
DALL-E 2
- + Attempts a gritty, textured cinematic style
- + The proportions of the astronaut and horse are balanced within the frame
- − Fails the specific prompt instruction to have the horse on top
- − Low visual fidelity with muddy textures and poor anatomical clarity
- − The astronaut's suit and the horse's head lack detail
Wan 2.7
- + High resolution and crisp visual clarity
- + Strong cinematic composition with vibrant celestial backgrounds
- + Accurate rendering of textures like the astronaut's suit and the horse's coat
- − Fails the specific prompt instruction to have the horse on top
- − The horse's front hoof has a mild anatomical distortion (floating shape)
Verdict: Both models failed the specific logic-defying constraint of 'horse on top' (horse riding the astronaut), instead providing the standard astronaut-on-horse interpretation. However, Wan 2.7 is the clear winner due to its significantly higher image quality, detailed textures, and sophisticated background compared to the grainy and low-detail output from DALL-E 2.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
DALL-E 2
- + The image texture appears somewhat leathery.
- − Completely failed the prompt by generating a close-up of a black bag instead of the requested scene.
- − Missing all characters and environmental elements requested.
- − Far lower visual complexity than the prompt requires.
Wan 2.7
- + Excellent adherence to all complex prompt elements including characters and location.
- + High visual quality with realistic lighting and fur textures.
- + Captures the requested mood with the bored expression of the passenger and the professional stance of the capybara.
- − The paws on the steering wheel look slightly anthropomorphized and claw-like.
- − The passenger appears to be sitting in the front passenger seat rather than the back seat as specified.
Verdict: DALL-E 2 suffered a total failure, producing a simple black bag that has nothing to do with the prompt. Wan 2.7 successfully executed a complex, cinematic scene with high fidelity, missing only the minor detail of character placement in the back seat.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
DALL-E 2
- + Features a hand-painted, expressive art style.
- + Captures the 'twisted trees' aspect of the prompt well.
- − Text is largely illegible and fails to follow the specific prompt requirements.
- − Lacks the requested 'polished' look, appearing more like a rough sketch.
- − Missing most secondary details like 'The Arches, NYC' or 'Date: 30.10.2026'.
Wan 2.7
- + Excellent text rendering, following every specific detail and date perfectly.
- + Clean, professional composition with clear borders, webs, and thorns.
- + High resolution with vivid colors and cinematic lighting on the central jack-o-lantern.
- − The art style is more like a modern digital illustration than 'vintage' gothic parchment.
- − The 'moody night sky' is a bit bright and clean, lacking some grit.
Verdict: Wan 2.7 is the clear winner as it followed every instruction, including specific dates, locations, and phrases, which DALL-E 2 failed to render legibly. While Wan 2.7 has a more 'organized' look than typical vintage parchment, its adherence to the complex text requirements and high visual quality make it far superior for an invitation.
Explore each model
Alibaba's Wan 2.7 image generation and editing model for text-to-image, reference-guided generation, and instruction-based image edits