DALL-E 2 is OpenAI's legacy image generation model. It supports image generations, mask-based edits (inpainting), and variations.
Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.
DALL-E 2
#37 of 44 in Text-to-Image
Qwen Image 2512
#26 of 44 in Text-to-Image
Where the votes landed
DALL-E 2: 0.0% win rate
Ties: 0.0%
Qwen Image 2512: 100.0% win rate
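The page does not say how these percentages are derived or how many votes were cast. As a minimal sketch, assuming each shared challenge yields exactly one outcome (a win for either model or a tie), the shares above could be computed as follows. The function name `win_rates` and the sample tally are hypothetical and chosen only to illustrate the arithmetic.

```python
from collections import Counter


def win_rates(outcomes, labels):
    """Return each label's share of the outcomes as a percentage (one decimal).

    `outcomes` holds one label per decided challenge. This is an assumption:
    the page does not state whether rates are per challenge or per vote.
    """
    total = len(outcomes)
    if total == 0:
        # No data yet: report 0.0 for every label instead of dividing by zero.
        return {label: 0.0 for label in labels}
    counts = Counter(outcomes)  # labels that never occur count as 0
    return {label: round(100.0 * counts[label] / total, 1) for label in labels}


# Hypothetical tally: all 8 shared challenges go to the same model.
labels = ["DALL-E 2", "Ties", "Qwen Image 2512"]
outcomes = ["Qwen Image 2512"] * 8
rates = win_rates(outcomes, labels)
print(rates)  # {'DALL-E 2': 0.0, 'Ties': 0.0, 'Qwen Image 2512': 100.0}
```

If the rates are instead computed per individual vote, the same function applies with one entry per vote rather than one per challenge.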
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
DALL-E 2
- + Matches the lighting mood well
- + Clean surface reflections
- − Failed to include the red book on top
- − Sphere inside appears as a red rectangle
- − Plant is represented as a large blue pot
Qwen Image 2512
- + Follows all spatial instructions accurately
- + High visual clarity and realistic textures
- + Excellent handling of transparency and reflections
- − Glass cube has an unnatural greenish tint
Verdict: Qwen Image 2512 followed every part of the prompt, including the red book and the sphere's correct color. DALL-E 2 struggled significantly with object placement, colors, and the relationship between the items.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
DALL-E 2
- + Successfully captures reflections on wet pavement
- + Follows the 'imperfect framing' instruction well
- − Extreme blur obscures all character details and skin texture
- − Subject is unrecognizable as an elderly Japanese man
- − Low resolution and poor clarity
Qwen Image 2512
- + High visual quality with realistic skin textures and facial details
- + Accurately represents an elderly Japanese man and a red bicycle
- + Good cinematic lighting and wet street atmosphere
- − Subject is posing rather than 'repairing' the bicycle as requested
- − Missing prominent motion blur from passing cars
- − The bicycle has some structural inconsistencies in the frame
Verdict: DALL-E 2 produced a very blurry and incoherent image that failed to show the primary subject clearly, despite following the 'imperfect framing' prompt. Qwen Image 2512 followed much more of the prompt with high-quality textures and a clear subject, though it opted for a portrait-style pose rather than an active repair scene.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
DALL-E 2
- + Bold, clean aesthetic that feels contemporary
- + High-contrast legible lettering for the larger headers
- − Fails to follow the grid layout requirement for food photos
- − The food images are abstract fragments rather than recognizable food items
- − Lacks the requested sections for appetizers, pizza, and mains
Qwen Image 2512
- + Excellent adherence to the grid layout and sections
- + Includes realistic, colorful food photography as requested
- + Detailed menu structure with prices and vibrantly colored icons/accents
- − Text contains several spelling errors and garbled characters
- − Some visual artifacts in the text rendering
Verdict: Qwen Image 2512 is the clear winner as it followed every part of the prompt, including a grid layout, specific food sections, and realistic food photography. DALL-E 2 produced a generic, abstract design that completely ignored the structure and content requirements of a restaurant menu.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 2
- + Captures a messy, authentic chalk-like texture.
- − Text is completely illegible gibberish.
- − Prompt adherence is very low with missing menu items and date.
- − Image quality is low resolution and blurry.
Qwen Image 2512
- + Near-perfect text rendering with high legibility.
- + Followed all complex prompt instructions including date and specific menu items.
- + Excellent visual quality with realistic chalkboard smudges and background bokeh.
- − Minor spelling error in 'Risitto' (Risotto).
- − Handwriting looks slightly more like a digital script font than raw chalk in some strokes.
Verdict: Qwen Image 2512 performed exceptionally well, accurately rendering almost all the specific text requested in the prompt with high clarity and a professional aesthetic. DALL-E 2 failed significantly, producing scrambled letters that do not form words and ignoring the specific content requirements.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
DALL-E 2
- − Completely ignored the prompt.
- − Generated an image of a black handbag instead of a taxi scene.
- − Low visual quality with blurred and incoherent details.
Qwen Image 2512
- + Excellent adherence to all complex prompt requirements including the capybara's outfit and the woman's expression.
- + High-quality photographic realism with convincing lighting and depth of field.
- + Great composition that captures the surreal nature of the scene effortlessly.
- − The capybara's front paws look slightly more like human hands/fingers than rodent paws.
- − The taxi interior layout is a bit compressed vertically.
Verdict: DALL-E 2 suffered a total failure, producing a low-quality image of a handbag that had nothing to do with the prompt. Qwen Image 2512, on the other hand, executed the complex prompt almost flawlessly, including the specific clothing, both characters, and the passenger's requested 'bored' expression.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
DALL-E 2
- + Follows the solid light blue background requirement
- + Maintains a minimal aesthetic
- − Failed significantly on text rendering, misspelling 'SUSHI' as 'Sush' and omitting 'JAPAN'
- − Lacks detail and realistic PBR materials
- − Composition is sparse and lacks the requested flag icon
Qwen Image 2512
- + Excellent adherence to all complex prompt instructions including text and flag icon
- + High-quality 3D miniature diorama aesthetic with clear isometric perspective
- + Clean and accurate text rendering for both 'JAPAN' and 'SUSHI'
- − None notable
Verdict: Qwen Image 2512 perfectly captured every element of the prompt, including the specific text, the isometric diorama style, and the flag icon. DALL-E 2 failed on text accuracy, missing components, and overall visual quality, producing a much more primitive result.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
DALL-E 2
- + Natural feeling of motion and action
- + Effective use of saturated warm lighting
- − Severe anatomical distortions and artifacts
- − Poor rendering of butterflies and background details
Qwen Image 2512
- + Excellent anatomical detail and fur texture
- + Accurate inclusion of all requested animal species
- + Clear depiction of god rays and butterflies
- − Composition feels more staged than a natural scene
- − Symmetry is slightly artificial
Verdict: Qwen Image 2512 far exceeds DALL-E 2 in technical execution, providing high-resolution textures and accurate animal features that align with the 8K masterpiece request. While DALL-E 2 attempts a more dynamic action shot, the resulting image is marred by significant visual artifacts and unclear subjects.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
DALL-E 2
- + Matches the minimalist and warm brown tone request
- + Centralized cloche icon is clearly identifiable as a vector emblem
- − Text is nonsensical and fails to render 'Caffè Florian' or 'Est. 1720'
- − Overall composition looks unfinished and fragmented
Qwen Image 2512
- + Perfect text rendering of 'Caffè Florian' and 'Est. 1720' on a banner
- + Excellent vintage illustrative style with detailed steam and paper texture
- + Cohesive composition that creates a professional-looking restaurant logo
- − Slightly less 'minimalist' than the prompt requested due to high level of detail
Verdict: Qwen Image 2512 followed the prompt's instructions perfectly, including accurate text rendering and a well-designed banner element. DALL-E 2 failed to produce legible text and provided a much simpler, less professional layout that lacked the requested steam and banner details.
Explore each model
Qwen Image 2512 is an improved version of Alibaba's Qwen image model, with better text rendering, finer natural textures, and more realistic human generation.