Head to head

DALL-E 2 (OpenAI) vs. Stable Diffusion 3.5 Large (Stability AI)

Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.

DALL-E 2

17.7 arena score

#37 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Stable Diffusion 3.5 Large

22.9 arena score

#25 of 44 in Text-to-Image

Vote tally

Where the votes landed

DALL-E 2: 0.0% win rate

Ties: 0.0%

Stable Diffusion 3.5 Large: 100.0% win rate

Shared challenges 8
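
As a rough illustration of how a tally like the one above could be derived, the sketch below turns a list of per-challenge outcomes into percentage shares. It assumes one outcome per shared challenge, which the page does not confirm (the real tally may be computed per individual vote), and the `tally` function and the `"a"`/`"tie"`/`"b"` labels are hypothetical names chosen for this example.

```python
from collections import Counter


def tally(outcomes):
    """Return percentage shares for model A wins, ties, and model B wins.

    `outcomes` is a list of labels: "a", "tie", or "b" (hypothetical labels).
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    counts = Counter(outcomes)  # labels that never occur count as 0
    total = len(outcomes)
    return {key: round(100.0 * counts[key] / total, 1) for key in ("a", "tie", "b")}


# Assumed input: all 8 shared challenges going to model B
# (Stable Diffusion 3.5 Large), matching the 0.0% / 0.0% / 100.0% split.
outcomes = ["b"] * 8
shares = tally(outcomes)
print(shares)
```

With a sample this small, a single different outcome would move the shares by 12.5 percentage points, so the split is best read alongside the arena scores rather than on its own.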

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Features a wooden table surface with realistic reflections.
  • + Correct soft window lighting.
  • Failed almost all spatial instructions, including the blue sphere and red book placement.
  • The plant is in a blue pot rather than behind the cube.
  • The cube appears solid with a red center rather than empty with a sphere inside.

Stable Diffusion 3.5 Large

  • + Excellent adherence to spatial prompts, placing all items in the correct relative positions.
  • + High visual clarity and realistic glass refractions.
  • + Accurately depicts the plant behind the cube visible through the glass.
  • The sphere is inside the cube but appears to float above or rest on the book, which the prompt placed on top of the cube.
  • The red book is inside the cube rather than on top of it.

Verdict: Stable Diffusion 3.5 Large followed the complex spatial instructions much more effectively than DALL-E 2, which failed to render the blue sphere or the red book correctly. While Stable Diffusion 3.5 Large placed the red book inside the cube instead of on top of it, it included all requested elements and the lighting in a high-fidelity image.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Captured a unique, voyeuristic 'imperfect framing' angle
  • + Effective wet pavement reflections
  • Extremely poor resolution and clarity
  • Failed to show the subject's face or ethnicity as requested
  • Depth of field is so shallow it obscures almost all relevant detail

Stable Diffusion 3.5 Large

  • + Excellent adherence to all prompt elements including age, ethnicity, and garment texture
  • + Beautiful rendering of rain, wet surfaces, and bokeh
  • + High visual quality with realistic skin and environmental details
  • Missed the 'motion blur from passing cars' request as vehicles appear static
  • The background bus has some minor structural incoherence

Verdict: Stable Diffusion 3.5 Large successfully interprets nearly all aspects of the complex prompt, delivering a cinematic and high-resolution image with realistic textures and lighting. In contrast, DALL-E 2 produced a low-quality, blurry output that failed to clearly depict the primary subject or many of the requested details. While Stable Diffusion 3.5 Large missed the motion blur for the background traffic, its overall composition and technical execution are far superior.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Strong bokeh effect
  • + Gritty, battle-worn texture
  • Lack of clear facial features and eyes
  • Anatomically confusing structures

Stable Diffusion 3.5 Large

  • + Highly detailed engraved armor and textures
  • + Excellent adherence to braided hair and scars
  • + Clear, lifelike eyes and facial structure
  • Less of a 'close' portrait than requested
  • Braids lack the specific beads mentioned

Verdict: Stable Diffusion 3.5 Large successfully captures nearly all prompt elements including fine armor engravings, lifelike eyes, and subtle dirt. DALL-E 2 produced a highly abstract and distorted image that failed to render a recognizable paladin or the requested details clearly. Stable Diffusion 3.5 Large is the clear preference for its technical execution and adherence.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Strong bold sans-serif typography on the right side.
  • + Good color vibrancy in the food-inspired mosaics.
  • The food photos are fragmented into abstract shards rather than a clear menu grid.
  • The text is completely nonsensical and gibberish.
  • Layout does not resemble a functional restaurant menu.

Stable Diffusion 3.5 Large

  • + Excellent adherence to the menu layout with clear sections for Appetizers and Mains.
  • + High-quality, distinct food photos arranged in a professional grid.
  • + Text is largely readable and follows the bold sans-serif request.
  • Slight spelling artifacts in category headers like 'Appetizrs'.
  • The grid layout feels slightly repetitive with multiple similar pizza images.

Verdict: Stable Diffusion 3.5 Large far outperforms DALL-E 2 by providing a functional, professional menu design that accurately interprets all aspects of the prompt, including specific food sections and a grid layout. DALL-E 2 produced an abstract, messy composition that lacks the clarity and structural requirements of a restaurant menu.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Clean isometric perspective
  • + Vibrant colors matching the background request
  • Incomplete and misspelled text ('Sush') placed on the plate instead of top-center
  • Failed to include 'JAPAN' text or a flag icon
  • The sushi models are extremely abstract and lack detail

Stable Diffusion 3.5 Large

  • + Perfect adherence to text requirements ('JAPAN', 'SUSHI', and flag icon)
  • + Excellent 3D miniature aesthetic with high-quality PBR textures for the rice and fish
  • + Correct isometric composition on a diorama base
  • Slightly more garnish than requested in the 'minimal' instruction
  • Text is on a sign rather than floating at the top-center of the image

Verdict: Stable Diffusion 3.5 Large followed nearly every part of the prompt, including complex text rendering and specific scene elements like the diorama base. DALL-E 2 failed significantly on the text, rendering only 'Sush' and omitting 'JAPAN' entirely, while also producing very low-detail models. Stable Diffusion 3.5 Large is the clear winner for its superior visual quality and prompt adherence.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Natural look to the golden retriever fur texture.
  • + Strong sense of dynamic motion with the animal's pose.
  • Severely malformed anatomy on the cat and fox in the background.
  • Major artifacts in the butterfly and grass rendering.
  • Lacks the requested bunny and consistent lighting quality.

Stable Diffusion 3.5 Large

  • + Includes all four requested animals with clear, distinct features.
  • + Excellent lighting including god rays, bokeh, and dew sparkles.
  • + High visual coherence with expressive eyes and soft fur textures.
  • Central puppy's paw is slightly blurred/blended into the grass.
  • The fox and cat have very similar facial structures.

Verdict: Stable Diffusion 3.5 Large successfully captures all four specified animals with high detail and beautiful atmospheric lighting that matches the 'wholesome' prompt perfectly. DALL-E 2 fails significantly on technical quality, producing nightmare-like anatomical distortions on the background animals and missing the bunny entirely.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Simple minimalist layout
  • + Appropriate color palette
  • Text is nonsensical gibberish
  • The cloche is missing the bottom plate and steam details
  • Logo elements feel disconnected and unrefined

Stable Diffusion 3.5 Large

  • + Excellent text rendering of 'Caffé Florian' and 'Est. 1720'
  • + Successfully includes all requested elements: steam, cloche, and banner
  • + Professional vintage aesthetic with good use of texture
  • Added an extra 'e' to 'Caffé'
  • The steam inside the cloche looks a bit like flames

Verdict: Stable Diffusion 3.5 Large is the clear winner as it successfully rendered most of the requested text and adhered to every element of the prompt, including the banner and steam. DALL-E 2 failed significantly on text legibility and provided a much more basic, incomplete interpretation of the cloche icon.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

DALL-E 2
Stable Diffusion 3.5 Large

AI Judge Analysis

DALL-E 2

  • + Follows the requested color palette well with navy and muted red.
  • + Good alignment of blocky infographic text areas.
  • Nonsensical typography even for a base generation.
  • Completely fails to follow the 6-step logical sequence requested.
  • Visuals are chaotic and do not resemble an Apollo 11 mission map.

Stable Diffusion 3.5 Large

  • + Much closer to a modern vector infographic style.
  • + Successfully categorizes visual elements across the layout.
  • + Includes a detailed lunar surface and rocket iconography.
  • Uses a Space Shuttle/Buran style vehicle instead of the requested Saturn V.
  • Text is mostly gibberish despite attempting labels.
  • The layout doesn't clearly display the 6 specific chronological steps requested.

Verdict: Stable Diffusion 3.5 Large is the clear winner here as it produces a coherent, visually appealing poster that actually resembles a space infographic, whereas DALL-E 2 produces a messy, nonsensical collage. While Stable Diffusion 3.5 Large failed to accurately depict the Saturn V (showing a shuttle instead), it followed the stylistic instructions and color palette much more effectively.
