DALL-E 3 (OpenAI) vs. Stable Diffusion 3.5 Medium (Stability AI)

Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.

DALL-E 3

18.5 arena score

#35 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.
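The eligibility rule above can be sketched in a few lines. This is only an illustration: the function names and the dictionary layout are hypothetical, and the threshold of three comes from the sentence above rather than from any published arena code. The two scores shown are the Text-to-Image arena scores listed on this page.

```python
def shared_categories(ratings_a, ratings_b):
    """Return the sorted categories in which both models have a rating."""
    return sorted(set(ratings_a) & set(ratings_b))


def can_show_skill_chart(ratings_a, ratings_b, min_shared=3):
    """True once both models are rated in at least `min_shared` shared categories.

    Assumption: `min_shared=3` mirrors the rule stated on the page.
    """
    return len(shared_categories(ratings_a, ratings_b)) >= min_shared


# Ratings keyed by arena category (hypothetical layout).
dalle3 = {"Text-to-Image": 18.5}
sd35_medium = {"Text-to-Image": 15.7}

print(shared_categories(dalle3, sd35_medium))
print(can_show_skill_chart(dalle3, sd35_medium))
```

With a single shared category, the check fails, which matches the "Not enough comparable category data" message.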

Stable Diffusion 3.5 Medium

15.7 arena score

#41 of 44 in Text-to-Image

Vote tally

Where the votes landed

Chart: DALL-E 3 win rate, ties, Stable Diffusion 3.5 Medium win rate

Shared challenges: 2

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

AI Judge Analysis

DALL-E 3

+ Excellent cinematic lighting and atmospheric effects
+ High visual coherence with the celestial background
+ Clearer representation of the 'space' environment using nebulae and clouds

− Failed the negative constraint: the astronaut is riding the horse, not the other way around

Stable Diffusion 3.5 Medium

+ Realistic texture on the astronaut suit and horse hair
+ Good composition with the planet curvature in the background

− Failed the central negative constraint: the horse is on the bottom
− Visible anatomical errors, including five legs on the horse
− The astronaut appears to have a third leg or extra appendage dangling from the saddle

Verdict: Both DALL-E 3 and Stable Diffusion 3.5 Medium failed to follow the specific spatial instruction for the horse to be on top of the astronaut. However, DALL-E 3 produced a significantly higher-quality image with better lighting and fewer anatomical errors, whereas Stable Diffusion 3.5 Medium suffered from severe physical artifacts, including extra limbs on both the horse and the rider.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

AI Judge Analysis

DALL-E 3

+ Exquisite framing with 3D-like depth and intricate gothic textures.
+ Superior cinematic lighting and atmosphere that perfectly matches the 'vintage gothic' prompt.
+ The main title 'HALLOWEEN INVITATION' is rendered with high stylistic consistency.

− The smaller secondary text and event details are illegible or garbled.
− The date and time are scattered and incorrect compared to the prompt requirements.

Stable Diffusion 3.5 Medium

+ Text is much closer to being legible, including specific words like 'The Aches' and 'NYC'.
+ Clear inclusion of all requested elements like twisted trees, pumpkins, and parchment in a readable layout.

− Significant spelling errors in the main title ('Halloweeen Inviloween').
− Simple, almost clip-art style compared to the sophisticated cinematic aesthetic requested.

Verdict: DALL-E 3 creates a stunningly moody and high-quality artistic piece that captures the gothic aesthetic perfectly, though it fails on the specific text details. Stable Diffusion 3.5 Medium attempts more of the literal text but fails with numerous spelling errors and a much flatter, less 'cinematic' visual style. DALL-E 3 is the preferred choice for a card base due to its superior composition and atmospheric quality.

Next steps

Explore each model

DALL-E 3

OpenAI

OpenAI's previous-generation image model, with higher quality than DALL-E 2 and support for larger resolutions

Stable Diffusion 3.5 Medium

Stability AI

Stability AI's 2.5-billion-parameter text-to-image model, built on an improved Multimodal Diffusion Transformer (MMDiT-X) and optimized for consumer hardware, featuring improved image quality, typography, and complex prompt understanding
