GPT Image 1.5 (OpenAI) vs. Stable Diffusion 3.5 Large (Stability AI)

Settled by community votes across 7 shared challenges; an AI judge's analysis accompanies three of them.

GPT Image 1.5

26.5 arena score

#7 of 44 in Text-to-Image

Top 3 in Image Editing

Stable Diffusion 3.5 Large

22.9 arena score

#25 of 44 in Text-to-Image

Vote tally

Where the votes landed

GPT Image 1.5: 69.2% win rate

Ties: 0.0%

Stable Diffusion 3.5 Large: 30.8% win rate

Shared challenges: 7

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Votes: GPT Image 1.5 29%, ties 0%, Stable Diffusion 3.5 Large 71%

AI Judge Analysis

GPT Image 1.5

+ Perfect adherence to the spatial arrangement requested.
+ Highly realistic glass reflections and refractions of the background plant.
+ Excellent material textures, especially on the book's canvas cover and the wooden table.

− The blue sphere is relatively large compared to the prompt's 'small blue sphere'.

Stable Diffusion 3.5 Large

+ Good lighting effects and sharp focus on the glass cube.
+ Accurate 'small' scale for the blue sphere.

− Incorrect object placement; the book is inside the cube rather than sitting on top of it.
− Coherence issues where the glass cube seems to clip through the red book.
− The plant is mostly to the side/front rather than behind the cube as requested.

Verdict: GPT Image 1.5 followed the complex spatial instructions perfectly, correctly placing the book on top of the cube and the plant behind it. Stable Diffusion 3.5 Large struggled with the spatial logic, placing the book inside the cube and failing to clearly place the plant behind the glass, which was a core element of the prompt.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

Votes: GPT Image 1.5 100%, ties 0%, Stable Diffusion 3.5 Large 0%

AI Judge Analysis

GPT Image 1.5

+ Excellent skin and fabric textures that look realistic.
+ Strong adherence to 'imperfect framing' with a tight, candid composition.
+ Detailed mechanical parts on the bicycle and repair tools.

− The car in the background lacks the requested motion blur.
− The raindrops appear as static white dots rather than falling streaks.

Stable Diffusion 3.5 Large

+ Successfully captured motion blur on the background vehicles.
+ Good representation of falling rain and wet pavement reflections.
+ Accurate adherence to 'shallow depth of field' with a soft background.

− Anatomical issues with the man's hands and arms, which appear distorted.
− The bicycle's structure is physically impossible with floating and merging parts.
− The man appears slightly 'pasted' into the scene with mismatched lighting.

Verdict: GPT Image 1.5 produced the superior image thanks to its high level of photorealism, particularly in the subject's skin texture and the mechanical detail of the bike. While Stable Diffusion 3.5 Large followed the 'motion blur' instruction better, it failed significantly on structural coherence, producing mangled hands and an impossible bicycle frame.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

Votes: GPT Image 1.5 100%, ties 0%, Stable Diffusion 3.5 Large 0%

AI Judge Analysis

GPT Image 1.5

+ Excellent adherence to all prompt details including beads in hair and leather straps.
+ Superior lighting effects with warm torchlight and realistic bokeh sparks.
+ Highly detailed facial texture with convincing scars and dirt.

− The composition is a bit tight on the forehead.

Stable Diffusion 3.5 Large

+ Beautifully detailed ornate engraving on the plate armor.
+ Strong character expression and clear facial features.
+ Good interpretation of the braided hair requirement.

− Missed the 'small beads' in the hair mentioned in the prompt.
− The lighting feels more like daylight than the requested warm torchlight.
− Lacks the specific bokeh sparks requested.

Verdict: GPT Image 1.5 is the clear winner as it followed every specific detail of the prompt, including the beads in the hair, the leather straps, and the specific warm torchlight atmosphere with bokeh sparks. Stable Diffusion 3.5 Large produced a high-quality image with impressive armor engraving, but it failed to include the beads and the lighting felt too cool and diffused compared to the torchlight requested.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

Votes: GPT Image 1.5 100%, ties 0%, Stable Diffusion 3.5 Large 0%

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

Votes: GPT Image 1.5 100%, ties 0%, Stable Diffusion 3.5 Large 0%

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

Votes: GPT Image 1.5 100%, ties 0%, Stable Diffusion 3.5 Large 0%

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

Votes: GPT Image 1.5 25%, ties 0%, Stable Diffusion 3.5 Large 75%

Next steps

Explore each model

GPT Image 1.5

OpenAI

OpenAI's state-of-the-art image generation model, with better instruction following and prompt adherence.

Stable Diffusion 3.5 Large

Stability AI

Stability AI's 8.1-billion-parameter Multimodal Diffusion Transformer (MMDiT) text-to-image model, featuring improved image quality, typography, complex prompt understanding, and resource efficiency.
