GPT Image 1.5 vs Stable Diffusion 3.5 Large
Head-to-head across 10 challenges
GPT Image 1.5: 75.0% win rate
Ties: 0.0%
Stable Diffusion 3.5 Large: 25.0% win rate
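As a rough illustration of how percentages like these are typically derived, the sketch below computes win and tie rates from a list of per-challenge outcomes. It assumes each judged challenge yields exactly one outcome and that rates are shares of judged challenges; the page does not state its formula. The function name `win_rates`, the outcome labels, and the sample tally are hypothetical and are not the page's data.

```python
from collections import Counter


def win_rates(outcomes):
    """Return each outcome's share of judged challenges, as a percentage.

    `outcomes` holds one label per judged challenge: "a", "b", or "tie".
    This labeling is an assumption made for illustration.
    """
    if not outcomes:
        raise ValueError("no judged challenges")
    counts = Counter(outcomes)
    total = len(outcomes)
    # Counter returns 0 for labels that never occur, so every key is present.
    return {label: 100.0 * counts[label] / total for label in ("a", "b", "tie")}


# Hypothetical tally, not taken from the page: model "a" wins 3 of 4.
rates = win_rates(["a", "a", "b", "a"])
print(rates)
```

Under this definition, challenges that have not been judged yet do not enter the denominator, so the rates can shift as more verdicts arrive.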
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
GPT Image 1.5
- Pro: Perfect adherence to the requested spatial arrangement.
- Pro: Highly realistic glass reflections and refractions of the background plant.
- Pro: Excellent material textures, especially on the book's canvas cover and the wooden table.
- Con: The blue sphere is relatively large, although the prompt asks for a 'small blue sphere'.
Stable Diffusion 3.5 Large
- Pro: Good lighting effects and sharp focus on the glass cube.
- Pro: Accurate 'small' scale for the blue sphere.
- Con: Incorrect object placement: the book is inside the cube rather than on top of it.
- Con: Coherence issues: the glass cube appears to clip through the red book.
- Con: The plant is mostly to the side and front of the cube rather than behind it as requested.
Verdict: GPT Image 1.5 followed the complex spatial instructions perfectly, correctly placing the book on top of the cube and the plant behind it. Stable Diffusion 3.5 Large struggled with the spatial logic, placing the book inside the cube and failing to clearly place the plant behind the glass, which was a core element of the prompt.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
GPT Image 1.5
- Pro: Excellent, realistic skin and fabric textures.
- Pro: Strong adherence to 'imperfect framing', with a tight, candid composition.
- Pro: Detailed mechanical parts on the bicycle and repair tools.
- Con: The car in the background lacks the requested motion blur.
- Con: The raindrops appear as static white dots rather than falling streaks.
Stable Diffusion 3.5 Large
- Pro: Successfully captured motion blur on the background vehicles.
- Pro: Good representation of falling rain and wet pavement reflections.
- Pro: Accurate 'shallow depth of field', with a soft background.
- Con: Anatomical issues: the man's hands and arms appear distorted.
- Con: The bicycle's structure is physically impossible, with floating and merging parts.
- Con: The man appears slightly 'pasted' into the scene, with mismatched lighting.
Verdict: GPT Image 1.5 produced the superior image thanks to its high level of photorealism, particularly in the subject's skin texture and the mechanical detail of the bike. Stable Diffusion 3.5 Large handled the requested motion blur better, but it failed significantly on structural coherence, producing mangled hands and an impossible bicycle frame.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
GPT Image 1.5
- Pro: Excellent adherence to all prompt details, including the beads in the hair and the leather straps.
- Pro: Superior lighting effects, with warm torchlight and realistic bokeh sparks.
- Pro: Highly detailed facial texture with convincing scars and dirt.
- Con: The composition is a bit tight on the forehead.
Stable Diffusion 3.5 Large
- Pro: Beautifully detailed ornate engraving on the plate armor.
- Pro: Strong character expression and clear facial features.
- Pro: Good interpretation of the braided hair requirement.
- Con: Missed the 'small beads' in the hair mentioned in the prompt.
- Con: The lighting feels more like daylight than the requested warm torchlight.
- Con: Lacks the requested bokeh sparks.
Verdict: GPT Image 1.5 is the clear winner: it followed every detail of the prompt, including the beads in the hair, the leather straps, and the warm torchlight atmosphere with bokeh sparks. Stable Diffusion 3.5 Large produced a high-quality image with impressive armor engraving, but it omitted the beads, and its lighting felt too cool and diffuse for the requested torchlight.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
Victorian Greenhouse Oasis
Text-to-Image: “Hyper-photorealistic interior of a lush Victorian glass greenhouse filled with exotic tropical plants, vibrant blooming orchids, tall ferns, colorful butterflies in flight, sunlight filtering through ornate glass roof creating realistic caustics and dew on leaves, intricate iron framework visible, misty atmosphere, 8K masterpiece.”
Heroic Super Hero Portrait
Text-to-Image: “Hyper-photorealistic full-body portrait of a female superhero standing triumphantly on a New York skyscraper rooftop at golden sunset, wearing a classic modest superhero costume with flowing cape, chest emblem, gloves, and boots in red and blue colors, practical design, short hair, strong determined heroic expression looking into the distance, powerful confident stance with hands on hips and cape billowing dramatically in the wind, detailed urban cityscape background, warm natural sunlight with sharp shadows and fabric highlights, ultra-sharp textures on suit, hair, and concrete, 8K masterpiece, empowering family-friendly style.”
Intricate Floral Mandala
Text-to-Image: “Perfectly symmetrical mandala made entirely of real flowers, petals, leaves, fruits, and seeds in vibrant natural colors, intricate layered patterns with radial symmetry, top-down view on a soft neutral background, hyper-detailed organic textures and subtle shadows, photorealistic, 8K masterpiece.”
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
GPT Image 1.5
OpenAI's image generation model, billed as state-of-the-art, with improved instruction following and prompt adherence.
Stable Diffusion 3.5 Large
Stability AI's 8.1-billion-parameter Multimodal Diffusion Transformer (MMDiT) text-to-image model, featuring improved image quality, typography, complex prompt understanding, and resource efficiency.