DALL-E 2 (OpenAI) vs. GPT Image 1.5 (OpenAI)

Settled by community votes across 12 shared challenges, with an AI judge weighing in on each.

DALL-E 2

17.7 arena score

#37 of 44 in Text-to-Image

GPT Image 1.5

26.5 arena score

#7 of 44 in Text-to-Image

Top 3 in Image Editing

Vote tally

Where the votes landed

DALL-E 2

0.0%

win rate

Ties

0.0%

GPT Image 1.5

100.0%

win rate

Shared challenges: 12
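
As a rough illustration of how a tally like the one above could be derived, the sketch below computes win and tie percentages from a list of per-challenge outcomes. The `tally` function and the outcome labels (`"a"`, `"tie"`, `"b"`) are hypothetical, and the sketch assumes one decided outcome per challenge, which may not match how the arena actually aggregates individual community votes.

```python
from collections import Counter


def tally(outcomes):
    """Return the percentage share of each outcome label, rounded to one decimal."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    counts = Counter(outcomes)  # labels that never occur count as 0
    total = len(outcomes)
    return {
        label: round(100.0 * counts[label] / total, 1)
        for label in ("a", "tie", "b")
    }


# Assumption: all 12 shared challenges went to model "b" (GPT Image 1.5),
# which is consistent with the 0.0% / 0.0% / 100.0% split reported above.
outcomes = ["b"] * 12
print(tally(outcomes))
```

A mixed record would be handled the same way: `tally(["a", "b", "tie", "b"])` splits into 25%, 25%, and 50%.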

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

AI Judge Analysis

DALL-E 2

+ Features a wooden table with a reflective surface.
+ Captures soft lighting correctly.

− Fails to include the red book on top of the cube.
− The blue sphere is missing, seemingly replaced by a giant blue pot in the background.
− Interpretation of 'green plant' and 'blue sphere' is confused and poorly scaled.

GPT Image 1.5

+ Perfectly adheres to every spatial instruction in the prompt.
+ High visual clarity and realistic textures for glass and cloth.
+ Excellent composition with consistent lighting from the left.

− The glass cube has a mirrored base, which wasn't explicitly requested but adds to the realism.

Verdict: DALL-E 2 failed significantly on prompt adherence, missing the red book entirely and interpreting the blue sphere as a massive background planter. GPT Image 1.5 correctly rendered all objects in their specified spatial relationships with high fidelity and realistic lighting.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

AI Judge Analysis

DALL-E 2

+ Successfully captures reflections on wet pavement.
+ Includes visible rain effects.
+ Meets the request for imperfect framing and shallow depth of field.

− The subject is completely out of focus, losing all skin texture and facial details.
− Extremely blurry and low resolution.
− Failed to show the 'elderly Japanese man' as a recognizable subject.

GPT Image 1.5

+ Excellent adherence to all prompt elements including the elderly man, red bicycle, and rain.
+ High visual quality with natural skin textures and detailed gear.
+ Effective use of shallow depth of field and motion blur on the passing car.

− The framing is well composed, arguably missing the 'imperfect' instruction that DALL-E 2 met.
− Some minor geometry issues with the bicycle spokes and chain.

Verdict: DALL-E 2 produced an abstract, heavily blurred image that fails to show the primary subject requested. GPT Image 1.5 followed every detail of the prompt with high clarity, realistic lighting, and excellent thematic consistency, making it the clear winner.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

AI Judge Analysis

DALL-E 2

+ Features warm color tones that suggest torchlight.

− Fails to render specific prompt details such as braided hair or lifelike eyes.
− Image is blurry and lacks high-resolution texture or clarity.
− The composition is abstract and poorly defined, appearing more like a textured statue than a portrait.

GPT Image 1.5

+ Excellent adherence to all prompt details including braided hair with beads, scars, and ornate engraving.
+ High visual quality with realistic skin texture, sharp eyes, and detailed metal surfaces.
+ Superb lighting and bokeh effects create a cinematic, battle-worn atmosphere.

− The leather strap across the chest has some minor clipping artifacts with the metal.

Verdict: DALL-E 2 produced an abstract, muddy image that captured almost none of the specific character details requested. In contrast, GPT Image 1.5 delivered a high-fidelity, photorealistic portrait that perfectly captured the lighting, braided hair, and ornate textures described in the prompt.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

AI Judge Analysis

DALL-E 2

+ Bold, high-contrast typography that fits the minimalist creative aesthetic.
+ The geometric grid for photos is an interesting artistic interpretation.

− The text is nonsensical gibberish.
− The food photos are abstract, poorly defined, and do not look appetizing.
− Fails to include specific categories or a legible menu structure.

GPT Image 1.5

+ Excellent adherence to all prompt requirements including specific sections for appetizers, pizza, and mains.
+ Perfectly legible text including item names, realistic pricing, and descriptions.
+ High-quality, realistic food photography arranged in a clean grid.

− The layout is functional but standard, lacking a highly unique 'designer' flair.

Verdict: GPT Image 1.5 is the clear winner as it produced a fully functional, professional-grade menu design with perfect text legibility and high-quality food photography. In contrast, DALL-E 2 produced an abstract, unusable design with gibberish text and unrecognizable food items that failed to meet the basic requirements of the prompt.

Magic Burger Explosion: Fiery Photorealism Challenge

Text-to-Image

“Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”

AI Judge Analysis

DALL-E 2

+ Captures a strong sense of heat and fire within the composition.
+ Abstract, artistic interpretation of the 'magic' theme.

− Text is heavily garbled and misspelled.
− The burger components are unrecognizable and lack photorealistic detail.
− Fails to include the price or the starburst requested.

GPT Image 1.5

+ Excellent adherence to all text requirements including price and secondary message.
+ High-quality photorealistic textures on the lettuce, meat, and buns.
+ Dynamic composition with clear 'exploded' view and glowing fiery background.

− The '6' in the price has a slightly unusual font weight compared to the other numbers.
− A minor artifact appears where sauce drops from the top bun.

Verdict: GPT Image 1.5 significantly outperforms DALL-E 2 by following every instruction in the prompt, including complex text rendering and specific layout elements like the starburst. DALL-E 2 produced an abstract, messy image with misspelled text and unrecognizable food items, whereas GPT Image 1.5 delivered a professional-quality advertisement.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

AI Judge Analysis

DALL-E 2

+ Captures a messy, amateur chalk aesthetic.

− Fails completely on prompt adherence for text content.
− Text is illegible gibberish.
− Low resolution and grainy image quality.

GPT Image 1.5

+ Perfect prompt adherence for all specific text and dates.
+ Very realistic chalk texture and natural handwriting variations.
+ High visual clarity and professional composition.

− The 'Brown Butter' item is completed even though the prompt was cut off (though this is usually preferred logic).

Verdict: GPT Image 1.5 is the clear winner as it followed every specific text instruction perfectly, rendering complex menu items and dates accurately in a realistic chalk style. DALL-E 2 failed to produce legible words, resulting in meaningless scribbles that did not match the prompt at all.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

AI Judge Analysis

DALL-E 2

+ Decent composition with a clear focal point.
+ Successfully places the subjects in a lunar-like environment.

− Failed the negative constraint; the astronaut is on top, not the horse.
− The image quality is low with significant noise and artifacts.
− The horse's anatomy is distorted, particularly the legs and tail.

GPT Image 1.5

+ Excellent visual quality and high level of detail in the space suit and horse textures.
+ Dynamic and cinematic composition with complex background elements like planets and lunar modules.
+ Strong lighting and coherent rendering of atmospheric dust.

− Failed the negative constraint; the astronaut is on top, not the horse.
− The horse has an extra leg appearing near the hind area.

Verdict: Both models failed the specific prompt instruction to place the horse on top of the astronaut, defaulting instead to the common trope of an astronaut riding a horse. However, GPT Image 1.5 is significantly superior in terms of resolution, cinematic detail, and artistic execution compared to the grainy and anatomically incorrect output from DALL-E 2.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

AI Judge Analysis

DALL-E 2

− Completely failed to follow the prompt.
− Outputted a low-quality image of a black handbag instead of a taxi scene.
− Irrelevant to the textual input provided.

GPT Image 1.5

+ Excellent adherence to all prompt details including the capybara's outfit and the passenger's expression.
+ High visual quality with realistic lighting and depth of field.
+ Accurate rendering of textures like animal fur, jacket fabric, and rain on the windshield.

− Internal car anatomy is slightly confusing regarding the seatbelt placement through the capybara's shoulder.

Verdict: DALL-E 2 suffered a total failure, producing an image of a handbag that had nothing to do with the prompt. GPT Image 1.5 performed exceptionally well, capturing the surreal request with high photorealistic quality and precise attention to the characters' expressions and settings.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

AI Judge Analysis

DALL-E 2

+ Features a hand-painted vintage aesthetic.
+ Has some stylized gothic lettering.

− Text is largely illegible gibberish.
− High level of visual artifacts and low resolution.
− Fails to include several specific prompt requirements, such as the jack-o-lantern.

GPT Image 1.5

+ Excellent text rendering of all requested details.
+ Highly detailed and polished cinematic lighting.
+ Perfect adherence to all prompt elements, including the border, jack-o-lantern, and scroll.

− None notable

Verdict: GPT Image 1.5 followed the prompt perfectly, rendering clear and accurate text for the invitation alongside high-quality visual elements like the thorns and central jack-o-lantern. DALL-E 2 produced an abstract and blurry image with unreadable text that missed most of the specific requirements.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

Community vote: DALL-E 2 0% wins, 0% ties, GPT Image 1.5 100% wins

AI Judge Analysis

DALL-E 2

+ Natural motion and dynamic energy in the puppy's pose.
+ Follows the general prompt theme of animals in a meadow.

− Anatomy is significantly distorted, especially with the extra limbs and warped faces in the background.
− Major artifacts and blurry textures throughout the image.
− Missing the tabby kitten and specific requested animals in a recognizable form.

GPT Image 1.5

+ Excellent adherence to the prompt, including all four specific animals with high detail.
+ Beautiful lighting and atmospheric effects like god rays and dew sparkles.
+ High clarity and realistic fur textures.

− The composition is slightly crowded at the bottom center.
− The butterfly's wing placement on the left is a bit flat.

Verdict: DALL-E 2 fails significantly on anatomical correctness and detail, producing warped, nightmarish versions of the animals with missing elements. GPT Image 1.5, in contrast, perfectly executes the prompt with high-resolution details, all requested animals, and beautiful cinematic lighting.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

AI Judge Analysis

DALL-E 2

+ Matches the light background and warm brown tone requested.
+ Captures an abstract minimalist feel.

− Text consists of nonsensical gibberish.
− The 'steam' and 'cloche' are poorly defined and look like ink blotches.
− Lacks the requested 'Est. 1720' banner.

GPT Image 1.5

+ Excellent text rendering of 'Caffè Florian' and 'Est. 1720'.
+ Clear, recognizable retro cloche dome with stylized steam.
+ Professional vector-style composition and high visual quality.

− Ignored the 'light background' instruction, opting for black.
− Slightly less minimalist than requested due to detailed shading on the cloche.

Verdict: GPT Image 1.5 is the clear winner as it accurately follows the complex text requirements and includes all primary visual elements like the cloche and banner. DALL-E 2 fails significantly on typography and coherent shape definition, resulting in a cluttered and illegible image.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

AI Judge Analysis

DALL-E 2

+ Captures the requested navy, white, and red NASA-inspired color palette.
+ Achieves a high-tech technical schematic aesthetic.

− Fails to follow the specific 6-step structure requested.
− Text is completely unintelligible and includes major spelling errors in the header.
− Lacks the flat-vector iconography requested, appearing more like a cluttered technical drawing.

GPT Image 1.5

+ Perfect adherence to the 6-step narrative requested in the prompt.
+ Excellent text rendering with accurate labels and astronaut names.
+ Clean, modern flat-vector style with crisp lines and a consistent iconographic feel.

− Includes green on Earth, which was not in the specified palette (navy, white, muted red, light gray).
− The crop at the top is slightly tight on the astronaut silhouettes.

Verdict: GPT Image 1.5 is the clear winner as it directly follows the structural instructions of the prompt, creating a logical 6-step infographic with readable, accurate text. DALL-E 2 fails the task significantly, producing nonsensical gibberish text and a chaotic layout that ignores the requested sequence of mission steps.