DALL-E 3 vs GPT Image 1.5 — AI Image Model Comparison

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ High artistic detail with life-like textures on wood and paper
+ Creative interpretation of the blue sphere containing a miniature landscape

− Failed multiple spatial instructions: the book is inside the cube instead of on top
− The cube has a wooden frame, which was not requested
− The plant is not visible through the glass

GPT Image 1.5

+ Perfect adherence to all spatial instructions: book on top, sphere inside, plant behind
+ Clean and realistic refraction and reflection details on the glass
+ Correct execution of soft lighting from the left

− Simple composition with less artistic complexity than DALL-E 3

Verdict: GPT Image 1.5 achieved perfect prompt adherence, correctly placing the red book on top of the cube and the blue sphere inside it, while also showing the plant through the glass as requested. DALL-E 3 failed the spatial logic by placing the book inside the cube and adding a wooden frame that wasn't asked for, despite its high level of textural detail.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent use of reflections on the wet pavement
+ Creative foreground framing that enhances the candid atmosphere
+ Successfully captures motion blur on the passing vehicle

− The man's foot has anatomical issues and appears merged with the ground
− The 'no stylization' instruction was followed less closely, resulting in a more digital, illustrative look
− The man is barefoot in the rain, which feels unrealistic for the context

GPT Image 1.5

+ Outstanding realism and natural skin texture
+ Perfect adherence to the 'no stylization' and 'cinematic but realistic' prompt
+ Accurate depiction of rain droplets on surfaces and clothing

− Missed the 'motion blur from passing cars' request as the background car is sharp
− The framing is more conventional and lacks the 'imperfect' feel requested

Verdict: GPT Image 1.5 wins this challenge due to its exceptional realism and adherence to the request for natural skin texture and no stylization, making it look like a genuine photograph. While DALL-E 3 captured the motion blur and reflections effectively, it suffered from anatomical glitches and a more artificial, rendered quality.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Ornate and complex engraving on the armor
+ Dynamic lighting with strong highlights and bokeh sparks
+ Consistent and stylized rendering of the beads and braided hair

− Has a slightly 'plastic' or overly smoothed digital art look
− Anatomical issues with the helmet-hair integration and ear area

GPT Image 1.5

+ Exceptional lifelike skin texture with realistic dirt and scars
+ Superior rendering of materials including weathered leather and frayed cloth
+ High-fidelity hair braids with distinct, realistic beads

− The bokeh sparks are less prominent than requested
− Lower contrast in lighting compared to DALL-E 3

Verdict: While DALL-E 3 captures a heroic, polished aesthetic with intricate armor designs, GPT Image 1.5 is the clear winner for its realism. It renders the leather, cloth, and skin textures convincingly enough to read as a genuine photograph, whereas DALL-E 3's result looks more synthetic.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Successfully incorporates a complex grid layout for diverse food photography.
+ Features contemporary color blocks and accents that feel high-design.
+ Captures a sophisticated minimalist aesthetic across multiple menu concepts.

− Text is largely illegible and contains gibberish characters.
− Layout feels more like a portfolio mockup than a functional, readable menu.

GPT Image 1.5

+ Excellent text rendering with clear, legible headers and item descriptions.
+ Directly adheres to the requested sections for Appetizers, Pizza, and Mains.
+ High-quality, realistic food photography that fits a casual dining context.

− The layout is relatively simple and safe compared to the 'modern minimalist' request.
− Less variety in the grid design than requested.

Verdict: GPT Image 1.5 is the clear winner because it produces a functional menu with perfectly legible English text and logically organized sections. While DALL-E 3 offers more creative graphic design layouts, its inability to render readable text makes it useless as a menu template compared to the professional utility of GPT Image 1.5.

Magic Burger Explosion: Fiery Photorealism Challenge

Text-to-Image

“Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent sense of motion and 'exploded' levitation
+ Clean lighting and vibrant color contrast between the cooling blues and fiery oranges
+ High-quality rendering of individual food textures like the seared patty and melting cheese

− Several spelling errors including 'MAGIC BURGR' and 'Limiited'
− The price is contained in a box rather than the requested starburst
− Abstract orange blocks floating in the air feel disconnected from the food theme

GPT Image 1.5

+ Perfect text rendering with zero spelling errors in all requested fields
+ Accurate inclusion of the 'starburst' element for the price
+ Strong fiery atmosphere with embers and smoke that matches the prompt's tone

− The 'exploded' effect is slightly less dynamic than DALL-E 3's, appearing more like a tall stack
− Lighting is a bit flat and overly warm compared to the more dynamic lighting in DALL-E 3's image

Verdict: While DALL-E 3 captures a more cinematic and dynamic 'exploded' motion, it fails significantly on text accuracy and specific layout elements like the starburst. GPT Image 1.5 follows every instruction perfectly, including complex text and specific graphical elements, making it the superior choice for an actual advertisement.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent artistic flourishes and chalkboard aesthetic
+ Strong lighting and atmospheric composition

− Numerous spelling errors including 'Trufle', 'Occtus', and 'Grililled'
− Inaccurate pricing and inclusion of gibberish text throughout

GPT Image 1.5

+ Perfect adherence to text and spelling requirements
+ Highly realistic chalk texture with natural handwriting variations
+ Clean and legible layout that follows the prompt exactly

− Simpler composition compared to the atmospheric lighting of DALL-E 3's image

Verdict: GPT Image 1.5 is the clear winner as it followed all textual instructions perfectly, including specific spelling and pricing, whereas DALL-E 3 struggled significantly with spelling and coherence. While DALL-E 3 offered a more complex visual environment, GPT Image 1.5 provided the 'exact' handwriting style and accuracy requested.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent ethereal and cinematic lighting
+ Creative interpretation of the horse riding on translucent clouds
+ High-quality rendering of the nebulous background

− Completely failed the negative constraint to avoid an astronaut riding a horse
− The anatomy of the horse's back legs is slightly warped

GPT Image 1.5

+ Very high level of textural detail in the space suit and horse hide
+ Complex composition with multiple celestial elements
+ Consistent lighting across the terrain and characters

− Completely failed the negative constraint; the astronaut is clearly on top of the horse
− The scale of the Saturn-like planet in the background feels slightly discordant

Verdict: Both DALL-E 3 and GPT Image 1.5 failed the specific 'horse on top' prompt instruction, instead providing the cliché image of an astronaut riding a horse. GPT Image 1.5 is the preferred output due to its significantly higher level of detail, better textural clarity, and more grounded cinematic style compared to the softer, dreamier look of DALL-E 3.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent texture on the capybara's fur and the taxi interior
+ High resolution with vivid, atmospheric lighting
+ Creative background detail with a 'CAPYBARA' branded shop outside

− Completely failed to include the human passenger in the back seat
− The capybara's paws are merged oddly into the steering wheel
− Composition is very tight and cuts off most of the vehicle context

GPT Image 1.5

+ Perfect adherence to all prompt elements, including the bored businesswoman passenger
+ Very realistic 'cinematic' photography style with natural bokeh
+ Both paws are clearly visible and correctly positioned on the wheel

− The capybara's hat is slightly floating and not perfectly fitted to its head
− Slightly less 'vibrant' color palette compared to DALL-E 3

Verdict: GPT Image 1.5 is the clear winner as it successfully included all requested elements, specifically the human passenger which DALL-E 3 omitted entirely. GPT Image 1.5 also achieved a much more convincing photorealistic look with better spatial composition and anatomical accuracy for the capybara's paws.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Ornate and complex 3D border design
+ Strong cinematic lighting with depth
+ Includes intricate thorns and twisted trees

− Majority of the requested text is garbled or illegible
− Fails to render the requested location and time correctly

GPT Image 1.5

+ Follows text instructions perfectly with high legibility
+ Accurate inclusion of date, time, and location
+ Features all requested elements including webs, thorns, and banner

− Composition feels slightly crowded with the large jack-o-lantern
− Background textures are a bit grainy compared to DALL-E 3

Verdict: GPT Image 1.5 is the clear winner as it successfully rendered all the specific text details requested in the prompt, which DALL-E 3 failed to do. While DALL-E 3 created a more complex and atmospheric frame, it produced illegible text for the event details and invitation message.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent depiction of god rays and sunrise lighting.
+ Charming, expressive character design that fits a wholesome theme.

− Very stylized and illustrative rather than the requested hyper-photorealistic.
− The butterflies have weird animal-like bodies and faces, which is an anatomical artifact.
− The kitten and fox look more like toys or plushies than real animals.

GPT Image 1.5

+ Much closer to a photorealistic style as requested in the prompt.
+ Contains all four specific animals with realistic fur textures and anatomy.
+ Beautiful interaction with the environment, including dew sparkles and natural-looking meadow flowers.

− The fox kit has a slightly awkward extra-wide mouth with some tooth artifacts.
− The lighting is a bit hazy and lacks the sharp rays seen in DALL-E 3's image.

Verdict: While DALL-E 3 creates a very charming and vibrant illustration, it fails the 'hyper-photorealistic' requirement and has strange chimeric butterfly-birds. GPT Image 1.5 successfully balances realism with the requested warmth and delivers a much more convincing set of baby animals in a natural environment.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent execution of vector emblem style
+ Clean typography and balanced circular composition
+ Uses the requested brown and cream color palette effectively

− Failed to include the requested specific text 'Caffè Florian', using 'COFFEE HOUSE' instead
− The texture is very subtle compared to the prominence of the shapes

GPT Image 1.5

+ Perfectly followed text instructions for 'Caffè Florian'
+ Accurate inclusion of the 'Est. 1720' banner as requested
+ Strong retro cloche illustration with realistic steam

− Ignored the 'light background' instruction, providing a black background instead
− Text layout on the banner is slightly off-center and cramped

Verdict: GPT Image 1.5 is the clear winner because it correctly followed the prompt's specific text requirements for 'Caffè Florian', whereas DALL-E 3 generated a generic 'Coffee House' logo. While DALL-E 3 had a more polished vector layout and adhered to the light background instruction, the failure to render the primary brand name makes it less useful for the specific request.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

DALL-E 3

GPT Image 1.5

AI Judge Analysis

DALL-E 3

+ Excellent artistic style with a vintage space-age aesthetic.
+ Captures the NASA-inspired color palette perfectly.
+ High visual complexity and interesting balance between three poster variations.

− Fails significantly on infographic utility as the text is illegible and meaningless.
− The logic and sequence of the requested steps are cluttered and hard to follow.
− Includes several unrelated rocket designs like space shuttles.

GPT Image 1.5

+ Perfect adherence to the 6-step logical sequence requested.
+ Very clean and readable typography with legible labels for steps and crew.
+ Adheres strictly to the flat-vector style and iconography requirements.

− The composition is a bit basic and feels more like a slide deck than an 'infographic poster'.
− Slightly less creative in terms of artistic flair compared to DALL-E 3.

Verdict: DALL-E 3 produces a beautiful, high-quality artistic interpretation that looks great as a poster but fails completely as an infographic because it is illegible and ignores the specific narrative steps. GPT Image 1.5 followed every instruction, providing all 6 requested steps in a clear, logical, and clean format with legible text, making it much more useful for the intended purpose.
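The twelve verdicts above can be tallied with a few lines of Python. This is only a sketch: the `VERDICTS` mapping is a hypothetical structure, transcribed by hand from one reading of each verdict, and is not data published by the comparison itself. The Reversed Rodeo is counted for GPT Image 1.5 because its verdict names it "the preferred output", even though both models failed that prompt.

```python
from collections import Counter

# Favored model per challenge, transcribed by hand from the judge verdicts.
# Assumption: each verdict is read as favoring exactly one model.
VERDICTS = {
    "Geometric Composition": "GPT Image 1.5",
    "Candid Street Photography": "GPT Image 1.5",
    "Fantasy Warrior": "GPT Image 1.5",
    "Modern Clean Menu": "GPT Image 1.5",
    "Magic Burger Explosion": "GPT Image 1.5",
    "Chalkboard Menu": "GPT Image 1.5",
    "The Reversed Rodeo": "GPT Image 1.5",  # both failed; preferred output
    "The Capybara Taxi Driver": "GPT Image 1.5",
    "The Halloween Invitation": "GPT Image 1.5",
    "Adorable Baby Animals in Sunny Meadow": "GPT Image 1.5",
    "Vintage Cafe Logo": "GPT Image 1.5",
    "Apollo 11: Journey to Tranquility": "GPT Image 1.5",
}


def tally(verdicts):
    """Count how many challenges each model was favored in."""
    return Counter(verdicts.values())


counts = tally(VERDICTS)
for model in ("DALL-E 3", "GPT Image 1.5"):
    # Counter returns 0 for a model that was never favored.
    print(f"{model}: favored in {counts[model]} of {len(VERDICTS)} challenges")
```

A tally like this hides the margin in each challenge: Vintage Cafe Logo and The Reversed Rodeo were much closer calls than the text-rendering challenges.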

Models compared: DALL-E 3 (OpenAI) and GPT Image 1.5 (OpenAI)
