DALL-E 2 (OpenAI) vs. GPT Image 1 Mini (OpenAI)

Settled by community votes across 6 shared challenges, with an AI judge weighing in on each.

DALL-E 2

17.7 arena score

#37 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

GPT Image 1 Mini

25.3 arena score

#12 of 44 in Text-to-Image

Vote tally

Where the votes landed

DALL-E 2

0.0%

win rate

Ties

0.0%

GPT Image 1 Mini

100.0%

win rate

Shared challenges 6
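The percentages above can be reproduced with a small tally. This is a minimal sketch, not the site's actual scoring code: it assumes each shared challenge resolves to a single outcome (one model or a tie) and that a win rate is the share of challenges with that outcome. The function name `win_rates` and the `outcomes` list are hypothetical, chosen only to be consistent with the 0.0% / 0.0% / 100.0% split reported for the 6 shared challenges.

```python
from collections import Counter


def win_rates(outcomes, model_a, model_b):
    """Return percentage shares for model_a wins, ties, and model_b wins.

    `outcomes` holds one entry per challenge: model_a, model_b, or "tie".
    Percentages are rounded to one decimal place.
    """
    if not outcomes:
        raise ValueError("need at least one outcome")
    counts = Counter(outcomes)  # missing keys count as 0
    total = len(outcomes)

    def pct(key):
        return round(100.0 * counts[key] / total, 1)

    return {model_a: pct(model_a), "tie": pct("tie"), model_b: pct(model_b)}


# Hypothetical per-challenge outcomes (an assumption, not data from the page):
# six challenges, all decided for GPT Image 1 Mini.
outcomes = ["GPT Image 1 Mini"] * 6
rates = win_rates(outcomes, "DALL-E 2", "GPT Image 1 Mini")
print(rates)
```

If the arena instead computes win rates over individual votes rather than challenges, the same function applies with one entry per vote.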

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

DALL-E 2

GPT Image 1 Mini

AI Judge Analysis

DALL-E 2

+ Features a small glass cube with interesting light refraction.
+ Captures the wooden table texture well.

− Fails almost all prompt instructions regarding object placement and identity.
− Confuses the blue sphere with a massive blue background object.
− The red book is depicted as a red core inside the cube rather than on top.

GPT Image 1 Mini

+ Follows all prompt instructions perfectly, including spatial relationships.
+ High visual clarity and realistic lighting representing the 'soft window light' from the left.
+ Excellent material rendering of glass, paper, and wood.

− The plant is slightly out of focus, though this meets the 'partially visible' requirement.

Verdict: DALL-E 2 failed the prompt's spatial reasoning and object identification entirely, blending the colors into a single abstract object. GPT Image 1 Mini followed every instruction accurately, producing a coherent, high-quality scene that perfectly matches the requested composition.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

DALL-E 2

GPT Image 1 Mini

AI Judge Analysis

DALL-E 2

+ Matches the 'imperfect framing' requirement well
+ Captures a very realistic 50mm shallow depth of field effect
+ Good wet pavement reflections

− Subject is almost completely out of focus/blurred
− Fails to clearly show a 'Japanese man' due to focus issues
− Low resolution and grainy texture

GPT Image 1 Mini

+ Excellent adherence to the 'elderly Japanese man' subject with natural skin texture
+ Highly detailed rendering of the wet bicycle and rain drops
+ Strong cinematic composition while remaining realistic

− The bokeh in the background is static; lacks the requested 'motion blur from passing cars'
− Framing feels a bit too perfect and centered despite the prompt's request for 'imperfect framing'

Verdict: GPT Image 1 Mini is the clear winner: it delivers a coherent, high-quality image that satisfies the core subject requirements (man, red bicycle, rain). DALL-E 2 leaned into the 'imperfect photography' aspects of the prompt but produced a muddy, unrecognizable subject, whereas GPT Image 1 Mini delivered professional-grade cinematic realism.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

DALL-E 2

GPT Image 1 Mini

AI Judge Analysis

DALL-E 2

+ Successfully captures a tight macro-style focus.
+ Uses high-contrast lighting that conveys a gritty atmosphere.

− Severely lacks image coherence and clarity, appearing as a messy collage of textures.
− Failed to render identifiable eyes, braided hair, or recognizable armor features.
− Composition is confusing and lacks a clear focal point.

GPT Image 1 Mini

+ Excellent adherence to all prompt details including braided hair, scars, and ornate engraving.
+ High visual quality with realistic skin textures and lifelike eyes.
+ Beautiful lighting and bokeh that create a cinematic depth of field.

− The 'small beads' requested in the hair are subtle to the point of being nearly invisible.

Verdict: GPT Image 1 Mini outperformed DALL-E 2 on every criterion, delivering a clear and detailed portrait that followed the complex prompt precisely. DALL-E 2 failed to produce a coherent image; the result was a distorted mess of textures with no recognizable human face and none of the specific requested details, such as braided hair.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

DALL-E 2

GPT Image 1 Mini

AI Judge Analysis

DALL-E 2

+ Captures a messy, authentic chalk texture

− Text is completely illegible and gibberish
− Fails to follow specific menu item instructions
− The image is low resolution and poorly framed

GPT Image 1 Mini

+ Excellent text rendering with 100% accuracy to the prompt
+ Captures a realistic chalk grain texture on the letters
+ Clean and balanced composition with a professional wooden frame

− The handwriting style is a bit too uniform, leaning towards a digital font feel despite the grain

Verdict: GPT Image 1 Mini followed the prompt perfectly, rendering the specific menu items and date with flawless spelling and legibility. In contrast, DALL-E 2 produced an illegible mess of pseudo-letters that failed every requirement of the text-to-image challenge.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

DALL-E 2

GPT Image 1 Mini

AI Judge Analysis

DALL-E 2

− Failed completely to follow the prompt.
− Generated an image of a black handbag instead of a taxi scene.
− Irrelevant to the user request.

GPT Image 1 Mini

+ Excellent adherence to all prompt details including the capybara, businesswoman, and setting.
+ High visual quality with realistic textures on the fur and leather jacket.
+ Successfully captured the lighting and atmosphere of a New York taxi at night.

− The passenger's hand holding the phone looks slightly distorted.
− The capybara has its right paw on the wheel but the left is less clearly visible.

Verdict: DALL-E 2 suffered a total failure, generating a close-up of a black handbag that has no relation to the prompt provided. GPT Image 1 Mini followed the complex prompt near-perfectly, delivering a high-quality, cinematic image that accurately captures the surreal concept with photorealistic detail.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

DALL-E 2

GPT Image 1 Mini

Votes: DALL-E 2 0% wins, 0% ties, GPT Image 1 Mini 100% wins

AI Judge Analysis

DALL-E 2

+ Natural lighting on the puppy's fur
+ Good sense of movement

− Severe anatomical distortions and blurred subjects
− Failed to render distinct animal types clearly

GPT Image 1 Mini

+ Excellent clarity and adherence to all animal types
+ Superb golden hour lighting with god rays

− Somewhat repetitive poses for the animals

Verdict: GPT Image 1 Mini captured the prompt's complexity perfectly, rendering all four requested animals with high fidelity and beautiful lighting. DALL-E 2 struggled with the multiple subjects, resulting in significant artifacts and a lack of coherent detail.