DALL-E 2: OpenAI's legacy image generation model, supporting generations, mask-based edits (inpainting), and variations
Settled by community votes across 6 shared challenges, with an AI judge weighing in on each.
DALL-E 2
#37 of 44 in Text-to-Image
GPT Image 1 Mini
#12 of 44 in Text-to-Image
Where the votes landed
DALL-E 2
0.0%
win rate
Ties
0.0%
GPT Image 1 Mini
100.0%
win rate
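The percentages above can be reproduced with a short sketch. It assumes each of the 6 shared challenges counts as one outcome (a win for one model or a tie); the page does not say how individual community votes are aggregated, so the per-challenge tally, the `outcomes` list, and the `win_rates` helper are all illustrative assumptions rather than the arena's actual method.

```python
from collections import Counter

MODELS = ["DALL-E 2", "GPT Image 1 Mini"]
TIE = "Ties"

# Assumed per-challenge outcomes, one entry per shared challenge.
# These follow the verdicts listed on this page, not raw vote counts.
outcomes = [
    "GPT Image 1 Mini",  # Geometric Composition
    "GPT Image 1 Mini",  # Candid Street Photography
    "GPT Image 1 Mini",  # Fantasy Warrior
    "GPT Image 1 Mini",  # Chalkboard Menu
    "GPT Image 1 Mini",  # The Capybara Taxi Driver
    "GPT Image 1 Mini",  # Adorable Baby Animals in Sunny Meadow
]


def win_rates(results, models, tie_label=TIE):
    """Return the percentage of outcomes per model and for ties, to one decimal."""
    total = len(results)
    if total == 0:
        raise ValueError("at least one outcome is required")
    counts = Counter(results)
    # Counter returns 0 for labels that never appear, so losers show 0.0.
    return {
        label: round(100.0 * counts[label] / total, 1)
        for label in [*models, tie_label]
    }


rates = win_rates(outcomes, MODELS)
for label, pct in rates.items():
    print(f"{label}: {pct:.1f}%")
```

Under that assumption, six wins out of six give GPT Image 1 Mini 100.0% and leave DALL-E 2 and ties at 0.0%, matching the figures shown.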
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
DALL-E 2
- + Features a small glass cube with interesting light refraction.
- + Captures the wooden table texture well.
- − Fails almost all prompt instructions regarding object placement and identity.
- − Confuses the blue sphere with a massive blue background object.
- − The red book is depicted as a red core inside the cube rather than on top.
GPT Image 1 Mini
- + Follows all prompt instructions perfectly, including spatial relationships.
- + High visual clarity and realistic lighting representing the 'soft window light' from the left.
- + Excellent material rendering of glass, paper, and wood.
- − The plant is slightly out of focus, though this meets the 'partially visible' requirement.
Verdict: DALL-E 2 completely failed the spatial reasoning and object identification of the prompt, blending the colors into a single abstract object. GPT Image 1 Mini followed every instruction accurately, producing a coherent and high-quality scene that perfectly matches the requested composition.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
DALL-E 2
- + Matches the 'imperfect framing' requirement well
- + Captures a very realistic 50mm shallow depth of field effect
- + Good wet pavement reflections
- − Subject is almost completely out of focus/blurred
- − Fails to clearly show a 'Japanese man' due to focus issues
- − Low resolution and grainy texture
GPT Image 1 Mini
- + Excellent adherence to the 'elderly Japanese man' subject with natural skin texture
- + Highly detailed rendering of the wet bicycle and rain drops
- + Strong cinematic composition while remaining realistic
- − The bokeh in the background is static; lacks the requested 'motion blur from passing cars'
- − Framing feels a bit too perfect and centered despite the prompt's request for 'imperfect framing'
Verdict: GPT Image 1 Mini is the clear winner as it provides a coherent, high-quality image that satisfies the core subject requirements (man, red bicycle, rain). While DALL-E 2 attempted to lean into the 'imperfect photography' aspects of the prompt, it resulted in a muddy, unrecognizable subject, whereas GPT Image 1 Mini delivered professional-grade cinematic realism.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
DALL-E 2
- + Successfully captures a tight macro-style focus.
- + Uses high-contrast lighting that conveys a gritty atmosphere.
- − Severely lacks image coherence and clarity, appearing as a messy collage of textures.
- − Failed to render identifiable eyes, braided hair, or recognizable armor features.
- − Composition is confusing and lacks a clear focal point.
GPT Image 1 Mini
- + Excellent adherence to all prompt details including braided hair, scars, and ornate engraving.
- + High visual quality with realistic skin textures and lifelike eyes.
- + Beautiful lighting and bokeh that create a cinematic depth of field.
- − The 'small beads' requested in the hair are subtle to the point of being nearly invisible.
Verdict: GPT Image 1 Mini outperformed DALL-E 2 in every metric, delivering a clear and detailed portrait that followed the complex prompt precisely. DALL-E 2 failed to produce a coherent image, resulting in a distorted mess of textures that lacked a recognizable human face or the specific requests like braided hair.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 2
- + Captures a messy, authentic chalk texture
- − Text is completely illegible and gibberish
- − Fails to follow specific menu item instructions
- − The image is low resolution and poorly framed
GPT Image 1 Mini
- + Excellent text rendering with 100% accuracy to the prompt
- + Captures a realistic chalk grain texture on the letters
- + Clean and balanced composition with a professional wooden frame
- − The handwriting style is a bit too uniform, leaning towards a digital font feel despite the grain
Verdict: GPT Image 1 Mini followed the prompt perfectly, rendering the specific menu items and date with perfect spelling and legibility. In contrast, DALL-E 2 produced an illegible mess of pseudo-letters that failed every requirement of the text-to-image challenge.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
DALL-E 2
- − Failed completely to follow the prompt.
- − Generated an image of a black handbag instead of a taxi scene.
- − Irrelevant to the user request.
GPT Image 1 Mini
- + Excellent adherence to all prompt details including the capybara, businesswoman, and setting.
- + High visual quality with realistic textures on the fur and leather jacket.
- + Successfully captured the lighting and atmosphere of a New York taxi at night.
- − The passenger's hand holding the phone looks slightly distorted.
- − The capybara has its right paw on the wheel but the left is less clearly visible.
Verdict: DALL-E 2 suffered a total failure, generating a close-up of a black handbag that has no relation to the prompt provided. GPT Image 1 Mini followed the complex prompt near-perfectly, delivering a high-quality, cinematic image that accurately captures the surreal concept with photorealistic detail.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
DALL-E 2
- + Natural lighting on the puppy's fur
- + Good sense of movement
- − Severe anatomical distortions and blurred subjects
- − Failed to render distinct animal types clearly
GPT Image 1 Mini
- + Excellent clarity and adherence to all animal types
- + Superb golden hour lighting with god rays
- − Somewhat repetitive poses for the animals
Verdict: GPT Image 1 Mini captured the prompt's complexity perfectly, rendering all four requested animals with high fidelity and beautiful lighting. DALL-E 2 struggled with the multiple subjects, resulting in significant artifacts and a lack of coherent detail.
Explore each model
GPT Image 1 Mini: OpenAI's cost-effective image generation model for when image quality isn't the top priority