DALL-E 2 is OpenAI's legacy image generation model, supporting generations, edits with masks (inpainting), and variations.
Settled by community votes across 12 shared challenges, with an AI judge weighing in on each.
DALL-E 2
#37 of 44 in Text-to-Image
Nano Banana
#20 of 44 in Text-to-Image
Where the votes landed
DALL-E 2
0.0%
win rate
Ties
0.0%
Nano Banana
100.0%
win rate
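The page does not say how the win rates are computed. As a minimal sketch, assuming one outcome is recorded per shared challenge (the helper `win_rates` and the outcome labels are hypothetical, not part of the arena), the percentages above could be tallied like this:

```python
from collections import Counter


def win_rates(outcomes, models):
    """Return each model's share of wins, plus ties, as percentages.

    `outcomes` holds one label per challenge: a model name or "tie".
    This format is an assumption made for illustration only.
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("no outcomes to tally")
    counts = Counter(outcomes)  # labels that never occur count as 0
    labels = list(models) + ["tie"]
    return {label: round(100.0 * counts[label] / total, 1) for label in labels}


# Assumed tally: Nano Banana takes all 12 shared challenges, no ties.
outcomes = ["Nano Banana"] * 12
rates = win_rates(outcomes, ["DALL-E 2", "Nano Banana"])
print(rates)
```

If the arena weights individual community votes rather than whole challenges, the same function applies with one label per vote instead of one per challenge.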
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
DALL-E 2
- + Reflective wood surface looks realistic
- + Creative interpretation of colors
- − Failed almost all specific spatial prompt instructions
- − Objects are merged together and lack clear identity
- − The green plant is missing or replaced by a blue shape
Nano Banana
- + Excellent adherence to all spatial relationships in the prompt
- + High visual clarity and realistic lighting quality
- + Accurate rendering of the red book and blue sphere
- − The blue sphere is floating unnaturally within the cube
- − The red book is precariously balanced
Verdict: Nano Banana followed every instruction in the prompt, placing all objects in their correct relative positions with high clarity. DALL-E 2 struggled significantly with prompt adherence, failing to separate the sphere and book or include the plant behind the glass.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
DALL-E 2
- + Strong bokeh and shallow depth of field effects.
- + Effective capture of wet pavement reflections.
- − The subject is heavily obscured and out of focus.
- − Fails to clearly show a Japanese man or the repair action.
Nano Banana
- + Excellent adherence to all prompt elements including rain and ethnicity.
- + High visual clarity and realistic skin textures.
- + Effective environmental storytelling with rain and passing cars.
- − The background cars lack the specific 'motion blur' requested.
- − Composition is quite balanced despite the 'imperfect framing' prompt.
Verdict: Nano Banana successfully interprets the entire prompt, delivering a clear, cinematic, and realistic image of the subject. DALL-E 2 fails the prompt by placing the focal point on the ground and obscuring the main subject in extreme blur.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
DALL-E 2
- + Successfully captures a tight macro-style focus on the armor
- + Includes visible bokeh highlights as requested
- − Anatomically unintelligible and lacks a visible face
- − Extremely low visual quality with muddy, pixelated textures
- − Failed to include braids, beads, and eyes
Nano Banana
- + Excellent adherence to all prompt details including braids, beads, and scars
- + High-quality rendering of various materials like leather, metal, and cloth
- + Strong composition with effective torchlight lighting and bokeh effects
- − The hair braids are somewhat symmetrical and unnaturally stiff
- − The eyes, while clear, have a slightly 'painterly' rather than purely 'lifelike' photorealistic finish
Verdict: Nano Banana is the clear winner as it adheres to every single detail in the prompt, including the specific request for braided hair with beads and lifelike eyes. DALL-E 2 produced an abstract, low-resolution mess that fails to render a discernible character or follow the basic compositional requirements.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
DALL-E 2
- + Strong bold typography and a high-contrast minimalist aesthetic.
- + Artistic interpretation of the food grid using abstract shapes.
- − Fails to include distinct menu text sections for different food categories.
- − The food photos are fragmented and abstract, lacking clarity.
- − Text is illegible and nonsensical.
Nano Banana
- + Perfect adherence to all prompt requirements including sections for appetizers, pizza, and mains.
- + Excellent text legibility and clean professional layout.
- + High-quality, distinct food photography in a clear grid.
- − Minor spelling errors in text (e.g., 'Appetiers', 'Margheiita').
Verdict: Nano Banana followed the prompt perfectly, delivering a functional and aesthetically pleasing menu with identifiable sections and high-quality images. DALL-E 2 produced an abstract design that, while artistic, failed to include any of the requested content sections and suffered from messy typography.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
DALL-E 2
- + Captures the energetic, fiery atmosphere requested in the background
- − Text is heavily garbled and misspelled
- − Visual quality is blurry and lacks photorealistic detail
- − The burger components are poorly defined and look more like smears of light
Nano Banana
- + Perfect text rendering for all requested phrases including the currency symbol
- + High-quality photorealistic textures on the burger components and background
- + Excellent composition with a clear, dynamic 'exploded' layout
- − The meat patty is slightly disproportionate to the bun slices
Verdict: Nano Banana followed every instruction perfectly, producing a professional-grade advertisement with crisp text and realistic textures. DALL-E 2 failed significantly on text rendering, image clarity, and adherence to the specific 'exploded' burger request, resulting in a messy and low-resolution output.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 2
- + Captures a messy, authentic chalk smudge texture.
- − Text is completely illegible and nonsensical.
- − Fails to follow any of the specific menu item prompts.
- − Extremely low resolution and lacks any environmental context.
Nano Banana
- + Excellent adherence to the complex text prompt with perfect spelling.
- + Provides a high-quality, realistic café environment with great lighting.
- + Features professional, stylistically consistent chalk handwriting.
- − The date '30, 2026' has some character overlap.
- − The handwriting looks slightly more like a digital font than natural human variations in some areas.
Verdict: Nano Banana exhibits exceptional prompt adherence by correctly rendering the complex menu items and date as requested, while also providing a rich, high-resolution background. DALL-E 2 fails significantly on this task, producing illegible 'gibberish' text and ignoring the specific menu item instructions entirely.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
DALL-E 2
- + Displays a clear, recognizable texture of a leather object.
- − Failed completely to follow the prompt instructions.
- − Shows a black handbag instead of a taxi scene with a capybara.
Nano Banana
- + Excellent adherence to all complex prompt details including the driver and passenger.
- + Strong photographic quality with convincing low-light city atmosphere.
- + Successfully captures the requested expressions and clothing for both subjects.
- − The capybara's paw/hand anatomy is slightly anthropomorphized.
- − The steering wheel placement is shifted too far to the right relative to the driver.
Verdict: DALL-E 2 suffered a complete failure, generating an irrelevant image of a handbag. Nano Banana executed the prompt with high fidelity, accurately depicting the capybara taxi driver and the bored businesswoman in a realistic urban setting.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
DALL-E 2
- + Stylized hand-drawn aesthetic
- + Effective warm color palette
- − Unreadable garbled text
- − Missing key elements like the jack-o-lantern
Nano Banana
- + Excellent text legibility and accuracy
- + Highly polished atmospheric details
- + Perfect adherence to layout requests
- − Slightly generic digital art style
Verdict: Nano Banana followed all prompt instructions perfectly, including specific text strings and the required jack-o-lantern subject. DALL-E 2 produced a more abstract, vintage-looking poster but failed to render any legible text or the central character of the prompt.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
DALL-E 2
- + Clean aesthetic for the background and base.
- − Failed to render the requested text 'JAPAN' and 'SUSHI' correctly.
- − Missing the flag icon.
- − The objects on the plate are unrecognizable as sushi.
- − Low level of detail and poor material realism.
Nano Banana
- + Perfect adherence to text and icon requirements.
- + Excellent miniature 3D cartoon style with soft, appealing textures.
- + High detail on the sushi pieces and diorama base.
- − None observed; captures every aspect of the prompt.
Verdict: Nano Banana followed every instruction perfectly, including complex text rendering, the flag icon, and a high-quality 3D diorama aesthetic. DALL-E 2 failed significantly, producing unrecognizable objects with misspelt text and missing the core Japanese theme indicators.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
DALL-E 2
- + Features a dynamic sense of motion with the puppy running.
- − Anatomical disasters including messy paws, distorted faces, and extra appendages.
- − The lighting is flat and the butterfly looks like a painting artifact rather than a creature.
- − Lacks the requested animal variety and high-detail fur texture.
Nano Banana
- + Perfectly adheres to the technical prompt including all four specific animals and environmental details.
- + Excellent visual quality with ultra-detailed fur, expressive eyes, and god rays.
- + Balanced and charming composition that captures the 'wholesome vibe' requested.
- − The lighting leans slightly toward a 'digital illustration' fantasy feel rather than strict photo-realism.
- − Minor physics issues with how the butterfly is resting on the puppy's ear.
Verdict: Nano Banana is the clear winner as it successfully rendered all four requested animals with high fidelity and aesthetic appeal, whereas DALL-E 2 produced significant anatomical errors and failed to include the full lineup of animals. Nano Banana also perfectly executed the environmental requests like god rays and dew sparkles, which were largely absent or poorly rendered in DALL-E 2's output.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
DALL-E 2
- + Follows the minimalist prompt with a simple icon and color palette.
- + Includes a stylized cloche dome element.
- − Text is completely illegible and nonsensical.
- − Fails to include the required 'Est. 1720' banner.
- − Graphic elements are rough and lack vector-like polish.
Nano Banana
- + Excellent typography that correctly spells 'Caffè Florian' and 'Est. 1720'.
- + High-quality vector emblem style with clean lines and balanced composition.
- + Successfully incorporates all prompt elements including the banner and subtle paper texture.
- − The 'steam' is represented by abstract flourishes rather than distinct vapor clouds.
Verdict: Nano Banana is the clear winner as it successfully rendered all text elements accurately and followed the complex layout instructions perfectly. DALL-E 2 failed to produce legible text and missed the banner element entirely, resulting in a disorganized and unusable logo compared to the professional execution of Nano Banana.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
DALL-E 2
- + Captures a technical laboratory aesthetic.
- − Text is completely illegible gibberish.
- − The layout is cluttered and disorganized.
- − Fails to follow the requested 6-step infographic structure.
Nano Banana
- + Follows the NASA color palette and flat-vector style perfectly.
- + Text and labels are legible and mostly accurate.
- + Provides specific icons for the requested mission steps in a clean layout.
- − Skips step 5 (Descent) in the sequence.
- − Minor misspelling in 'ARMSRONG'. 'TRANQUILITY' is rendered with one L, which matches the prompt and is the standard US spelling (British English uses two Ls).
Verdict: Nano Banana successfully creates a clean, professional vector infographic that follows the prompt's aesthetic and structural requirements, despite skipping one of the six requested steps. DALL-E 2 fails significantly, producing a cluttered image with illegible text and no clear logical progression of the Apollo 11 mission.
Explore each model
Nano Banana (Gemini 2.5 Flash Image) is optimized for image understanding and generation, balancing price and performance with fast, efficient image generation and editing.