DALL-E 2 (OpenAI) vs. FLUX.2 [max] (Black Forest Labs)

Settled by community votes across 9 shared challenges, with an AI judge weighing in on each.

DALL-E 2

17.7 arena score

#37 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

FLUX.2 [max]

25.9 arena score

#11 of 44 in Text-to-Image

Vote tally

Where the votes landed

DALL-E 2: 0.0% win rate

Ties: 0.0%

FLUX.2 [max]: 100.0% win rate

Shared challenges: 9

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Votes: DALL-E 2 0%, ties 0%, FLUX.2 [max] 100%

AI Judge Analysis

DALL-E 2

+ Features a small glass cube on a wooden surface.
+ Correctly places a plant in the background.

− Fails to include the blue sphere inside the cube (the cube itself appears red).
− Fails to place a red book on top of the cube.
− The scale of objects is confusing with a massive blue pot behind a tiny cube.

FLUX.2 [max]

+ Excellent prompt adherence including the blue sphere inside and a red book on top.
+ High-quality rendering with realistic light rays and shadows.
+ Accurately represents the transparency and refraction of the glass cube.

− The light comes more from the front-right than the requested left side.

Verdict: DALL-E 2 failed significantly on the spatial logic of the prompt, omitting the red book and the blue sphere and confusing the objects' colors. FLUX.2 [max] followed nearly every instruction, producing a photorealistic image with correct object placement and impressive secondary details such as reflections and refraction; its only miss was the light, which comes from the front-right rather than the left.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

AI Judge Analysis

DALL-E 2

+ Captures the 'imperfect framing' part of the prompt well
+ Atmospheric use of puddle reflections

− Extreme blur renders the subject unidentifiable as an elderly Japanese man
− Fails to show any detail of the bicycle or the act of repairing it
− Composition is poor and lacks a clear focal point

FLUX.2 [max]

+ Excellent adherence to all prompt details including age, ethnicity, and environment
+ High visual quality with realistic skin texture and rain effects
+ Successfully balances motion blur in background cars with sharp detail on the subject

− The composition is a bit too 'perfect' for a requested 'candid street photo'
− Minor anatomical/mechanical clipping where the kickstand meets the ground

Verdict: DALL-E 2 struggled significantly with the prompt, producing an over-blurred image where the subject is unrecognizable. In contrast, FLUX.2 [max] followed every instruction accurately, delivering a cinematic, detailed, and realistic depiction of the scene with high technical fidelity.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

AI Judge Analysis

DALL-E 2

+ Strong bold sans-serif typography as requested.
+ Creative approach to food photography as a geometric mosaic.

− Failed the layout requirements entirely, appearing more like a magazine spread than a menu.
− Text is completely unintelligible and garbled.
− Images of food are distorted and low resolution.

FLUX.2 [max]

+ Excellent adherence to the menu structure, including sections for Appetizers, Pizza, and Mains.
+ High-quality, realistic food photography organized in a clean grid.
+ Professional layout with clear pricing, icons, and social media handles.

− Minor text artifacts in the smaller descriptions.
− Content is somewhat repetitive (mostly pizzas shown in photos regardless of category).

Verdict: FLUX.2 [max] produced a highly professional and usable menu design that perfectly followed every aspect of the prompt, including the specific sections requested. DALL-E 2 failed to create a functional menu, instead generating a distorted book-like layout with nonsensical text and poor image quality.

Magic Burger Explosion: Fiery Photorealism Challenge

Text-to-Image

“Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”

AI Judge Analysis

DALL-E 2

+ Strong artistic, glowing effect on the burger elements
+ Dynamic sense of motion with particles

− Text is misspelled and illegible
− The burger components lack photorealistic detail and look distorted
− Failed to include the price starburst or the secondary message

FLUX.2 [max]

+ Perfect text rendering of all requested phrases including the price
+ Highly photorealistic textures on the bun, meat, and vegetables
+ Accurately follows all prompt requirements, including the 'exploded' layout and the price starburst

− The composition is a bit crowded with the large starburst overlapping the background elements

Verdict: FLUX.2 [max] is the clear winner as it followed every instruction in the prompt, including complex text and specific price formatting. DALL-E 2 struggled significantly with text legibility and image clarity, producing a muddy result where the burger components were difficult to identify.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

AI Judge Analysis

DALL-E 2

+ None significant.

− Completely failed to follow the prompt by generating a black leather handbag instead of the requested scene.
− Resolution and texture look like a dated digital render of a physical object.
− Missing all requested characters and environment.

FLUX.2 [max]

+ Excellent adherence to all prompt details including the capybara, its clothing, and the bored businesswoman.
+ High cinematic visual quality with realistic lighting and depth of field.
+ Successfully captures the specific requested mood and composition.

− The capybara has distinct human hands on the steering wheel instead of paws.
− The proportions of the car's window frame and pillar are slightly distorted.

Verdict: DALL-E 2 suffered a complete failure, providing an image of a handbag that bears no relevance to the prompt. FLUX.2 [max] delivered a high-quality, humorous, and accurate interpretation of the scene, though it struggled with the specific request for animal paws by giving the capybara human hands.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

AI Judge Analysis

DALL-E 2

+ Matches the diagonal isometric perspective requested.

− Fails significantly on text rendering, displaying 'Sush' instead of 'JAPAN' and 'SUSHI'.
− Low visual quality with blurred textures and indistinct objects.
− Missing requested elements like the small flag icon and diorama base.

FLUX.2 [max]

+ Perfect adherence to all prompt instructions including text, flag icon, and diorama layout.
+ High-quality 3D cartoon aesthetic with clean, refined textures and lighting.
+ Excellent composition and center-alignment for the requested square format.

− Minor shadow clipping on the very bottom edge of the diorama base.

Verdict: FLUX.2 [max] followed every instruction perfectly, delivering professional-grade 3D graphics and accurate text rendering. DALL-E 2 failed on most of the specific requirements, including text accuracy, object clarity, and the diorama base requested.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

AI Judge Analysis

DALL-E 2

+ Successfully captures a sense of movement and energy
+ Warm golden lighting fits the requested sunrise theme

− Severely lacks anatomical accuracy with distorted facial features and merged bodies
− Fails to include all requested animals clearly, missing the bunny entirely
− Significant artifacts and blurry textures throughout the image

FLUX.2 [max]

+ Excellent adherence to the prompt, including all four specific animals and butterflies
+ Superior visual quality with sharp fur textures and photorealistic details
+ Beautiful composition with professional-level lighting, god rays, and dew effects

− The fox kit has slightly elongated limbs that look a bit unnatural in mid-air
− The butterflies are somewhat static rather than blurred by motion

Verdict: FLUX.2 [max] is the clear winner, delivering a high-fidelity image that meticulously follows every detail of the prompt including the specific list of animals and atmospheric effects. In contrast, DALL-E 2 produced a low-resolution, distorted image with significant anatomical errors and missing elements.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

Votes: DALL-E 2 0%, ties 0%, FLUX.2 [max] 100%

AI Judge Analysis

DALL-E 2

+ Matches the requested warm brown palette
+ Includes a cloche icon

− Text is nonsensical gibberish
− The 'steam' icon is poorly rendered and disconnected
− Graphic elements are jagged and lack professional vector quality

FLUX.2 [max]

+ Perfect text rendering for both name and date
+ Clean vector-style composition with high-quality lines
+ Accurately represents all prompt elements including the banner and subtle paper texture

− Minimal shading on the cloche could be more nuanced

Verdict: FLUX.2 [max] is the clear winner as it perfectly followed all instructions, including difficult text rendering and specific layout elements like the banner. DALL-E 2 produced a low-quality, illegible graphic that failed to render the requested name or established date correctly.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

AI Judge Analysis

DALL-E 2

+ Features the requested navy, white, and red color palette
+ Captures a complex diagrammatic feel

− Text consists entirely of illegible gibberish and misspellings like 'ALLPOO'
− Fails to follow the logical step-by-step sequence requested
− Composition is cluttered and chaotic

FLUX.2 [max]

+ Excellent adherence to the requested flat-vector style and iconography
+ Accurately represents the specific steps (Launch, Orbit, Translunar, etc.)
+ Highly legible text and clean, modern layout

− Labeling error where 'Earth Orbit' is repeated twice on different steps
− Uniform silhouettes look more like modern airline pilots than 1960s astronauts

Verdict: FLUX.2 [max] clearly outperforms DALL-E 2 by providing a coherent, logical, and aesthetically pleasing infographic that follows the prompt's instructions for specific steps and styling. While DALL-E 2 produced a messy abstract design with nonsensical text, FLUX.2 [max] created a functional educational graphic with consistent iconography and readable labels.

Next steps

Explore each model

DALL-E 2

OpenAI

OpenAI's legacy image generation model supporting generations, edits with masks (inpainting), and variations

FLUX.2 [max]

Black Forest Labs

Black Forest Labs' flagship image generation model delivering state-of-the-art quality with exceptional realism, precision, and consistency for both text-to-image and advanced image editing
