DALL-E 2 (OpenAI) vs. FLUX.2 [dev] (Black Forest Labs)

Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.

DALL-E 2

17.7 arena score

#37 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

FLUX.2 [dev]

24.5 arena score

#17 of 44 in Text-to-Image

Vote tally

Where the votes landed

DALL-E 2: 0.0% win rate

Ties: 0.0%

FLUX.2 [dev]: 100.0% win rate

Shared challenges: 4
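For readers who want to see how a tally like this turns into percentages, here is a minimal sketch. It assumes each figure is that outcome's share of all head-to-head votes, rounded to one decimal place; the page does not state the exact formula. The `VoteTally` class, the `percentages` function, and the vote counts in the example are hypothetical and are not taken from the arena.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoteTally:
    """Hypothetical container for head-to-head vote counts."""
    wins_a: int  # votes for the first model
    ties: int    # votes declaring a tie
    wins_b: int  # votes for the second model


def percentages(tally: VoteTally) -> tuple[float, float, float]:
    """Return (win rate A, tie rate, win rate B) as percentages.

    Assumption: each rate is that outcome's share of all votes cast,
    rounded to one decimal place.
    """
    total = tally.wins_a + tally.ties + tally.wins_b
    if total == 0:
        # No votes yet: report zeros instead of dividing by zero.
        return (0.0, 0.0, 0.0)
    return (
        round(100 * tally.wins_a / total, 1),
        round(100 * tally.ties / total, 1),
        round(100 * tally.wins_b / total, 1),
    )


# Made-up counts for illustration: every vote goes to the second model.
example = VoteTally(wins_a=0, ties=0, wins_b=12)
print(percentages(example))
```

Under this assumption, any tally in which one model receives every vote produces the same 0.0% / 0.0% / 100.0% split regardless of how many votes were cast, so the percentages alone do not show the sample size.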

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Votes: DALL-E 2 0% wins, 0% ties, FLUX.2 [dev] 100% wins

AI Judge Analysis

DALL-E 2

+ Decent glass reflection on the table surface.

− Failed almost every prompt instruction.
− The blue sphere is missing, replaced by a giant blue pot.
− The red book is missing, replaced by a red core inside the cube.
− Poor image clarity and scale.

FLUX.2 [dev]

+ Perfect adherence to all spatial and object requirements.
+ High photorealism with realistic glass refraction and soft lighting.
+ Excellent composition and depth of field.

− The window light is slightly more direct than 'soft', though still pleasant.

Verdict: FLUX.2 [dev] followed every detail of the prompt accurately, successfully placing the blue sphere inside the cube and the red book on top. DALL-E 2 failed significantly, confusing the colors and positions of the objects, resulting in a giant blue pot in the background and no book at all.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

AI Judge Analysis

DALL-E 2

+ Successfully captures reflections on wet pavement
+ Achieves an extremely shallow depth of field

− The subject and bicycle are completely out of focus, losing all detail
− Fails to show that the man is elderly or Japanese due to blur
− Low resolution and lacks the requested cinematic realism

FLUX.2 [dev]

+ Excellent adherence to all prompt details, including the elderly Japanese man and the red bicycle
+ Highly realistic skin textures and wet weather effects
+ Perfectly executes the motion blur of passing cars in the background

− The framing is quite centered, missing the 'imperfect framing' request slightly

Verdict: FLUX.2 [dev] followed every aspect of the prompt with high fidelity, creating a poignant and technically impressive photo with realistic textures and lighting. DALL-E 2 struggled significantly, producing a muddy, out-of-focus image where the primary subjects are unrecognizable, failing the core requirements of the prompt.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

AI Judge Analysis

DALL-E 2

+ Features a very close-up, abstract framing that emphasizes the worn metal texture.

− Extremely messy execution with significant digital artifacts and poor clarity.
− Fails most prompt requirements including lifelike eyes, braided hair with beads, and leather textures.
− Anatomical features are unrecognizable and distorted.

FLUX.2 [dev]

+ Perfect adherence to all prompt elements including braided hair with beads, scars, and ornate engraving.
+ Exceptional photographic clarity with lifelike eyes and realistic skin textures.
+ Excellent lighting effects with the torch reflection and atmospheric bokeh sparks.

− The leather straps overlap the plate armor in a way that slightly hides the 'close portrait' facial focus.

Verdict: FLUX.2 [dev] significantly outperforms DALL-E 2 by providing a high-fidelity, photorealistic image that follows every detail of the complex prompt. DALL-E 2 produced a low-resolution, distorted image where the subject's face is barely identifiable and most specific details like the beaded braids are completely missing.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

AI Judge Analysis

DALL-E 2

+ Dynamic sense of movement and energy
+ Displays butterflies as requested

− Severe anatomical distortions in the background animals
− Blurry textures and low resolution compared to modern standards
− Poor blending and messy edges on the butterfly and fur

FLUX.2 [dev]

+ Expertly rendered textures and realistic fur details
+ Perfectly captures the golden hour lighting, god rays, and dew sparkles
+ All requested animals are present with distinct, high-quality features

− The characters are more static/posing than 'tumbling' as requested

Verdict: FLUX.2 [dev] significantly outperforms DALL-E 2 in every technical metric, providing a high-fidelity image with realistic fur, lighting, and anatomy. DALL-E 2 struggled with the complexity of multiple animals, resulting in distorted, unrecognizable shapes in the background and a generally low-quality, painterly aesthetic.