FLUX.2 [dev] (Black Forest Labs) vs. GPT Image 1.5 (OpenAI)

Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.

FLUX.2 [dev]

24.5 arena score

#17 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

GPT Image 1.5

26.5 arena score

#7 of 44 in Text-to-Image

Top 3 in Image Editing

Vote tally

Where the votes landed

FLUX.2 [dev]: 0.0% win rate

Ties: 16.7%

GPT Image 1.5: 83.3% win rate

Shared challenges 4
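
As a rough illustration of how an overall tally like this could be pooled from per-challenge votes, here is a minimal sketch. The page does not publish raw vote counts, so the counts below are hypothetical: they are simply the smallest numbers consistent with the two per-challenge splits reported on this page (0% / 25% / 75% and 0% / 0% / 100%). The function name `pooled_shares` and the dictionary keys are also made up for the example, and the site's real aggregation method may differ.

```python
# HYPOTHETICAL vote counts: the page only shows percentages.
# These are the smallest counts that reproduce the reported splits.
votes = {
    "Geometric Composition": {"flux": 0, "tie": 1, "gpt": 3},  # 0% / 25% / 75%
    "Fantasy Warrior": {"flux": 0, "tie": 0, "gpt": 2},        # 0% / 0% / 100%
}


def pooled_shares(per_challenge):
    """Pool all votes across challenges and return percentage shares."""
    totals = {"flux": 0, "tie": 0, "gpt": 0}
    for counts in per_challenge.values():
        for outcome, n in counts.items():
            totals[outcome] += n
    n_all = sum(totals.values())
    # Round to one decimal place, matching the page's display precision.
    return {outcome: round(100 * n / n_all, 1) for outcome, n in totals.items()}


shares = pooled_shares(votes)
print(shares)
```

Under these assumed counts, six votes in total (one tie, five for GPT Image 1.5) would give the same 0.0% / 16.7% / 83.3% split shown above. Other vote counts with the same ratios would fit equally well.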

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Votes: FLUX.2 [dev] 0% wins, 25% ties, GPT Image 1.5 75% wins

AI Judge Analysis

FLUX.2 [dev]

+ Excellent photographic quality with realistic soft lighting and depth of field.
+ Accurate spatial arrangement of objects.
+ High rendering quality of the glass textures and reflections.

− The blue sphere appears to be floating inside the cube rather than resting on the bottom.

GPT Image 1.5

+ Strong adherence to all prompt elements including the green plant visible through the glass.
+ The glass cube has realistic beveling and an interesting mirrored base.

− The lighting is a bit flat compared to FLUX.2 [dev].
− The blue sphere is quite large, pushing the definition of 'small sphere'.

Verdict: Both models followed the complex spatial instructions perfectly. FLUX.2 [dev] produced a more aesthetically pleasing, professional-grade photograph with superior lighting, though GPT Image 1.5 offered better clarity for the plant behind the glass. FLUX.2 [dev] is the winner due to its better overall visual composition and realism.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

AI Judge Analysis

FLUX.2 [dev]

+ Excellent execution of motion blur from passing cars
+ High realism in facial textures and clothing folds
+ Successfully captures several technical prompts like 50mm feel and reflections

− The structural anatomy of the bicycle is incoherent near the handlebars and seat
− The man appears to be sitting on air or a floating seat

GPT Image 1.5

+ Stronger composition with the subject grounded in a crouching position
+ Great attention to detail with the toolkit and puddle reflections
+ Bicycle geometry is much more realistic and logical

− Lacks the requested motion blur from passing cars
− The car in the background looks slightly static despite the rain

Verdict: FLUX.2 [dev] followed the technical camera prompts more closely, especially the motion blur of passing cars and the candid framing, but failed significantly on the physical logic of the bicycle. GPT Image 1.5 produced a much more coherent and grounded scene with better object details, though it missed the specific kinetic energy of the motion blur request. GPT Image 1.5 is preferred for its overall visual consistency and anatomical accuracy.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

Votes: FLUX.2 [dev] 0% wins, 0% ties, GPT Image 1.5 100% wins

AI Judge Analysis

FLUX.2 [dev]

+ Excellent depiction of ornate engraved plate armor with high-quality metal textures.
+ Includes clearly visible beads in the braided hair as requested.
+ Realistic lighting from the torch source reflecting on the metallic surfaces.

− The facial skin texture is slightly smoother and less 'battle-worn' than GPT Image 1.5's.
− The beads on the braids look a bit like modern jewelry compared to the medieval setting.

GPT Image 1.5

+ Incredible skin texture with realistic pores, grime, and gritty battle-worn detail.
+ Stronger 'paladin' aesthetic with the cross insignia and rugged cloth underlayers.
+ Exceptional eye detail and lifelike expression.

− The beads in the hair are less prominent and look more like metallic rings/bands than beads.
− Slightly more lens flare and haze, which obscures the fine engravings on the central armor piece.

Verdict: Both models followed the prompt exceptionally well, but GPT Image 1.5 wins due to the superior 'battle-worn' skin texture and the more thematic paladin aesthetics. While FLUX.2 [dev] did a better job with the specific request for 'beads' in the hair, GPT Image 1.5's overall realism, cinematic lighting, and detailed leather and cloth textures make for a more compelling portrait.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

AI Judge Analysis

FLUX.2 [dev]

+ Excellent realism in fur textures and backlighting.
+ Correctly includes all requested animals (including a second bunny).
+ Balanced composition with clear, sharp details on the butterflies and flowers.

− The animals are mostly sitting rather than 'playfully chasing' or 'tumbling'.
− The kitten's eye anatomy is slightly off (one pupil is larger/distorted).

GPT Image 1.5

+ Better captures the 'tumbling' and 'playful chasing' aspect of the prompt.
+ Highly expressive facial expressions that fit the 'joyful vibe'.
+ Strong use of 'god rays' and bokeh effects to create a magical atmosphere.

− Anatomical errors, such as the kitten having five visible limbs.
− The fox's front right paw is an indistinct dark mass that looks disconnected.
− The dog's left ear merges awkwardly into its body.

Verdict: While GPT Image 1.5 does a much better job of capturing the active 'tumbling' motion and joyful energy requested in the prompt, it suffers from significant anatomical errors, particularly with the kitten's limbs. FLUX.2 [dev] produces a much cleaner, more physically coherent image with superior texture rendering, even if the composition is more static than requested.