Black Forest Labs' open-weights image generation model with frontier performance, available for non-commercial local deployment
Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.
FLUX.2 [dev]
#17 of 44 in Text-to-Image
GPT Image 1.5
#7 of 44 in Text-to-Image
Where the votes landed
FLUX.2 [dev]
0.0%
win rate
Ties
16.7%
GPT Image 1.5
83.3%
win rate
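The three percentages above are shares of one vote pool, so they sum to 100%. As a rough sketch of that arithmetic, the snippet below turns raw counts into rounded shares. The page does not publish raw vote counts, so the tally of 0, 1, and 5 is purely hypothetical, chosen only because it happens to reproduce the displayed figures. The function name `vote_shares` is also an illustrative assumption.

```python
def vote_shares(counts):
    """Convert raw vote counts per outcome into percentages, rounded to one decimal."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no votes recorded")
    return {outcome: round(100 * n / total, 1) for outcome, n in counts.items()}


# Hypothetical tally (assumption): the real vote counts are not shown on the page.
example = {"FLUX.2 [dev]": 0, "Ties": 1, "GPT Image 1.5": 5}
print(vote_shares(example))
```

Any tally with the same proportions would yield the same shares, so the displayed percentages alone do not reveal how many votes were cast.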
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent photographic quality with realistic soft lighting and depth of field.
- + Accurate spatial arrangement of objects.
- + High rendering quality of the glass textures and reflections.
- − The blue sphere appears to be floating inside the cube rather than resting on the bottom.
GPT Image 1.5
- + Strong adherence to all prompt elements including the green plant visible through the glass.
- + The glass cube has realistic beveling and an interesting mirrored base.
- − The lighting is a bit flat compared to FLUX.2 [dev].
- − The blue sphere is quite large, pushing the definition of 'small sphere'.
Verdict: Both models followed the complex spatial instructions perfectly. FLUX.2 [dev] produced a more aesthetically pleasing, professional-grade photograph with superior lighting, though GPT Image 1.5 offered better clarity for the plant behind the glass. FLUX.2 [dev] is the winner due to its better overall visual composition and realism.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent execution of motion blur from passing cars
- + High realism in facial textures and clothing folds
- + Successfully captures several technical prompts like 50mm feel and reflections
- − The structural anatomy of the bicycle is incoherent near the handlebars and seat
- − The man appears to be sitting on air or a floating seat
GPT Image 1.5
- + Stronger composition with the subject grounded in a crouching position
- + Great attention to detail with the toolkit and puddle reflections
- + Bicycle geometry is much more realistic and logical
- − Lacks the requested motion blur from passing cars
- − The car in the background looks slightly static despite the rain
Verdict: FLUX.2 [dev] followed the technical camera prompts more closely, especially the motion blur of passing cars and the candid framing, but failed significantly on the physical logic of the bicycle. GPT Image 1.5 produced a much more coherent and grounded scene with better object details, though it missed the specific kinetic energy of the motion blur request. GPT Image 1.5 is preferred for its overall visual consistency and anatomical accuracy.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent depiction of ornate engraved plate armor with high-quality metal textures.
- + Includes clearly visible beads in the braided hair as requested.
- + Realistic lighting from the torch source reflecting on the metallic surfaces.
- − The facial skin texture is slightly smoother and less 'battle-worn' than GPT Image 1.5's.
- − The beads on the braids look a bit like modern jewelry compared to the medieval setting.
GPT Image 1.5
- + Incredible skin texture with realistic pores, grime, and gritty battle-worn detail.
- + Stronger 'paladin' aesthetic with the cross insignia and rugged cloth underlayers.
- + Exceptional eye detail and lifelike expression.
- − The beads in the hair are less prominent and look more like metallic rings/bands than beads.
- − A bit more lens flare/haze, which slightly obscures the fine engravings on the central armor piece.
Verdict: Both models followed the prompt exceptionally well, but GPT Image 1.5 wins due to the superior 'battle-worn' skin texture and the more thematic paladin aesthetics. While FLUX.2 [dev] did a better job with the specific request for 'beads' in the hair, GPT Image 1.5's overall realism, cinematic lighting, and detailed leather and cloth textures make for a more compelling portrait.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent realism in fur textures and backlighting.
- + Correctly includes all requested animals (including a second bunny).
- + Balanced composition with clear, sharp details on the butterflies and flowers.
- − The animals are mostly sitting rather than 'playfully chasing' or 'tumbling'.
- − The kitten's eye anatomy is slightly off (one pupil is larger/distorted).
GPT Image 1.5
- + Better captures the 'tumbling' and 'playful chasing' aspect of the prompt.
- + Highly expressive facial expressions that fit the 'joyful vibe'.
- + Strong use of 'god rays' and bokeh effects to create a magical atmosphere.
- − Anatomical errors such as the kitten having five visible limbs.
- − The fox's front right paw is an indistinct dark mass that looks disconnected.
- − The dog's left ear merges awkwardly into its body.
Verdict: While GPT Image 1.5 does a much better job of capturing the active 'tumbling' motion and joyful energy requested in the prompt, it suffers from significant anatomical errors, particularly with the kitten's limbs. FLUX.2 [dev] produces a much cleaner, more physically coherent image with superior texture rendering, even if the composition is more static than requested.
Explore each model
OpenAI's state-of-the-art image generation model with better instruction following and adherence to prompts