FLUX.2 [dev] is Black Forest Labs' open-weights image generation model with frontier performance, available for non-commercial local deployment.
Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.
FLUX.2 [dev]
#17 of 44 in Text-to-Image
Stable Diffusion 3.5 Large
#25 of 44 in Text-to-Image
Where the votes landed
FLUX.2 [dev]
75.0%
win rate
Ties
0.0%
Stable Diffusion 3.5 Large
25.0%
win rate
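The page does not say how these percentages are derived or give the raw vote counts. As a rough sketch only, assuming each figure is simply that outcome's share of all recorded outcomes, the arithmetic could look like the following. The function name `win_rates` and the tally of 3, 0, and 1 are hypothetical, chosen only because they reproduce the displayed percentages over four challenges.

```python
def win_rates(tally):
    """Convert a mapping of outcome label -> count into percentage shares."""
    total = sum(tally.values())
    if total == 0:
        raise ValueError("tally must contain at least one recorded outcome")
    # Round to one decimal place to match the page's display format.
    return {label: round(100.0 * count / total, 1) for label, count in tally.items()}


# Hypothetical tally (an assumption, not the arena's actual data).
example_tally = {
    "FLUX.2 [dev]": 3,
    "Ties": 0,
    "Stable Diffusion 3.5 Large": 1,
}

for label, rate in win_rates(example_tally).items():
    print(f"{label}: {rate}%")
```

Any tally with the same 3:0:1 proportions would give the same percentages, so the displayed rates alone do not reveal how many votes were cast.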
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
FLUX.2 [dev]
- + Perfect adherence to all spatial instructions
- + Realistic glass physics and refractions
- + High-quality photographic texture on the book and table
Stable Diffusion 3.5 Large
- + Crisp, sharp details on the glass edges
- + Vivid colors
- − Failed the spatial reasoning: the book is inside the cube instead of on top
- − The blue sphere appears to be floating unnaturally above the book
- − The plant is mostly above/beside the cube rather than behind it
Verdict: FLUX.2 [dev] followed every instruction perfectly, correctly placing the sphere inside the cube and the book on top. Stable Diffusion 3.5 Large failed the spatial arrangement, placing the book inside the cube and the sphere on top of the book, which was the opposite of what was requested.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent motion blur from passing cars as requested
- + Highly realistic skin texture and facial features
- + Superior pavement reflections and wet weather atmosphere
- − The bicycle structure is mechanically confusing with overlapping wires and frames
- − The framing is a bit centered despite the 'imperfect framing' prompt
Stable Diffusion 3.5 Large
- + Accurate bicycle anatomy and silhouette
- + Effective shallow depth of field
- + Good color contrast between the red bike and the teal-toned background
- − Lacks the requested motion blur on the vehicles
- − The rain effect looks like a static overlay rather than a natural part of the scene
- − Skin texture is slightly smoothed and less 'natural' than FLUX.2 [dev]'s
Verdict: FLUX.2 [dev] significantly outperformed Stable Diffusion 3.5 Large in capturing the 'cinematic but realistic' atmosphere, specifically regarding the complex motion blur of passing traffic and the gritty realism of the subject's skin. While Stable Diffusion 3.5 Large produced a cleaner bicycle, it failed to incorporate the motion blur requested and the rain looked less integrated into the environmental lighting.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent adherence to the 'beads' requirement in the hair braids.
- + Very realistic lighting with clear torchlight reflections on the metal and skin.
- + Highly detailed and legible textures on the leather straps and metal engravings.
- − The facial scars look a bit like fresh paint or surface marks rather than deep, healed scars.
- − The hair is slightly messy where it transitions into thin braids.
Stable Diffusion 3.5 Large
- + Incredible detail on the chainmail and cloth underlayer texture.
- + The engraving on the armor is more intricate and fine-grained.
- + Very realistic facial skin texture and natural weathering effects.
- − Completely missed the requirement for 'small beads' in the hair.
- − The lighting is flatter and lacks the strong warm 'torchlight' glow requested.
- − The bokeh sparks are barely visible compared to the other model.
Verdict: FLUX.2 [dev] followed the prompt more comprehensively, specifically including the requested hair beads and creating a much stronger atmosphere with warm torchlight and visible bokeh sparks. While Stable Diffusion 3.5 Large has superior fine-detail textures in the armor and cloth, it failed a key descriptor (beads) and lacks the requested lighting drama.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
FLUX.2 [dev]
- + Excellent fur texture and realistic anatomy for all four animals.
- + Superior lighting with clear god rays and convincing dew sparkles on the grass.
- + Very high level of detail in the foreground flowers and animal eyes.
- − The animals are mostly sitting still rather than 'playfully chasing' and 'tumbling' as requested.
- − Generated two bunnies instead of just one.
Stable Diffusion 3.5 Large
- + Captures the 'playfully chasing' and 'tumbling' movement much better than the other image.
- + Includes one of each requested animal as specified in the count.
- + Strong bokeh effect creates a whimsical, commercial look.
- − The anatomy of the animals is slightly distorted, particularly the puppy's paws and the fox's face.
- − The 'kitten' looks more like a small fox or a hybrid, lacking clear tabby markings.
- − Lower overall texture detail compared to the competitor.
Verdict: FLUX.2 [dev] produces a much more realistic and detailed image with beautiful lighting and textures, though it fails on the specific action of 'tumbling' and adds an extra bunny. Stable Diffusion 3.5 Large captures the dynamic energy of the prompt much better, but suffers from anatomical inconsistencies and less realistic fur rendering. FLUX.2 [dev] is the preferred choice for its sheer visual quality and adherence to the 'hyper-photorealistic' part of the prompt.
Explore each model
Stable Diffusion 3.5 Large is Stability AI's 8.1-billion-parameter Multimodal Diffusion Transformer (MMDiT) text-to-image model, featuring improved image quality, typography, complex prompt understanding, and resource efficiency.