FLUX.1 [dev]: Black Forest Labs' 12-billion-parameter flow transformer for high-quality text-to-image generation, suitable for personal and commercial use with streaming support
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
FLUX.1 [dev]
#42 of 44 in Text-to-Image
Stable Diffusion 3.5 Medium
#41 of 44 in Text-to-Image
Where the votes landed
FLUX.1 [dev]: 0% win rate
Ties: 0%
Stable Diffusion 3.5 Medium: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
FLUX.1 [dev]
- + Strong prompt adherence for specific text strings
- + Clean, professional graphic design aesthetic with consistent lighting
- + Near-perfect spelling of the date, time, and location details
- − The main title text is corrupted into 'Lalloween Rantcl'
- − Lacks the requested parchment texture, appearing more like a modern digital poster
Stable Diffusion 3.5 Medium
- + Excellent 'vintage parchment' texture and gothic atmosphere
- + Includes both webs and thorns as requested in the border
- − Significant spelling errors throughout all text fields
- − Poor composition with text overlapping the parchment edges
- − The year is incorrectly rendered as '226' instead of '2026'
Verdict: FLUX.1 [dev] followed the layout and text instructions much more accurately, producing a usable invitation design despite a small error in the main title. Stable Diffusion 3.5 Medium captured the vintage parchment aesthetic better but failed significantly on text rendering, spelling, and character spacing.
Explore each model
Stable Diffusion 3.5 Medium: Stability AI's 2.5-billion-parameter Multimodal Diffusion Transformer with improvements (MMDiT-X), a text-to-image model optimized for consumer hardware, featuring improved image quality, typography, and complex prompt understanding