Black Forest Labs' 12-billion parameter flow transformer for high-quality text-to-image generation, suitable for personal and commercial use with streaming support
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
FLUX.1 [dev]
#42 of 44 in Text-to-Image
Z-Image Turbo
#15 of 44 in Text-to-Image
Where the votes landed
FLUX.1 [dev]: 0% win rate
Ties: 0%
Z-Image Turbo: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
The Halloween Invitation
Text-to-Image
“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
FLUX.1 [dev]
- + Strong cinematic lighting with a vibrant glowing jack-o-lantern.
- + Elegant illustrative style for the trees and moon.
- + Good adherence to the thorn border requirement.
- − Significant text errors including 'Lalloween Rantcy' and 'You Tre'.
- − Did not include the requested spider web elements.
- − Repetitive text in the time section and a colon error in the date.
Z-Image Turbo
- + Excellent adherence to the 'parchment' and 'gothic title' style.
- + Accurate rendering of the main invitation title.
- + Included both spider webs and thorns as requested.
- − Spelling error in the location ('The Archves' instead of 'The Arches').
- − The 'You are invited' banner text is very small and lacks a scroll graphic.
- − The layout feels a bit cluttered compared to FLUX.1 [dev].
Verdict: Z-Image Turbo is the winner because it successfully followed the 'dark parchment' and 'gothic' stylistic instructions while keeping the text mostly legible and accurate. FLUX.1 [dev] produced a more visually striking illustration but failed significantly on the text rendering and more complex layout requirements.
Explore each model
Tongyi-MAI's 6-billion parameter distilled text-to-image model optimized for speed, achieving high-quality generation in 8 steps or fewer with support for bilingual text rendering