Black Forest Labs' distilled 9-billion-parameter image generation model with sub-second inference and multi-reference support
Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.
FLUX.2 [klein] 9B
#7 of 23 in Image Editing
Z-Image Turbo
#15 of 44 in Text-to-Image
Where the votes landed
FLUX.2 [klein] 9B: 50.0% win rate
Ties: 0.0%
Z-Image Turbo: 50.0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
FLUX.2 [klein] 9B
- + Excellent text integration with complex fiery textures across all prompted phrases.
- + Highly detailed food textures, especially on the meat patties and fresh lettuce.
- + Dynamic sense of motion with flying sauce and embers that follow a central explosion theme.
- − The 'exploded' effect is slightly conservative, with components still mostly touching.
Z-Image Turbo
- + Strong vertical composition and bright, legible text.
- + Effective use of glowing embers and steam/fire in the background.
- − The burger is not 'exploded' or suspended in mid-air pieces; it is a solid stack floating.
- − The meat patties have an unnatural, slightly repetitive pebbled texture.
- − The lighting on the starburst feels more like a flat graphic than an integrated 3D element.
Verdict: FLUX.2 [klein] 9B is the clear winner as it more accurately follows the 'exploded' and 'dispersed' instruction and handles the complex fiery text textures with much greater detail and photorealism. Z-Image Turbo produces a standard floating burger advertisement, missing the dynamic deconstructed feeling requested by the prompt.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
FLUX.2 [klein] 9B
- + Excellent chalk texture with realistic smudges and ghosting on the board.
- + Successfully completed the truncated prompt for 'Brown Butter Chocolate Chip Cookies'.
- + Displays a more elegant cursive-leaning style that matches the prompt requirement.
- − Includes a spelling error: 'Mushrropom'.
- − The wooden frame has a slight perspective misalignment with the background café.
Z-Image Turbo
- + Features very clear and legible handwriting with good spacing.
- + Accurately rendered all requested menu items and prices.
- + Includes a clean, centered layout with bullet points for readability.
- − Contains a spelling error: 'Mustroom'.
- − The handwriting style is fairly uniform and lacks the requested 'natural variations in letter size and slight slant'.
- − The chalk texture is too clean and looks slightly like a digital chalk font.
Verdict: FLUX.2 [klein] 9B is the winner because it captures the authentic 'chalkboard' aesthetic much better, including realistic smudging and a more sophisticated cursive style. While both models had minor spelling errors, FLUX.2 successfully inferred the full name of the cookies despite the prompt being truncated, whereas Z-Image Turbo's output felt a bit more generic and digital.
Explore each model
Z-Image Turbo: Tongyi-MAI's 6-billion-parameter distilled text-to-image model optimized for speed, achieving high-quality generation in 8 steps or fewer with support for bilingual text rendering