FLUX.2 [dev] Flash vs Z-Image Turbo
Head-to-head across 8 challenges
FLUX.2 [dev] Flash: 75.0% win rate
Ties: 0.0%
Z-Image Turbo: 25.0% win rate
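The percentages above can be turned back into whole challenge counts. The sketch below assumes each win rate is simply that model's share of the 8 challenges. The page does not state how the rates are computed, so this is an illustration, not the site's actual scoring method. The helper name `win_counts` is hypothetical.

```python
from fractions import Fraction


def win_counts(total_challenges, rates_percent):
    """Convert percentage shares into whole challenge counts.

    Assumption: each percentage is (count / total_challenges) * 100.
    Raises ValueError if a share does not map to a whole number.
    """
    counts = {}
    for name, pct in rates_percent.items():
        # Fraction avoids floating-point rounding when checking for whole numbers.
        share = Fraction(str(pct)) / 100 * total_challenges
        if share.denominator != 1:
            raise ValueError(f"{pct}% of {total_challenges} is not a whole number")
        counts[name] = int(share)
    return counts


# Figures taken from the summary above.
rates = {"FLUX.2 [dev] Flash": 75.0, "Ties": 0.0, "Z-Image Turbo": 25.0}
counts = win_counts(8, rates)
print(counts)
```

Under that assumption, 75.0% of 8 challenges corresponds to 6 wins and 25.0% to 2 wins, with no ties.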
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent adherence to all prompt elements including the lighting direction.
- + Superior glass rendering with realistic thickness and slight imperfections.
- + Very high level of detail on the book texture and the plant foliage.
- − The plant is slightly more to the side than 'behind', though still visible through the glass.
Z-Image Turbo
- + Correct placement of all objects as requested in the prompt.
- + Clean and simple composition.
- − Noticeable geometric error on the book: the corners and spine are poorly defined.
- − The plant in the background is very blurry and lacks clear definition.
- − The lighting is flat compared to the requested 'soft window light'.
Verdict: FLUX.2 [dev] Flash significantly outperforms Z-Image Turbo in realism and detail. While both models followed the spatial instructions of the prompt, FLUX.2 [dev] Flash produced a much higher-quality image with realistic glass physics, detailed textures on the book and plant, and convincing lighting, whereas Z-Image Turbo struggled with the geometry of the book and had a muddy background.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent adherence to the 'repairing' aspect of the prompt with visible tools and focused interaction.
- + Superb skin texture and realistic, weathered hands.
- + Strong implementation of motion blur on background vehicles and light reflections.
Z-Image Turbo
- + Good candid framing and natural posture of the man.
- + Clean background with appropriate rain effects.
- + Accurate red bicycle color as requested.
- − Does not show the man repairing the bike; he is simply standing over it.
- − The character's hands and face lack the high-fidelity 'natural skin texture' requested.
- − Composition is a bit static compared to the cinematic request.
Verdict: FLUX.2 [dev] Flash is the clear winner as it followed every detail of the prompt, including the motion blur and the specific action of 'repairing' the bicycle with tools on the ground. Z-Image Turbo produced a decent image, but the man is just holding the bike, and the level of photographic detail in the skin and reflections is significantly lower.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent adherence to the 'beads in hair' prompt with colorful details
- + Highly detailed engraving on the plate armor
- + Realistic skin texture and lifelike eyes
- − The scars look a bit more like surface blood/paint than actual physical scarring
Z-Image Turbo
- + Strong cinematic lighting and atmosphere
- + Realistic battle-worn appearance with more convincing skin blemishes
- + Intricate detail on the cloth/chainmail layers
- − The beads in the hair are less prominent and lack color variety
- − The engraving on the armor is slightly less crisp than in the FLUX.2 [dev] Flash image
Verdict: Both models performed exceptionally well on this complex prompt. FLUX.2 [dev] Flash delivered a more direct interpretation of the 'beads' and 'engraving' requests with incredible sharpness, while Z-Image Turbo created a more atmospheric, cinematic shot with superior lighting integration. FLUX.2 [dev] Flash is the winner for its superior texture rendering on the armor and clearer adherence to the specific hair detailing requested.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent chalk texture with realistic smudges and dust on the board.
- + Perfect spelling on all menu items.
- + Beautifully rendered elegant cursive that matches the requested style.
- − The layout at the bottom feels slightly cramped against the edge.
Z-Image Turbo
- + Clear and readable text layout.
- + Good contrast between the board and the chalk.
- − Spelling error: 'Mustroom' instead of 'Mushroom'.
- − The title is not in 'elegant cursive' as requested; it is a basic semi-print style.
- − Missing the chalk texture and natural variations requested, appearing more like a digital font.
Verdict: FLUX.2 [dev] Flash followed the stylistic instructions perfectly, delivering a high-quality cursive title and realistic chalk textures. Z-Image Turbo failed to use the requested cursive style for the title and included a significant spelling error in the menu items.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent photorealism in the capybara's fur and textures.
- + Accurately places the passenger in the back seat as requested.
- + Effective cinematic lighting and blurred city background.
Z-Image Turbo
- + High clarity on the capybara's facial expression.
- + Dynamic composition with a side view of the car.
- − Failed the spatial request by placing the passenger in the front seat instead of the back.
- − Capybara's paws are not correctly on the steering wheel.
- − Lighting feels a bit more artificial compared to FLUX.2 [dev] Flash.
Verdict: FLUX.2 [dev] Flash followed the instructions more accurately, specifically by placing the passenger in the back seat and having the capybara's paws on the wheel. Z-Image Turbo struggled with the spatial layout, placing the passenger in the front seat and failing to render the paws correctly on the steering wheel.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent typography rendering with almost no spelling errors
- + Integrated cohesive layout where the border and background feel like a single unit
- + Sophisticated cinematic lighting on the jack-o-lantern
- − Includes some hallucinations in the text like 'Hurk: 0s6ge'
- − Parchment texture is subtle and less paper-like than requested
Z-Image Turbo
- + Strong 'aged paper' aesthetic with burnt parchment edges
- + Very effective use of the scroll banner element throughout the design
- + Clear and legible event details
- − Spelling error in the location text ('Archves' instead of 'Arches')
- − The border of thorns and webs looks a bit detached from the main parchment poster
- − The background trees are partially cut off by the internal frame border
Verdict: FLUX.2 [dev] Flash produces a more professional and integrated illustration with superior lighting and font choices. While Z-Image Turbo captures the 'dark parchment' texture more literally, its spelling error in the location and slightly cluttered, overlapping scrolls make FLUX.2 [dev] Flash the better overall invitation.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Perfectly followed the flag icon request with the correct Japanese flag.
- + Higher level of detail and realism in the food textures (rice grains and fish marbling).
- + Clean, professional typography that aligns better with the requested layout.
- − The sushi assembly is slightly unconventional (nigiri fish on top of maki-style rolls).
Z-Image Turbo
- + Pleasant, soft 3D cartoon style with consistent lighting.
- + Accurate isometric perspective and centered composition.
- − Incorrectly displayed the Chinese flag for a prompt titled 'JAPAN'.
- − Less detail in the texture of the rice, appearing more like generic white spheres.
- − Text rendering is slightly softer and less 'bold' compared to FLUX.2 [dev] Flash.
Verdict: FLUX.2 [dev] Flash is the clear winner because it correctly identified and rendered the Japanese flag, whereas Z-Image Turbo rendered the flag of China. FLUX.2 [dev] Flash also provided much better texture work on the sushi, making it look like a high-quality 3D render, while maintaining very clean typography.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent adherence to the request for four specific types of animals plus butterflies.
- + Superior rendering of textures, particularly the individual strands of fur and the dew drops on grass.
- + Beautiful lighting with clear 'god rays' and a warm sunrise glow that feels atmospheric.
- − Includes an extra animal (a fifth creature that looks like a second bunny/kitten hybrid).
- − The butterflies look slightly pasted on rather than fully integrated into the lighting.
Z-Image Turbo
- + Accurately included one of each requested animal: puppy, kitten, bunny, and fox.
- + Wholesome and joyful expressions on the animals' faces.
- + Good sense of movement and 'tumbling' as requested in the prompt.
- − Lower technical resolution with noticeable blurring on the fur and background.
- − Lighting is a bit washed out and lacks the defined 'god rays' requested.
- − The kitten's face is somewhat distorted and anatomical details are less precise.
Verdict: FLUX.2 [dev] Flash produces a much higher-quality image with stunning detail and lighting, though it fails on count by adding an extra animal. Z-Image Turbo captures the composition and animal count perfectly but lacks the '8K masterpiece' resolution and realistic fur textures of the FLUX.2 [dev] Flash image. FLUX.2 [dev] Flash is preferred for its superior aesthetic and adherence to the complex lighting and texture requirements.
FLUX.2 [dev] Flash
Fast distilled version of Black Forest Labs' FLUX.2 [dev] optimized for speed and cost efficiency.
Z-Image Turbo
Tongyi-MAI's 6-billion-parameter distilled text-to-image model optimized for speed, achieving high-quality generation in 8 steps or fewer with support for bilingual text rendering.