FLUX.2 [dev] (Black Forest Labs) vs. Z-Image Turbo (Alibaba)

Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.

FLUX.2 [dev]

24.5 arena score

#17 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Z-Image Turbo

24.7 arena score

#15 of 44 in Text-to-Image

Vote tally

Where the votes landed

FLUX.2 [dev]

33.3%

win rate

Ties

33.3%

Z-Image Turbo

33.3%

win rate

Shared challenges 4

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

FLUX.2 [dev]

Z-Image Turbo

FLUX.2 [dev] 40% wins, 20% ties, Z-Image Turbo 40% wins

AI Judge Analysis

FLUX.2 [dev]

+ Excellent rendering of thick, realistic glass with appropriate refractive qualities.
+ The plant is clearly visible through the glass as requested.
+ Cinematic lighting that accurately reflects the soft window light from the left.

− The blue sphere appears to be floating slightly above the bottom surface.
− The glass cube has rounded interior corners that make it look more like a vase or container than a geometric cube.

Z-Image Turbo

+ Sharp, precise geometric cube shapes with clean edges.
+ Realistic texture on the red book, including wear and paper detail.
+ The blue sphere is correctly seated on the bottom surface with a reflection.

− The plant in the background is very blurry and barely visible through the glass cube itself.
− The lighting is somewhat flat compared to the atmosphere in FLUX.2 [dev]'s image.

Verdict: FLUX.2 [dev] produces a more atmospheric and aesthetically pleasing image with superior glass physics and better adherence to the requirement of seeing the plant through the glass. Z-Image Turbo captures the geometric 'cube' shape more accurately and has better book textures, but fails to make the plant significantly visible through the glass medium.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

FLUX.2 [dev]

Z-Image Turbo

FLUX.2 [dev] 50% wins, 50% ties, Z-Image Turbo 0% wins

AI Judge Analysis

FLUX.2 [dev]

+ Excellent adherence to the 'motion blur from passing cars' prompt element.
+ Very realistic skin texture and facial features for the elderly man.
+ Superior lighting and reflections on the wet pavement creating a cinematic atmosphere.
+ Highly detailed and realistic bicycle components.

− The bicycle frame geometry becomes slightly nonsensical near the bottom bracket.
− The man's hands are interacting with a complex mess of cables that looks a bit cluttered.

Z-Image Turbo

+ Clearer view of the 'red bicycle' as requested.
+ Good depiction of light rain falling against the background cars.

− Failed to include motion blur for the passing cars, which remain static.
− The skin texture and lighting look flatter and less cinematic than requested.
− The man appears to be just holding the bike rather than repairing it.

Verdict: FLUX.2 [dev] followed the complex prompt requirements much more effectively, specifically capturing the motion blur of traffic and the cinematic wet-weather atmosphere. While Z-Image Turbo produced a clear image, it missed the key stylistic instruction for motion blur and the subject appears to be posing with the bike rather than repairing it.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

FLUX.2 [dev]

Z-Image Turbo

FLUX.2 [dev] 0% wins, 50% ties, Z-Image Turbo 50% wins

AI Judge Analysis

FLUX.2 [dev]

+ Excellent execution of the braided hair with beads as requested.
+ Highly detailed texture on leather straps and metal engravings.
+ Very lifelike eyes with realistic skin texture and scars.

− The torch in the background is slightly blurry and less defined than in Z-Image Turbo's image.

Z-Image Turbo

+ Strong atmospheric lighting with a well-defined torch and bokeh sparks.
+ Good representation of ornate plate armor and underlayers.
+ Effective use of shallow depth of field for a cinematic look.

− The beads in the hair look more like metallic studs or sequins than traditional beads.
− The skin texture and scar details are slightly softer than in FLUX.2 [dev]'s image.

Verdict: FLUX.2 [dev] followed the prompt more precisely, particularly regarding the hair braids and beads, and delivered superior texture detail on the leather and skin. Z-Image Turbo produced a beautiful cinematic image with high-quality armor, but it felt slightly more generic in its facial details and interpretation of the beads. FLUX.2 [dev] is the winner for its impressive photorealism and adherence to the fine details of the request.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

FLUX.2 [dev]

Z-Image Turbo

AI Judge Analysis

FLUX.2 [dev]

+ Excellent anatomical accuracy for all animals.
+ Superior rendering of 'god rays' and atmospheric lighting.
+ Highly detailed fur texture and realistic morning dew sparkles.

− The animals are sitting rather than 'tumbling' as requested in the prompt.
− Included an extra rabbit.

Z-Image Turbo

+ Captures the 'tumbling' and 'playful' motion much better than FLUX.2 [dev].
+ Bright, vibrant colors that fit the 'joyful wholesome vibe'.
+ Good adherence to the types of animals requested.

− Noticeable anatomical issues, such as the puppy's paw merging into the rabbit.
− The cat's facial structure and open mouth look slightly distorted and unnatural.
− Lower overall resolution and fine detail compared to the competitor.

Verdict: FLUX.2 [dev] produces a much more technically proficient and realistic image with beautiful lighting and textures, though it is more static in composition. Z-Image Turbo better captures the requested 'tumbling' action, but suffers from significant anatomical merging and less refined details in the fur and faces. FLUX.2 [dev] is the winner for its superior visual quality and realism.
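
The overall 33.3% / 33.3% / 33.3% split is consistent with simply pooling the votes from every challenge. The page does not publish raw vote counts or say how the overall tally is computed, so the sketch below is only an illustration: the counts are hypothetical (the smallest whole numbers that reproduce the per-challenge percentages shown), and the names `votes`, `shares`, and `pooled` are made up for this example.

```python
# Hypothetical vote counts per challenge, in the order
# (FLUX.2 [dev] wins, ties, Z-Image Turbo wins).
# ASSUMPTION: these are the smallest counts that match the published
# percentages; the real counts are not stated on the page.
votes = {
    "Geometric Composition": (2, 1, 2),                  # 40% / 20% / 40%
    "Candid Street Photography": (1, 1, 0),              # 50% / 50% / 0%
    "Fantasy Warrior": (0, 1, 1),                        # 0% / 50% / 50%
    "Adorable Baby Animals in Sunny Meadow": (0, 0, 0),  # no tally shown
}


def shares(counts):
    """Convert raw counts to percentages rounded to one decimal place.

    Returns None when there are no votes, so no split is reported.
    """
    total = sum(counts)
    if total == 0:
        return None
    return tuple(round(100 * c / total, 1) for c in counts)


def pooled(votes_by_challenge):
    """Sum each column across challenges, then convert to percentages."""
    totals = tuple(sum(col) for col in zip(*votes_by_challenge.values()))
    return shares(totals)


per_challenge = {name: shares(c) for name, c in votes.items()}
overall = pooled(votes)

for name, split in per_challenge.items():
    print(name, split)
print("Overall", overall)
```

Under these assumed counts, a challenge with more votes pulls the overall figure harder than one with fewer, which is why a model can win half the votes in one challenge and none in another and still land on an even overall split.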

Next steps

Explore each model

FLUX.2 [dev]

Black Forest Labs

Black Forest Labs' open-weights image generation model with frontier performance, available for non-commercial local deployment

Z-Image Turbo

Alibaba

Tongyi-MAI's 6-billion parameter distilled text-to-image model optimized for speed, achieving high-quality generation in 8 steps or fewer with support for bilingual text rendering
