Black Forest Labs' flagship image generation model delivering state-of-the-art quality with exceptional realism, precision, and consistency for both text-to-image and advanced image editing
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
FLUX.2 [max]
#11 of 44 in Text-to-Image
Wan 2.7
#34 of 44 in Text-to-Image
Where the votes landed
FLUX.2 [max]
0.0%
win rate
Ties
0.0%
Wan 2.7
100.0%
win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
The Capybara Taxi Driver
Text-to-Image
“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
FLUX.2 [max]
- + Excellent photorealistic lighting and depth of field
- + Natural integration of the capybara's head onto the human-like body
- + Accurate depiction of a modern car interior with realistic dashboard lighting
- − The capybara is wearing gloves, obscuring the requested 'paws on the steering wheel' detail
Wan 2.7
- + High adherence to the 'front paws' requirement
- + Capybara's expression is very calm and fits the prompt well
- + Clear inclusion of both primary subjects with good focus
- − The capybara's fur texture appears slightly plastic or synthetic compared to FLUX.2 [max]
- − Perspective issues where the steering wheel appears disconnected from the dashboard
Verdict: FLUX.2 [max] creates a more cinematically realistic image with superior lighting and texture, though it adds gloves to the driver. Wan 2.7 follows the specific 'paws' instruction better but suffers from slightly less realistic fur textures and a more cluttered composition. Overall, FLUX.2 [max] is the preferred image due to its professional photographic quality and better technical execution of the scene.
Explore each model
Alibaba's Wan 2.7 image generation and editing model for text-to-image, reference-guided generation, and instruction-based image edits