FLUX.2 [max] (Black Forest Labs) vs. Qwen Image 2512 (Alibaba)

Settled by community votes across 7 shared challenges, with an AI judge weighing in on each.

FLUX.2 [max]

25.9 arena score

#11 of 44 in Text-to-Image

Qwen Image 2512

22.4 arena score

#26 of 44 in Text-to-Image

Vote tally

Where the votes landed

FLUX.2 [max]

77.8%

win rate

Ties

0.0%

Qwen Image 2512

22.2%

win rate

Shared challenges 7
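
As a rough illustration of how tally percentages like these relate to raw votes, the sketch below converts vote counts into win and tie rates. The counts used (7 wins, 0 ties, 2 wins) are an assumption, picked only because they reproduce the displayed 77.8% / 0.0% / 22.2%. The page does not publish raw vote counts, and `tally_rates` is a hypothetical helper, not part of any arena API.

```python
def tally_rates(wins_a: int, ties: int, wins_b: int) -> tuple[float, ...]:
    """Return (model A win %, tie %, model B win %), each rounded to one decimal."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no votes recorded")
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))


# Assumed counts (not from the page): 7 votes for FLUX.2 [max], 0 ties, 2 for Qwen Image 2512.
flux_rate, tie_rate, qwen_rate = tally_rates(7, 0, 2)
print(f"FLUX.2 [max] {flux_rate}% | ties {tie_rate}% | Qwen Image 2512 {qwen_rate}%")
```

Any vote split in the same 7:0:2 proportion would give the same percentages, so the displayed rates alone do not determine the actual number of votes cast.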

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Votes: FLUX.2 [max] 67% wins, 0% ties, Qwen Image 2512 33% wins

AI Judge Analysis

FLUX.2 [max]

+ Excellent photographic realism with a high-end lens feel.
+ Accurate and aesthetically pleasing lighting from the left.
+ Clean, sharp rendering of the glass cube and its reflections.

− The sphere appears more like a glass marble than a simple sphere, adding extra internal reflections.

Qwen Image 2512

+ Good adherence to all prompt elements.
+ Solid composition with clear visibility of the plant through the glass.
+ Realistic texture on the red book cover.

− The glass has a heavy cyan/green tint compared to the neutral transparency of FLUX.2 [max].
− Lighting feels a bit flatter and less dynamic.

Verdict: FLUX.2 [max] produces a much more professional and polished image with superior lighting and reflections. While both models followed the spatial prompt perfectly, FLUX.2 [max] has a cleaner aesthetic and more realistic glass rendering compared to the tinted glass in Qwen Image 2512.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

AI Judge Analysis

FLUX.2 [max]

+ Excellent adherence to the 'repairing' action, showing the man physically working on the wheel hub.
+ Superior rendering of wet textures and reflections on the pavement.
+ High level of anatomical and mechanical detail, including realistic hand aging and bicycle components.

− The motion blur on the background car is slightly more static than 'passing' cars often look.
− The framing is a bit tight on the subject, though it fits the 'imperfect' request.

Qwen Image 2512

+ Strong bokeh effect and cinematic 50mm lens look.
+ Captures the 'imperfect' and 'candid' feel with a more centered, snapshot-like composition.
+ Good skin texture and expressive facial features.

− The subject is posing with the bike rather than 'repairing' it as requested.
− Noticeable anatomical issues with the left hand resting on the seat.
− The bicycle frame geometry is slightly warped/incoherent near the front.

Verdict: FLUX.2 [max] is the superior model here because it accurately captures the 'repairing' action with high technical fidelity and realistic textures. While Qwen Image 2512 captures a nice mood, the man is simply sitting next to the bike, and the image contains several AI artifacts in the hands and bicycle frame.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

Votes: FLUX.2 [max] 100% wins, 0% ties, Qwen Image 2512 0% wins

AI Judge Analysis

FLUX.2 [max]

+ Excellent adherence to the grid layout for food photos.
+ Very clean, legible typography and high-quality font rendering.
+ Clear separation of the requested sections: Appetizers, Pizza, and Mains.

− Small icons at the bottom and some descriptive text names are gibberish.
− The 'Appetizers' section contains photos of pizzas and burgers, causing internal logic inconsistency.

Qwen Image 2512

+ Good use of vibrant colors and high-contrast food photography.
+ Dynamic and modern composition with a central header.
+ Follows the requested grid layout for the top half of the design.

− Text is highly distorted and mostly illegible throughout the menu.
− Sections are misspelled or combined incorrectly (e.g., 'Appetiizers', '/Means').
− Overall layout feels more cluttered and less professional than the alternative.

Verdict: FLUX.2 [max] is the superior model because it produces a clean, professional, and highly legible menu layout that closely follows the design prompt. While Qwen Image 2512 has vibrant food photography, its text rendering and spelling are significantly worse, making the menu non-functional as a graphic design piece.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

AI Judge Analysis

FLUX.2 [max]

+ Excellent photorealistic texture on the capybara's fur and the leather jacket.
+ Strong composition with a side view that clearly shows both characters and the taxi dashboard.
+ Natural lighting and realistic bokeh in the background.

− The capybara's hands look more like human gloved hands than paws.

Qwen Image 2512

+ The capybara's paws are more anatomically grounded for the animal.
+ Captures a very specific 'bored' expression on the passenger as requested.
+ Detailed taxi driver uniform hat.

− Lighting on the capybara's face feels slightly flat and less integrated with the night environment.
− The passenger's hands and phone interaction look slightly warped.

Verdict: FLUX.2 [max] wins on overall visual quality and cinematic lighting, creating a very believable photorealistic scene despite the slightly human-like hands. Qwen Image 2512 followed the prompt's character descriptions well, particularly the passenger's expression, but the overall image texture feels less refined than the competition.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

Votes: FLUX.2 [max] 50% wins, 0% ties, Qwen Image 2512 50% wins

AI Judge Analysis

FLUX.2 [max]

+ Perfect text rendering and layout placement.
+ Matches the 'ultra-clean' and 'minimalist' aesthetic requested.
+ Excellent isometric perspective and lighting.

− The sushi models are slightly more simplified than a 'realistic PBR' material might suggest.

Qwen Image 2512

+ Higher detail in the food textures and materials.
+ Good adherence to the diorama base and text requirements.
+ Playful 3D cartoon style.

− The isometric perspective is slightly lower than the requested 45° angle.
− The base includes extra foliage/garnish not explicitly requested in the 'minimal' prompt.

Verdict: Both models followed the prompt exceptionally well, including the specific text and flag requirements. FLUX.2 [max] produced a cleaner, more professional-looking graphic with a perfect 45-degree isometric view, while Qwen Image 2512 featured more detailed textures on the sushi pieces but deviated slightly from the 'minimal' requirement by adding extra plants. FLUX.2 [max] is the winner for its superior composition and adherence to the 'ultra-clean' aesthetic.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

Votes: FLUX.2 [max] 100% wins, 0% ties, Qwen Image 2512 0% wins

AI Judge Analysis

FLUX.2 [max]

+ Captures the active 'chasing' and 'tumbling' motion requested in the prompt.
+ Excellent sense of depth with a soft, atmospheric background and realistic bokeh.
+ Subtle, realistic lighting with dew sparkles and soft god rays that feel natural.

− The kitten's size is a bit small relative to the bunny.
− The 'fox kit' looks slightly more like a small adult fox proportionally.

Qwen Image 2512

+ High level of facial detail and expressive, large eyes for all animals.
+ Strong color saturation and clear butterfly silhouettes.
+ Vibrant, golden 'god rays' that create a very warm, bright atmosphere.

− Static, posed composition fails to depict the 'chasing' and 'tumbling' actions requested.
− Noticeable anatomical issues, specifically the puppy's paws merging awkwardly with the other animals.
− The fox's face lacks the 'kit' features, appearing more like a compressed adult fox face.

Verdict: FLUX.2 [max] followed the prompt more closely by illustrating the animals in motion (chasing/tumbling), creating a more dynamic and believable scene. Qwen Image 2512 opted for a static, posed portrait which, while cute, ignored the behavioral aspects of the prompt and contained several anatomical clipping artifacts where the animals overlap. FLUX.2 [max] is the winner for its better composition and superior adherence to the requested action.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

Votes: FLUX.2 [max] 100% wins, 0% ties, Qwen Image 2512 0% wins

AI Judge Analysis

FLUX.2 [max]

+ Perfect adherence to the 'minimalist' and 'vector emblem' style requested.
+ Accurate rendering of the 'Caffè Florian' name with the correct grave accent.
+ Clean, professional composition suitable for a real-world logo.

− The steam is very faint and simple compared to the rest of the illustration.

Qwen Image 2512

+ Excellent illustrative detail and cross-hatching texture.
+ Strong vintage aesthetic with a dynamic banner design.
+ Correct typography and placement of the 'Est. 1720' text.

− Ignored the 'minimalist' and 'vector' part of the prompt in favor of a detailed illustration.
− The steam is a bit heavy-handed for a clean logo design.

Verdict: FLUX.2 [max] followed the prompt more accurately by providing a minimalist vector emblem that looks like a functional logo. Qwen Image 2512 created a beautiful vintage illustration, but it is too complex for the 'minimalist vector' requirement and lacked the refined simplicity of a professional logo.

Next steps

Explore each model

FLUX.2 [max]

Black Forest Labs

Black Forest Labs' flagship image generation model, delivering state-of-the-art quality with exceptional realism, precision, and consistency for both text-to-image and advanced image editing.

Qwen Image 2512

Alibaba

Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.
