FLUX.2 [dev] Flash: a fast, distilled version of Black Forest Labs' FLUX.2 [dev], optimized for speed and cost efficiency.
Settled by community votes across 6 shared challenges, with an AI judge weighing in on each.
FLUX.2 [dev] Flash
#5 of 44 in Text-to-Image
Qwen Image 2512
#26 of 44 in Text-to-Image
Where the votes landed
FLUX.2 [dev] Flash: 50.0% win rate
Ties: 50.0%
Qwen Image 2512: 0.0% win rate
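The split above can be reproduced with a small tally. This is a minimal sketch, not the arena's actual scoring method: it assumes each of the 6 shared challenges resolves to exactly one outcome (a win for one model or a tie), in which case 50.0% / 50.0% / 0.0% corresponds to 3 wins for FLUX.2 [dev] Flash, 3 ties, and 0 wins for Qwen Image 2512. The function name `win_rates` and the outcome list are hypothetical, and the page does not say which challenges fell into which bucket.

```python
from collections import Counter


def win_rates(outcomes, models):
    """Return each model's win share and the tie share as percentages."""
    counts = Counter(outcomes)
    total = len(outcomes)
    labels = list(models) + ["tie"]
    # Counter returns 0 for labels that never occur, so a model with no wins gets 0.0.
    return {label: round(100 * counts[label] / total, 1) for label in labels}


# Hypothetical per-challenge outcomes, chosen only to match the published split.
outcomes = ["FLUX.2 [dev] Flash"] * 3 + ["tie"] * 3
rates = win_rates(outcomes, ["FLUX.2 [dev] Flash", "Qwen Image 2512"])
print(rates)
```

If the site instead aggregates individual votes rather than per-challenge outcomes, the same function applies with one entry per vote.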
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent photorealistic textures on the wooden table and red book.
- + Highly accurate glass physics, including subtle dust and realistic reflections.
- + Perfect adherence to all spatial requirements of the prompt.
- − The sphere appears more like a translucent marble than a solid sphere, though it remains blue.
Qwen Image 2512
- + Captures all requested elements correctly.
- + Pleasant soft lighting from the left.
- − The glass physics are slightly inconsistent, with the back edge of the cube appearing warped.
- − The green plant is visible through the glass but less naturally integrated than in the other image.
Verdict: Both models followed the prompt perfectly, but FLUX.2 [dev] Flash produced a significantly more photorealistic result with superior textures and lighting. Qwen Image 2512 is a strong contender but has slight distortions in the glass geometry and less detail in the tabletop wood grain.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent adherence to the 'repairing' action with tools on the ground.
- + Effective motion blur on background vehicles and realistic rain textures.
- + High level of detail in skin texture and aged hands.
Qwen Image 2512
- + Strong cinematic lighting and high-quality bokeh.
- + Good facial anatomy and realistic skin texture.
- + Captures the light rain atmosphere well.
- − The subject is posing/staring at the camera rather than 'repairing' the bicycle.
- − The bicycle's structure has some logical errors around the chain guard and frame.
- − Misses the specific 'motion blur' requested for the background cars.
Verdict: FLUX.2 [dev] Flash followed the prompt much more accurately, showing the man actively engaged in the task of repairing the bike with tools visible on the ground and requested motion blur on the cars. Qwen Image 2512 produced a high-quality portrait, but the subject is simply sitting with the bike and looking at the camera, which misses the 'repairing' and 'candid' aspects of the prompt.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent chalk texture throughout all text
- + Correct spelling for all menu items
- + Natural smudges and board texture add to the realism
- − The handwriting style is a bit inconsistent between the bold title and thinner body text
Qwen Image 2512
- + Very consistent and aesthetically pleasing cursive handwriting style
- + Perfect layout and centering of the text on the board
- + Extremely crisp text rendering
- − Spelling error in 'Risitto' (should be Risotto)
- − The chalk texture feels slightly too clean and digital compared to FLUX.2 [dev] Flash
Verdict: Both models followed the complex text-heavy prompt impressively well. FLUX.2 [dev] Flash captures the most realistic chalk artifacts and achieves perfect spelling, while Qwen Image 2512 offers a more beautiful calligraphic style and better composition but fails on the spelling of 'Risotto'.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent adherence to the 'bored' expression for the passenger
- + Captures the New York taxi aesthetic well with the 'TAX' sign on top
- + High level of photographic realism and lighting
- − The passenger is sitting in the front passenger seat instead of the 'back seat' as requested
Qwen Image 2512
- + Correctly places the passenger in the back seat as specified in the prompt
- + High quality rendering of the capybara's fur and the taxi driver hat
- + Realistic depth of field and street lighting
- − The passenger's expression looks more 'annoyed' or 'angry' than specifically 'bored'
- − The capybara's paws look slightly more like human hands in gloves than actual paws
Verdict: While both models produced high-quality, professional results, Qwen Image 2512 followed the spatial instructions more accurately by placing the passenger in the back seat. FLUX.2 [dev] Flash captured the 'bored' expression better but failed the positional prompt by putting the businesswoman in the front seat next to the driver.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Perfect text rendering and layout placement.
- + Strong PBR material qualities, especially on the wooden base and fish texture.
- + Very clean and minimalist aesthetic that matches the 'ultra-clean' requirement.
- − The sushi construction is slightly physically illogical (nigiri-style toppings on a roll-style base).
Qwen Image 2512
- + Excellent 'cartoon' diorama style with more variety in sushi types.
- + Creative typography that fits a 3D cartoon theme better than plain black text.
- + Better adherence to the 'miniature 3D cartoon' style requested.
- − The 'JAPAN' text has slight alignment issues with the 'SUSHI' text below it.
- − The lighting is a bit flatter compared to the realistic shadows in FLUX.2 [dev] Flash.
Verdict: FLUX.2 [dev] Flash produces a cleaner, more realistic image with superior text rendering and high-quality materials. However, Qwen Image 2512 captures the 'cartoon diorama' spirit much better, offering a more interesting scene with varied sushi and a stylized base, even if the text is slightly less polished.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
FLUX.2 [dev] Flash
- + Excellent depiction of movement with the puppy and kitten pouncing.
- + Stronger adherence to the 'god rays' and 'dew sparkles' requirement with visible water droplets on the grass.
- + Very high fur detail and natural lighting integration.
- − Includes an extra fifth animal (a second bunny) not requested in the prompt.
Qwen Image 2512
- + Perfect adherence to the count and species of animals requested.
- + Centrally balanced and aesthetically pleasing composition.
- + Excellent rendering of the golden retriever's face and fur.
- − Static posing that doesn't capture 'playfully chasing' or 'tumbling' as well as FLUX.2 [dev] Flash.
- − The fox's ears and proportions look slightly off compared to a real kit.
Verdict: FLUX.2 [dev] Flash captures the dynamic action of the prompt much better, showing animals pouncing and playing, but it fails the logic test by adding a fifth animal. Qwen Image 2512 provides a cleaner, more accurate set of animals and a highly professional composition, though it feels more like a static portrait than an action scene. Qwen Image 2512 is the preferred overall winner for following the specific animal list while maintaining high visual quality.
Explore each model
Qwen Image 2512: an improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.