Black Forest Labs' open-weights multimodal flow transformer for in-context image generation and editing, with character consistency and style transfer capabilities. Available for non-commercial use.
Settled by community votes on 1 shared challenge, with an AI judge weighing in.
FLUX.1 Kontext [dev]: #43 of 44 in Text-to-Image
Qwen Image Max: #31 of 44 in Text-to-Image
Category comparison chart: not enough comparable category data. The chart appears once both models have ratings across at least three shared arena categories.
Where the votes landed
FLUX.1 Kontext [dev]: 0% win rate
Ties: 0%
Qwen Image Max: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
FLUX.1 Kontext [dev]
- + Excellent legibility of the main title text
- + High-quality rendering of the charcoal and flame background
- + Perfectly followed the 'starburst' request for the price
- − Failed the 'exploded' instruction as the burger is fully assembled
- − Spelling error in the secondary text ('LNHLY' instead of 'ONLY')
- − The burger lighting is slightly flat compared to the background
Qwen Image Max
- + Successfully captured the 'exploded'/dynamic suspended motion
- + Perfect text rendering for all requested phrases with no spelling errors
- + Superior integration of the fiery glow effect on the typography
- − The 'starburst' for the price is more of a light flare than a graphic starburst
- − Some sauce and lettuce artifacts appear slightly messy during the 'explosion'
Verdict: Qwen Image Max is the clear winner as it successfully interpreted the 'exploded' requirement and maintained perfect text accuracy. FLUX.1 Kontext [dev] produced a static burger and failed to spell 'ONLY' correctly, whereas Qwen Image Max delivered a dynamic, high-fidelity ad with impressive fiery effects.
Explore each model
The Max series of Tongyi Qwen’s image generation model excels across a wide range of generation tasks. Compared with the Plus series, it significantly reduces the “AI-like” feel in generated images, enhancing their realism. It delivers more lifelike material textures for human subjects, finer and more detailed natural textures, and more visually appealing text rendering.