DALL-E 2: OpenAI's legacy image generation model, supporting generations, mask-based edits (inpainting), and variations.
Settled by community votes on one shared challenge, with an AI judge weighing in.
DALL-E 2
#37 of 44 in Text-to-Image
FLUX.1 Kontext [dev]
#43 of 44 in Text-to-Image
Where the votes landed
DALL-E 2: 100.0% win rate
Ties: 0.0%
FLUX.1 Kontext [dev]: 0.0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
DALL-E 2
- + Captures the 'exploded' and 'dynamic' motion requested in the prompt.
- + Has a strong fiery aesthetic with glowing embers integrated into the food.
- − Text rendering is poor with significant spelling errors ('MARGIC', 'BAGUEC').
- − Low visual clarity and poor food detail; components look messy and unappetizing.
- − Failed to include the price or starburst element.
FLUX.1 Kontext [dev]
- + Excellent text rendering including the primary title and starburst price tag.
- + Photorealistic food detail with crisp textures and appealing colors.
- + High-quality composition that looks like a professional commercial advertisement.
- − Failed to create an 'exploded' burger, showing an assembled burger instead.
- − Typo in the secondary text: 'INHL Y' instead of 'ONLY'.
Verdict: DALL-E 2 better understood the requested 'exploded' composition but failed significantly on text and image quality. FLUX.1 Kontext [dev] produced a much more professional and photorealistic advertisement with accurate primary text, though it ignored the instruction to separate the burger components in mid-air.
Explore each model
FLUX.1 Kontext [dev]: Black Forest Labs' open-weights multimodal flow transformer for in-context image generation and editing, available for non-commercial use, with character consistency and style transfer capabilities.