DALL-E 2: OpenAI's legacy image generation model, supporting generations, edits with masks (inpainting), and variations.
Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.
DALL-E 2
#37 of 44 in Text-to-Image
Stable Diffusion 3.5 Large Turbo
#44 of 44 in Text-to-Image
Where the votes landed
DALL-E 2: 100.0% win rate
Ties: 0.0%
Stable Diffusion 3.5 Large Turbo: 0.0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
DALL-E 2
- + Successfully captures the 'exploded' and 'suspended' effect for most components.
- + Creates a cohesive fiery atmosphere around the food.
- − Nonsensical text rendering with misspellings like 'MARGIC'.
- − Low image resolution and lack of photorealistic detail.
- − Failed to include the price starburst and secondary messaging.
Stable Diffusion 3.5 Large Turbo
- + High visual clarity and sharp, crisp rendering.
- + Excellent fire and smoke effects that match the requested background.
- − Failed to follow the 'exploded' instruction, showing a stacked burger instead.
- − Completely missed all text requirements including the title and price.
- − Image feels more illustrative/CG than photorealistic.
Verdict: Both models failed significantly on the specific text and layout requirements. DALL-E 2 attempted the 'exploded' view and some text, but the results are low-quality and riddled with spelling errors. Stable Diffusion 3.5 Large Turbo produced a much cleaner and more professional image, but it ignored the 'exploded' instruction and failed to include any text at all.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 2
- + The text has a convincing chalk-on-board texture.
- + It mimics organic handwriting better than the other model.
- − The text is completely illegible gibberish.
- − The composition is a messy close-up that ignores the 'cozy café' setting.
- − It failed to include the specific requested dates and prices.
Stable Diffusion 3.5 Large Turbo
- + Sets the scene effectively with a cozy café environment and lighting.
- + The text is much more legible and attempts almost all the specific words in the prompt.
- + Follows the requested layout and composition.
- − The 'handwriting' looks more like a digital brush font than authentic chalk.
- − Contains several spelling errors and incomplete dates (e.g., 'apri 3l 20-').
- − The three-column layout was not requested in the prompt.
Verdict: Stable Diffusion 3.5 Large Turbo is the clear winner as it successfully rendered a cozy café scene and legible text that closely follows the prompt items. DALL-E 2 failed significantly, producing garbled text that is entirely unreadable and lacks the requested context. While Stable Diffusion 3.5 has some spelling errors and the text looks a bit too 'clean' for real chalk, it is much closer to the user's requirements.
Explore each model
Stable Diffusion 3.5 Large Turbo: Distilled version of SD 3.5 Large that generates high-quality images in just 4 steps, offering faster inference and reduced costs.