DALL-E 2 is OpenAI's legacy image generation model, supporting generations, mask-based edits (inpainting), and variations.
Settled by community votes across 13 shared challenges, with an AI judge weighing in on each.
DALL-E 2
#37 of 44 in Text-to-Image
Not enough comparable category data
The chart appears once both models have ratings across at least three shared arena categories.
Recraft V4
#8 of 44 in Text-to-Image
Where the votes landed
DALL-E 2: 0.0% win rate
Ties: 33.3%
Recraft V4: 66.7% win rate
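The page does not say how these percentages are computed. One plausible reading is that each figure is that outcome's share of all decided matchups. The sketch below illustrates that arithmetic only; the function name and the tallies passed to it are hypothetical and are not taken from the site.

```python
def outcome_shares(wins_a: int, ties: int, wins_b: int) -> dict[str, float]:
    """Return each outcome's share of all matchups as a percentage (one decimal)."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no matchups to summarize")
    return {
        "model_a": round(100 * wins_a / total, 1),
        "ties": round(100 * ties / total, 1),
        "model_b": round(100 * wins_b / total, 1),
    }


# Hypothetical tallies, chosen only to illustrate the calculation.
shares = outcome_shares(wins_a=0, ties=1, wins_b=2)
print(shares)
```

Under this reading the three shares always account for every matchup, so a model with no outright wins shows 0.0% even when some matchups ended in ties.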
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
DALL-E 2
- + Features a small cube-like object on a wooden surface.
- + Correctly incorporates a blue color element and a red color element.
- − Failed to place the sphere inside the cube; it appears as a red square inside a translucent block.
- − The plant is in a blue pot rather than being behind and visible through the glass.
- − Missing the red book sitting on top of the cube.
Recraft V4
- + Excellent prompt adherence, correctly placing each object in the specified spatial relationship.
- + High visual quality with realistic textures on the book, wood, and glass.
- + Accurate lighting coming from the window on the left as requested.
- − The blue sphere appears to be floating mid-air inside the cube rather than resting on the bottom.
- − The book is slightly larger than the cube, creating a minor balance issue.
Verdict: Recraft V4 successfully followed every detail of the complex spatial prompt, including the red book on top and the plant visible through the glass. DALL-E 2 failed to render most of the requested objects correctly, confusing the colors and missing the book entirely. Recraft V4 is the clear winner for its superior composition, realism, and logical consistency.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
DALL-E 2
- + Successfully captured heavy bokeh and imperfect framing
- + Accurate reflections on the wet pavement
- − The subject is completely unrecognizable as an elderly Japanese man
- − Too blurry to distinguish any detail or 'natural skin texture' as requested
- − Composition is confusing and lacks a clear focal point
Recraft V4
- + Excellent adherence to all prompt elements including the subject's ethnicity and age
- + Perfect execution of motion blur from passing cars and rain effects
- + High visual quality with realistic skin textures and lighting
- − Slightly more 'staged' than 'candid' in its clean composition
- − The bicycle fork/tire connection has some minor structural inconsistencies
Verdict: Recraft V4 followed the complex prompt instructions perfectly, delivering a cinematic and detailed image that included the specific subject, motion blur, and rain. DALL-E 2 produced a highly abstracted and blurry image that failed to depict the elderly man or the specific scene requested in a meaningful way.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
DALL-E 2
- + Successfully captures a gritty, battle-worn texture on the metallic surfaces.
- − The image is highly distorted and lacks coherent anatomy or facial features.
- − It fails to follow most of the prompt instructions, including braided hair and lifelike eyes.
- − Low visual quality with significant artifacts and a messy composition.
Recraft V4
- + Excellent adherence to all prompt details including braided hair with beads and ornate engraving.
- + High visual clarity with realistic skin textures, scars, and dirt.
- + Effective use of lighting and bokeh sparks to create a cinematic atmosphere.
- − The hair braids appear slightly disjointed where they meet the armor.
Verdict: Recraft V4 followed every detail of the prompt with exceptional clarity, delivering a cinematic and lifelike character portrait. In contrast, DALL-E 2 produced an abstract, distorted mess that was largely unrecognizable and failed to include basic elements like eyes or distinct braids.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
DALL-E 2
- + Strong bold sans-serif typography for headings
- + Creative use of geometric shapes to frame food elements
- − Text is illegible gibberish
- − Food images are distorted and unrecognizable
- − Does not follow the requested sections for Appetizers/Pizza/Mains
Recraft V4
- + Excellent adherence to all prompt instructions, including specific categories
- + Perfectly legible text including menu items, descriptions, and prices
- + Clean, high-quality food photography in an organized grid
- − Composition is somewhat utilitarian and basic
- − Visual style is slightly more corporate than 'vibrant'
Verdict: Recraft V4 far exceeds DALL-E 2 by providing a functional, professional menu with perfectly legible text and clear sections as requested. DALL-E 2 fails the prompt by producing distorted imagery and nonsensical text that does not function as a menu.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
DALL-E 2
- + Captures the glowing, fiery atmosphere well.
- + Good sense of energy and motion in the composition.
- − Text is heavily garbled and misspelled.
- − Low photorealistic detail; looks more like a digital painting or illustration.
- − Missing key required text elements and price details.
Recraft V4
- + Excellent text rendering with no spelling errors.
- + High photorealistic detail in the food textures like the patty and lettuce.
- + Followed all prompt instructions including the price starburst and background effects.
- − The white plate at the bottom is slightly distracting from the 'suspended' effect.
- − The lighting on the burger is a bit flat compared to the fiery background.
Verdict: Recraft V4 is the clear winner as it successfully rendered all text elements accurately and maintained high photorealistic quality. DALL-E 2 failed to produce legible text and the burger components lack the clarity and detail found in the Recraft V4 output.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
DALL-E 2
- + Handwritten chalk texture is visible.
- − Text is complete gibberish and fails all prompt requirements.
- − Poor resolution and lack of context.
- − Failed to render any of the specific requested menu items or dates.
Recraft V4
- + Perfect text rendering of the complex prompt requirements.
- + Exceptional visual quality and realistic café environment.
- + Followed all instructions including the specific date and menu prices.
- − The font is slightly too uniform to be truly 'handwritten', leaning toward a digital chalk-style font.
- − Missed the 'elegant cursive' requirement for the title, using a print-style script instead.
Verdict: Recraft V4 is the clear winner as it successfully rendered every piece of text and menu item requested within a high-quality, coherent scene. DALL-E 2 produced an unusable image containing only illegible scribbles that bear no resemblance to the prompt's content.
The Reversed Rodeo
Text-to-Image: “Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”
AI Judge Analysis
DALL-E 2
- + Matches the specific logic of the prompt where the horse is on top of the astronaut.
- − Lower visual fidelity with visible noise and grain.
- − Anatomical distortions in the horse's legs and the astronaut's body.
Recraft V4
- + Excellent high-resolution rendering and cinematic lighting.
- + Strong composition with dynamic asteroid elements and clear details.
- − Failed the core prompt instruction of 'horse on top, not vice versa'.
- − Standard, clichéd interpretation instead of the requested surreal reversal.
Verdict: This is a case of prompt adherence versus visual quality. DALL-E 2 successfully followed the difficult surreal instruction of placing the horse on top of the astronaut, whereas Recraft V4 ignored the specific constraint and produced a standard (though high-quality) astronaut on a horse. Because the prompt specifically emphasized 'horse on top, not vice versa', DALL-E 2 is the technical winner for prompt adherence despite its age and lower resolution.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
DALL-E 2
- − Completely failed to follow the prompt.
- − Generated an image of a black handbag instead of a taxi scene.
- − Irrelevant output.
Recraft V4
- + Excellent adherence to all prompt details including the capybara's clothing and expression.
- + High visual quality with realistic lighting and textures.
- + Accurately rendered the background of Manhattan and the bored facial expression of the passenger.
- − The transition between the capybara's fur and the jacket collar is slightly blurred.
Verdict: DALL-E 2 provided a completely irrelevant image of a black handbag, failing every aspect of the prompt. Recraft V4, on the other hand, followed the prompt perfectly, delivering a high-quality, photorealistic image that captured the specific characters, setting, and mood requested.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
DALL-E 2
- + Captures a strong hand-drawn vintage aesthetic
- + Expressive gothic-style lettering
- − Text is largely illegible and fails to follow the specific instructions
- − Missing key visual elements like the central jack-o-lantern and the web/thorn border
- − Low resolution with significant AI artifacts
Recraft V4
- + Flawless adherence to all text requirements including date, time, and location
- + Highly detailed visual elements like the glowing jack-o-lantern, webs, and bats
- + Professional cinematic lighting and composition
- − The 'scroll banner' is a bit small and simplistic compared to the rest of the design
Verdict: Recraft V4 is the clear winner as it followed every instruction perfectly, rendering precise text and all requested visual elements with high clarity. DALL-E 2 produced a stylized but unusable image that failed to include the specific details and readable text required for an invitation.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
DALL-E 2
- + Clean isometric perspective
- + Vibrant colors on a solid blue background
- − Failed to render all requested text, showing only partial 'Sush'
- − Unrecognizable and unappealing sushi shapes
- − Missing Japan flag icon
Recraft V4
- + Perfect adherence to text and icon requirements
- + Excellent realistic PBR materials for the sushi and ice
- + Beautiful diorama-style composition with high clarity
- − None notable
Verdict: Recraft V4 followed every instruction perfectly, including complex text rendering, the flag icon, and the specific diorama request. DALL-E 2 failed on text, iconography, and the basic visual quality of the food items.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
DALL-E 2
- + Captures a sense of dynamic movement with the puppy's pose.
- + Includes a butterfly and some foliage as requested.
- − Serious anatomical distortions and 'melting' on the smaller animals.
- − Low resolution with significant AI artifacts throughout the image.
- − Missing the specific red fox and bunny as distinct, recognizable creatures.
Recraft V4
- + Excellent adherence to the prompt, including all four specific animals.
- + Superior visual quality with ultra-detailed fur and clear, expressive eyes.
- + Beautiful lighting with visible god rays and dew sparkles in a lush meadow.
- − The kitten's tail area is slightly ambiguous in the huddle.
- − The large butterfly in the extreme bottom right foreground is slightly blurred.
Verdict: Recraft V4 followed every detail of the prompt, successfully rendering a golden retriever, tabby kitten, bunny, and fox kit with high clarity and beautiful lighting. DALL-E 2 struggled significantly with the composition, producing distorted, unrecognizable shapes for the smaller animals and failing the 'hyper-photorealistic' requirement.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
DALL-E 2
- + Matches the warm brown and cream color palette
- + Includes a cloche icon
- − Text is complete gibberish and does not follow the prompt
- − Execution is messy with low-quality, distorted edges
- − Lacks the requested banner element
Recraft V4
- + Perfect text rendering for both the name and the date
- + High-quality vector aesthetic with professional shading
- + Excellent typography that matches the vintage theme
- − The 'Est. 1720' is in a semi-circle rather than a traditional ribbon banner
- − Slight misalignment on the top steam line
Verdict: Recraft V4 is the clear winner as it successfully rendered the requested text 'Caffè Florian' and 'Est. 1720' with high-quality typography. DALL-E 2 failed significantly on text rendering, producing illegible characters and a much lower quality graphic overall.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
DALL-E 2
- + Captures a technical, blueprint-like aesthetic
- + Uses the requested color palette effectively
- − Text is nonsensical and contains major spelling errors
- − Fails to show a clear step-by-step progression
- − Composition is cluttered and chaotic
Recraft V4
- + Perfectly follows all six steps of the instruction with matching icons
- + Excellent text rendering and legibility
- + Clean, modern flat-vector style is professionally executed
- − Composition is a bit sparse with significant white space
Verdict: Recraft V4 is the clear winner as it directly followed every specific instruction, including the six-step sequence and icon requirements, with perfect text legibility. DALL-E 2 produced a chaotic image with gibberish text ('ALLPOO APPLOO') and failed to represent the logic of the mission steps.
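The AI judge's verdicts across the thirteen challenges can be tallied with a short script. This is a minimal sketch in which each winner is read off the corresponding "Verdict" paragraph; the Candid Street Photography verdict does not name a winner outright, so assigning it to Recraft V4 is an interpretation. The tally covers the judge's reads only and is separate from the community vote win rates reported earlier.

```python
from collections import Counter

# Winner per challenge as read from each "Verdict" paragraph
# (an interpretation of the text, not data exported from the site).
JUDGE_WINNERS = {
    "Geometric Composition": "Recraft V4",
    "Candid Street Photography": "Recraft V4",
    "Fantasy Warrior": "Recraft V4",
    "Modern Clean Menu": "Recraft V4",
    "Magic Burger Explosion": "Recraft V4",
    "Chalkboard Menu": "Recraft V4",
    "The Reversed Rodeo": "DALL-E 2",
    "The Capybara Taxi Driver": "Recraft V4",
    "The Halloween Invitation": "Recraft V4",
    "Isometric Miniature Diorama Scenes": "Recraft V4",
    "Adorable Baby Animals in Sunny Meadow": "Recraft V4",
    "Vintage Cafe Logo": "Recraft V4",
    "Apollo 11: Journey to Tranquility": "Recraft V4",
}


def tally(winners: dict) -> Counter:
    """Count how many challenges each model won according to the judge."""
    return Counter(winners.values())


counts = tally(JUDGE_WINNERS)
for model, wins in counts.most_common():
    print(f"{model}: {wins} of {len(JUDGE_WINNERS)}")
```

On this reading the judge sides with Recraft V4 in twelve challenges and with DALL-E 2 in one, The Reversed Rodeo, where prompt adherence outweighed visual quality.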
Explore each model
Recraft V4 is Recraft's latest text-to-image generation model, producing high-quality output and supporting various aspect ratios and custom color palettes.