Head to head

DALL-E 3 (OpenAI) vs. Grok Imagine Image Pro (xAI)

Settled by community votes across 11 shared challenges, with an AI judge weighing in on each.

DALL-E 3

18.5 arena score

#35 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.
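The eligibility rule above can be sketched as a simple set intersection. This is a minimal illustration, not the site's actual implementation: the function names and the dictionary layout (category name mapped to arena score) are assumptions, and only the threshold of three shared categories and the two Text-to-Image scores come from this page.

```python
# Threshold stated on the page: at least three shared arena categories.
MIN_SHARED_CATEGORIES = 3


def shared_categories(ratings_a, ratings_b):
    """Return the sorted category names in which both models have a rating."""
    return sorted(set(ratings_a) & set(ratings_b))


def can_show_skill_chart(ratings_a, ratings_b, minimum=MIN_SHARED_CATEGORIES):
    """True when both models are rated in at least `minimum` shared categories."""
    return len(shared_categories(ratings_a, ratings_b)) >= minimum


# Hypothetical data layout; the scores are the Text-to-Image values shown on this page.
dalle_3 = {"Text-to-Image": 18.5}
grok_imagine_image_pro = {"Text-to-Image": 24.8}

print(shared_categories(dalle_3, grok_imagine_image_pro))
print(can_show_skill_chart(dalle_3, grok_imagine_image_pro))
```

With only one shared category, the check fails, which matches the "Not enough comparable category data" message.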

Grok Imagine Image Pro

24.8 arena score

#14 of 44 in Text-to-Image

Vote tally

Where the votes landed

DALL-E 3: 0% win rate

Ties: 0%

Grok Imagine Image Pro: 0% win rate

Shared challenges: 11

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Excellent high-detail rendering of wood grain and glass texture
  • + Cinematic lighting and shadows
  • Failed the spatial instructions by placing the book inside and the sphere on top of the book
  • The 'glass cube' is more of a display case with a heavy wood frame
  • The sphere contains a complex scene not requested in the prompt

Grok Imagine Image Pro

  • + Followed all spatial instructions perfectly including objects inside and on top
  • + Realistic lighting and photographic composition
  • + Accurately rendered a plant behind the cube visible through the glass
  • The text on the book spine is slightly shaky but legible
  • Reflections in the glass panels are slightly inconsistent with the sphere's actual position

Verdict: Grok Imagine Image Pro followed the complex spatial instructions of the prompt perfectly, placing each object exactly where requested. DALL-E 3 failed the prompt significantly by reversing the positions of the objects and creating a heavy display case instead of a simple glass cube.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Strong cinematic atmosphere with beautiful reflections and lighting
  • + Creative use of foreground elements to simulate 'imperfect framing'
  • + Captures the 'light rain' and wet pavement aesthetic very effectively
  • Anatomical issues with the man's feet and proportions
  • The bicycle has structural inconsistencies in the frame and pedals
  • The man appears to be barefoot in the rain, which feels less realistic for the setting

Grok Imagine Image Pro

  • + Excellent realism in skin texture and clothing materials
  • + Accurate depiction of a person using a wrench on a bicycle
  • + Good execution of motion blur on passing vehicles while keeping the subject sharp
  • The 'imperfect framing' prompt is less apparent as the subject is relatively centered
  • Reflections on the pavement are slightly less pronounced than requested

Verdict: Grok Imagine Image Pro produces a much more grounded and realistic image with superior anatomical accuracy and believable details, such as the man wearing appropriate shoes and using a real tool. While DALL-E 3 captures a more 'cinematic' mood with striking reflections and bokeh, it suffers from significant AI artifacts in the man's feet and the bike's construction.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Excellent warm lighting and bokeh spark effects
  • + Highly intricate engraving textures on the armor
  • + Vibrant, cinematic color palette
  • The helmet design looks slightly busy and physically impractical
  • Skin texture appears a bit smoothed over or 'airbrushed'

Grok Imagine Image Pro

  • + Very realistic skin texture with lifelike dirt and scars
  • + Clever addition of legible Latin text on the collar piece
  • + Distinct braiding and beadwork that perfectly matches the prompt
  • Background fire is a bit distracting compared to the requested bokeh
  • The transition between the hair and the sparks is slightly harsh

Verdict: Both models followed the prompt exceptionally well, but Grok Imagine Image Pro produced a more grounded and realistic interpretation with superior skin texture and clear details like the Latin text. DALL-E 3 created a more stylized, cinematic image with beautiful lighting, though it looks more like a 3D render than a lifelike photograph.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Features a complex, professional grid layout that resembles a branding case study
  • + Includes vibrant accent blocks which align with the casual dining prompt
  • + High variety of food imagery types including overhead and close-up shots
  • The text is completely illegible gibberish
  • The presentation as four separate booklets/pages is less practical than a single menu sheet
  • Food images contain some AI artifacts and odd colors

Grok Imagine Image Pro

  • + Excellent adherence to the grid layout requirement
  • + High-quality, appetizing food photography with consistent lighting
  • + Text is largely legible and names specific dishes requested in the sections
  • The font is slightly more stylized/playful than a standard 'bold sans-serif'
  • Some spelling errors in descriptions (e.g., 'Avucado', 'Pepperani')
  • Layout is a bit repetitive across the three sections

Verdict: Grok Imagine Image Pro is the clear winner as it produces a functional, legible menu with high-quality food photography that accurately follows the requested sections. DALL-E 3 creates an aesthetically pleasing brand collage, but fails on text legibility and practical utility.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Excellent chalk texture and artistic aesthetic
  • + Warm, inviting café lighting and atmosphere
  • Numerous spelling errors including 'Trufle', 'Occtus', and 'Grililled'
  • Prices are nonsensical, showing '$234' for a mushroom dish
  • Does not follow the requested menu items accurately

Grok Imagine Image Pro

  • + Perfect text rendering with zero spelling errors
  • + Accurately includes all requested menu items and prices
  • + Very realistic chalk handwriting with natural variations as requested
  • Composition is a bit plain compared to the artistic flair of the other model
  • The bottom line of text appears slightly like a digital font compared to the main items

Verdict: Grok Imagine Image Pro is the clear winner as it followed the complex text instructions perfectly, including specific spelling and pricing. DALL-E 3 produced a more visually pleasing and artistic board, but failed significantly on spelling and adherence to the specific menu items provided in the prompt.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Strong cinematic lighting and atmosphere
  • + Excellent integration of horse and rider with cosmic clouds
  • Fails to follow the specific spatial instruction of horse on top of the astronaut

Grok Imagine Image Pro

  • + Successfully adheres to the 'horse on top' instruction
  • + Vibrant colors and sharp details on the planetary background
  • Anatomy of the horse legs is slightly distorted
  • Composition feels somewhat cluttered compared to the other model

Verdict: While DALL-E 3 produces a more cohesive and visually pleasing cinematic scene, it fails the logic test of the prompt. Grok Imagine Image Pro successfully follows the difficult instruction to place the horse on top of the astronaut, making it the winner for prompt adherence despite some anatomical flaws.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Strong cinematic lighting with rich textures on the capybara's fur
  • + Detailed dashboard and interior taxi components
  • + Excellent background depth and atmospheric bokeh
  • Failed to include the requested passenger in the back seat
  • The capybara's hat is black/blue rather than the requested yellow

Grok Imagine Image Pro

  • + Followed all prompt instructions including the bored passenger looking at a phone
  • + Accurately colored yellow cap and dark jacket on the driver
  • + Excellent composition that showcases both the driver and the back-seat passenger
  • The capybara's paws look a bit claw-like and slightly unnatural on the wheel
  • The taxi interior feels a bit more generic compared to the detail in DALL-E 3's image

Verdict: While DALL-E 3 produced a highly detailed and texturally rich image, it completely ignored the instruction to include a passenger. Grok Imagine Image Pro successfully followed all prompt requirements, perfectly capturing the juxtaposition of a professional capybara driver and a bored, normal passenger.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Excellent 3D miniature diorama feel
  • + Vibrant colors and consistent lighting
  • + Creative interpretation of isometric 3D styling
  • Failed to place the requested text at the top-center
  • Missing the specific 'SUSHI' text
  • Sushi rice structure looks more like beads than rice grains

Grok Imagine Image Pro

  • + Perfect adherence to text placement instructions
  • + Clean, professional PBR-style textures
  • + Accurate 45-degree isometric perspective
  • Composition is slightly off-center vertically
  • Lacks the 'tiny flag' element in the graphic style intended
  • The wood grain on the base has some stretching artifacts on the edges

Verdict: Grok Imagine Image Pro followed the complex layout instructions much better than DALL-E 3, specifically regarding the text placement and content. While DALL-E 3 created a more charming 3D model, its failure to include the required text at the top makes it less accurate to the prompt than Grok Imagine Image Pro.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Captures the 'god rays' and 'wholesome vibe' effectively with soft, warm lighting.
  • + Highly detailed fur texture on all animals.
  • Very stylized and cartoonish eyes, failing the 'hyper-photorealistic' part of the prompt.
  • Hallucinates 'butterfly-animal' hybrids that look surreal and bizarre.

Grok Imagine Image Pro

  • + Excellent adherence to 'hyper-photorealistic' with natural anatomy and realistic eyes.
  • + Dynamic 'tumbling' and 'chasing' poses that feel more active and playful.
  • + Beautiful, diverse wildflower meadow with clear dew sparkles.
  • Included two kittens instead of the one requested tabby kitten.
  • The lighting is slightly flatter compared to the dramatic rays in the other model.

Verdict: While DALL-E 3 creates a magical atmosphere, it fails the realism requirement by generating cartoonish facial features and nonsensical butterfly-mammal hybrids. Grok Imagine Image Pro delivers a much more convincing photorealistic scene with natural-looking animals and superior composition, despite adding an extra kitten.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Excellent texture and woodcut-style detailing
  • + Professional balance of warm brown and cream tones
  • + Correct inclusion of 'Est. 1720' with vintage ornamentation
  • Failed to include the specific name 'Caffè Florian', substituting it with 'COFFEE HOUSE'
  • The steam effect is a bit chunky compared to a minimalist style

Grok Imagine Image Pro

  • + Perfect adherence to the requested name 'Caffè Florian'
  • + Clean minimalist aesthetic that fits a modern vector logo
  • + Accurate representation of all prompt elements including the banner
  • The cloche dome is slightly off-center within the circular border
  • The typography feels a bit basic and lacks 'vintage' character compared to DALL-E 3's design

Verdict: While DALL-E 3 produced a more visually rich and texturally interesting vintage design, it failed the primary identification task by replacing the requested brand name. Grok Imagine Image Pro successfully included all text elements and followed a cleaner minimalist vector style, making it the better choice for brand accuracy despite having less stylistic flair.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

DALL-E 3
Grok Imagine Image Pro

AI Judge Analysis

DALL-E 3

  • + Captures a strong retro-artistic aesthetic consistent with 1960s space age posters.
  • + High level of visual detail and complexity across three variations.
  • Fails to follow the specific 6-step chronological sequence requested.
  • Incorrectly includes space shuttles, which were not part of the Apollo missions.
  • Text is mostly illegible gibberish.

Grok Imagine Image Pro

  • + Perfectly adheres to the requested 6-step infographic structure with matching icons.
  • + Text is highly legible and correctly identifies mission phases and crew members.
  • + Clean, modern vector aesthetic that aligns exactly with the prompt's stylistic requirements.
  • Composition is a bit sparse with significant empty gray space.
  • Iconography is somewhat basic compared to the artistic flair of the other model.

Verdict: Grok Imagine Image Pro is the clear winner as it followed every specific instruction, including the 6-step sequence, specific icon requests, and legible text. DALL-E 3 produced visually interesting posters but failed on the prompt's logical requirements, including incorrect spacecraft (shuttles) and nonsensical text.
