DALL-E 3 (OpenAI) vs. Qwen Image 2.0 (Alibaba)

Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.

DALL-E 3

18.5 arena score

#35 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Qwen Image 2.0

19.8 arena score

#32 of 44 in Text-to-Image

Vote tally

Where the votes landed

DALL-E 3: 100.0% win rate

Ties: 0.0%

Qwen Image 2.0: 0.0% win rate

Shared challenges 2
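
As a rough illustration of how percentages like the ones in this tally could be derived, here is a minimal sketch. It assumes each vote is recorded as either a model name or the string "tie"; the function name `tally` and the sample ballots are hypothetical, since the page does not publish raw vote counts or describe its own method.

```python
from collections import Counter


def tally(votes, model_a, model_b):
    """Return (a_win_pct, tie_pct, b_win_pct), each rounded to one decimal."""
    if not votes:
        raise ValueError("no votes to tally")
    counts = Counter(votes)
    total = len(votes)

    def pct(n):
        # Share of all ballots, as a percentage.
        return round(100.0 * n / total, 1)

    return pct(counts[model_a]), pct(counts["tie"]), pct(counts[model_b])


# Hypothetical ballots, chosen only to reproduce a 100% / 0% / 0% split.
votes = ["DALL-E 3", "DALL-E 3", "DALL-E 3"]
print(tally(votes, "DALL-E 3", "Qwen Image 2.0"))  # (100.0, 0.0, 0.0)
```

With very few ballots, a split like 100% to 0% is easy to reach, so a tally of this kind says little on its own about how far apart the two models are.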

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

DALL-E 3

Qwen Image 2.0

Votes: DALL-E 3 100% wins, 0% ties, Qwen Image 2.0 0% wins

AI Judge Analysis

DALL-E 3

+ Excellent cinematic lighting and atmospheric depth
+ Surreal composition with consistent glowing nebula elements
+ Strong adherence to the 'cinematic' and 'detailed' descriptors

− Failed the specific spatial instruction (astronaut is riding the horse, not horse riding the astronaut)

Qwen Image 2.0

+ High textural detail on the astronaut suit and horse mane
+ Creative scaly texture on the horse adding to the surreal theme
+ Very sharp resolution and clear foreground focus

− Failed the specific spatial instruction (astronaut is riding the horse, not horse riding the astronaut)
− The horse's legs have anatomical inconsistencies near the joints

Verdict: Both DALL-E 3 and Qwen Image 2.0 suffered from 'semantic bleaching': each ignored the explicit spatial constraint that the horse be on top of the astronaut and drew the astronaut riding the horse instead. DALL-E 3 is visually superior, with better cinematic lighting and a more cohesive surreal environment, whereas Qwen Image 2.0 feels more like a standard stock image whose sharper detail is confined to the foreground.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

DALL-E 3

Qwen Image 2.0

AI Judge Analysis

DALL-E 3

+ Excellent 3D depth and complex layered composition
+ Very high artistic detail in the border, thorns, and webs
+ Moody and cinematic lighting that creates a high-end feel

− Rendered text is mostly gibberish
− The jack-o-lantern is very small in the composition

Qwen Image 2.0

+ Perfect text accuracy for all requested fields including the date and location
+ Strong adherence to all prompt elements including the scroll banner
+ Clear and legible gothic typography

− The composition is a bit flat and less cinematic than requested
− The jack-o-lantern and bats look slightly more generic than DALL-E 3's more artistic treatment

Verdict: DALL-E 3 produces a far more visually striking and atmospheric piece of art, but its text is largely illegible, which makes the invitation unusable. Qwen Image 2.0 followed every text instruction exactly, making it the better choice for a functional invitation despite its flatter, less complex visual style.