Grok Imagine Image Pro (xAI) vs. Qwen Image 2.0 (Alibaba)

Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.

Grok Imagine Image Pro

24.8 arena score

#14 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Qwen Image 2.0

19.8 arena score

#32 of 44 in Text-to-Image

Vote tally

Where the votes landed

Grok Imagine Image Pro

win rate

Ties

Qwen Image 2.0

win rate

Shared challenges: 1

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

[Generated images: Grok Imagine Image Pro, Qwen Image 2.0]

AI Judge Analysis

Grok Imagine Image Pro

+ Perfect adherence to the unusual 'horse on top' request
+ Vibrant, cinematic colors and lighting
+ High detail in the nebula and planetary background

− The horse appears to be floating just above rather than 'riding' in a traditional physical sense

Qwen Image 2.0

+ High textural detail on the horse and space suit
+ Good composition with Earth in the background

− Completely failed the negative constraint/specific instruction of 'horse on top'
− The scales on the horse's neck look a bit muddy and inconsistent

Verdict: The main differentiator was the specific prompt instruction to have the 'horse on top'. Grok Imagine Image Pro followed this surreal request perfectly, creating an interesting and literal interpretation, whereas Qwen Image 2.0 ignored the specific instruction and generated a standard astronaut riding a horse.

Next steps

Explore each model

Grok Imagine Image Pro

xAI

xAI's premium image generation model, offering higher-fidelity output and stronger performance on single-image editing benchmarks than the standard Grok Imagine model.


Qwen Image 2.0

Alibaba

Alibaba's Qwen Image 2.0 model with enhanced text rendering, supporting both Chinese and English prompts and up to 6 images per request.
