The Reversed Rodeo

This competition tests how well AI image models truly understand language versus how much they rely on visual habits from their training data. The prompt is deliberately simple on the surface but devilishly hard in practice. Most models default to the familiar trope of an astronaut riding a horse. By forcing the reversal, we measure three critical capabilities that separate good models from great ones:

  • Strict instruction following (including negations)
  • Accurate subject-object relationships and spatial hierarchy
  • Resistance to strong dataset biases

Voters judged on

  • Horse actually on the back of the astronaut
  • One horse and one astronaut
  • Cinematic atmosphere

The brief

Every model got the same prompt

Prompt
Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.
The leaderboard

Challenge rankings

17 models · ranked by blind vote

  1. GPT Image 2 · score 22.9
  2. Seedream 4.5 · score 17.1
  3. Nano Banana Pro · score 16.0
  4. Qwen Image 2.0 · score 14.9
  5. DALL-E 3 · score 13.4
  6. FLUX.1 [schnell] FP8 · score 12.3
  7. Recraft V4 Pro · score 11.0
  8. Grok Imagine Image Pro · score 10.5
  9. Wan 2.7 · score 9.5
  10. GPT Image 1.5 · score 6.3
  11. Recraft V4 · score 5.7
  12. DALL-E 2 · score 5.0
  13. Stable Diffusion 3.5 Medium · score 3.8
  14. Nano Banana 2 · score 0.0

  • FLUX.2 [dev] Flash · awaiting votes
  • FLUX.2 [dev] · awaiting votes
  • FLUX.2 [dev] Turbo · awaiting votes

Head to head
4 matchups

Notable battles

The matchups worth a second look from this challenge's blind voting: the closest rivalries, the biggest upsets (a lower seed taking down a favorite), and the clashes at the top of the board. Each bar shows how the community split its votes.

The archive

Through time

How The Reversed Rodeo has evolved, release by release.

2022 to 2026 · 5 reigns · 17 models
The Reign Chain
5 reigns

How the crown changed hands

Every model that has held #1 on this challenge, in order.

  1. R01
    Inception
    DALL-E 2
    Apr 2022 to Oct 2023
    reigned 1.5 yrs
  2. R02
    DALL-E 3
    Oct 2023 to Nov 2025
    reigned 2.1 yrs · overtook DALL-E 2
  3. R03
    Gemini 3 Pro Image Preview
    Nov 2025 to Dec 2025
    reigned 1 mo · overtook DALL-E 3
  4. R04
    Seedream 4.5
    Dec 2025 to Apr 2026
    reigned 4 mo · overtook Gemini 3 Pro Image Preview
  5. R05
    Current
    GPT Image 2
    Apr 2026 to today
    reigned 2 mo · overtook Seedream 4.5
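
The reign lengths in the chain above follow from the listed start and end months. As a rough sketch of that arithmetic (not the site's actual method), the Python below recomputes them. The helper names `months_between` and `format_reign` are hypothetical, and "today" is assumed to be Jun 2026, inferred from the 2 mo figure for the current reign.

```python
from datetime import date

# Start and end months copied from the reign chain.
# Assumption: "today" is treated as Jun 2026, inferred from "reigned 2 mo".
REIGNS = [
    ("DALL-E 2", date(2022, 4, 1), date(2023, 10, 1)),
    ("DALL-E 3", date(2023, 10, 1), date(2025, 11, 1)),
    ("Gemini 3 Pro Image Preview", date(2025, 11, 1), date(2025, 12, 1)),
    ("Seedream 4.5", date(2025, 12, 1), date(2026, 4, 1)),
    ("GPT Image 2", date(2026, 4, 1), date(2026, 6, 1)),
]


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def format_reign(months: int) -> str:
    """Show a year or more as fractional years, anything shorter as months."""
    if months >= 12:
        return f"{months / 12:.1f} yrs"
    return f"{months} mo"


for model, start, end in REIGNS:
    print(f"{model}: reigned {format_reign(months_between(start, end))}")
```

Under these assumptions the output agrees with the durations listed for each reign.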

Generation history

17 models · newest first
  1. GPT Image 2 · 3 takes
  2. Wan 2.7 · 3 takes
  3. Recraft V4 · 3 takes
  4. Recraft V4 Pro · 3 takes
  5. 12 more models · Apr 2022 – Feb 2026