The Rubik's Gauntlet
5 models were given the same prompt, and the community voted blind on which outputs looked best.
This prompt is one of the hardest single tests for 2026 SOTA video models because it simultaneously demands extreme fine-motor precision at high speed, long-term physical consistency (the cube must genuinely solve without morphing), and complex multi-element rendering (hyper-detailed skin, sweat, glossy reflections, and dynamic camera movement). These are areas where even top models still frequently break down.
#1 — Sora 2 Pro
Challenge Rankings
| # | Model | Elo |
|---|---|---|
| 1 | Sora 2 Pro | 1224 |
| 2 | Kling V3 Omni Pro | 1191 |
| 3 | | 1117 |
| 4 | | 987 |
| 5 | | 859 |
Sora 2 Pro leads the leaderboard with an Elo of 1224, though second-place Kling V3 Omni Pro posts a higher win rate (85.7%) despite trailing by 33 Elo points. Premium-tier models dominate this high-complexity prompt, and the Elo drop-off between third and fourth place exceeds 120 points (1117 to 987), marking a significant performance divide in handling complex physical consistency.
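To put the rating gaps in perspective, here is a minimal sketch that computes the gap between adjacent ranks and converts a gap into an expected head-to-head win probability. It assumes the conventional logistic Elo formula with a 400-point scale; the page does not say which Elo variant or K-factor it uses, so `expected_score` and its `scale` parameter are illustrative, not the site's actual method.

```python
# Elo ratings as listed in the Challenge Rankings table, best to worst.
elos = [1224, 1191, 1117, 987, 859]


def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the standard logistic Elo model.

    Assumption: 400-point scale, as in conventional Elo. The leaderboard's
    real formula is not documented in this section.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Gap between each pair of adjacent ranks (1 vs 2, 2 vs 3, ...).
gaps = [higher - lower for higher, lower in zip(elos, elos[1:])]
print(gaps)  # [33, 74, 130, 128]

# Expected head-to-head result for rank 1 against rank 2.
p_top = expected_score(elos[0], elos[1])
print(f"{p_top:.1%}")  # only slightly better than a coin flip
```

Under that assumption, a 33-point gap implies only a modest head-to-head edge, which may help explain why a raw win rate and an Elo rank can disagree: win rate counts every vote equally, while Elo also weighs the strength of the opponent in each matchup.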
Elo vs Cost
Elo vs Speed
Speed data is still warming up
We only have enough recent requests for Grok Imagine Video (65.4s average).