Text-to-Video challenges

Every text-to-video challenge in the arena, scored with TrueSkill as the votes come in.
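As a rough illustration of how a TrueSkill-style score can move with each vote, here is a minimal sketch of the standard two-player, no-draw update. It assumes every vote is a pairwise comparison with a clear winner, uses the commonly cited defaults (mu = 25, sigma = 25/3, beta = 25/6), and leaves out the dynamics term that full implementations add before each update. The names `Rating` and `update` are hypothetical; the arena's actual scoring code and parameters are not shown on this page.

```python
import math
from dataclasses import dataclass

# Commonly used TrueSkill defaults (an assumption, not the arena's known settings).
MU0 = 25.0
SIGMA0 = MU0 / 3
BETA = MU0 / 6


@dataclass(frozen=True)
class Rating:
    """Gaussian belief about a model's skill: mean mu, uncertainty sigma."""
    mu: float = MU0
    sigma: float = SIGMA0


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def update(winner: Rating, loser: Rating, beta: float = BETA) -> tuple[Rating, Rating]:
    """Two-player, no-draw update applied after a single vote."""
    c2 = 2 * beta**2 + winner.sigma**2 + loser.sigma**2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # size of the mean shift; larger for upsets
    w = v * (v + t)         # fraction of variance removed, between 0 and 1
    new_winner = Rating(
        winner.mu + winner.sigma**2 / c * v,
        winner.sigma * math.sqrt(1 - winner.sigma**2 / c2 * w),
    )
    new_loser = Rating(
        loser.mu - loser.sigma**2 / c * v,
        loser.sigma * math.sqrt(1 - loser.sigma**2 / c2 * w),
    )
    return new_winner, new_loser


# Two unrated models; the first wins one vote.
model_a, model_b = update(Rating(), Rating())
print(model_a, model_b)
```

Each vote raises the winner's mean, lowers the loser's, and shrinks both uncertainties, so rankings can be refreshed incrementally as votes arrive rather than recomputed from scratch.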

The Rubik's Gauntlet

This prompt is one of the hardest single tests for 2026 state-of-the-art video models because it simultaneously demands extreme fine-motor precision at high speed, long-term physical consistency (the cube must genuinely solve without morphing), and complex multi-element rendering (hyper-detailed skin, sweat, glossy reflections, and dynamic camera movement), all areas where even top models still frequently break down.

Best: Sora 2 Pro
Mid: Seedance 2.0
Worst: Grok Imagine Video

The Soul Gauntlet

This challenge targets one of the hardest remaining frontiers in 2026 video generation, testing whether models can convey genuine human emotion through subtle facial acting, realistic tear physics, and micro-expressions. While many models can create beautiful faces, very few can deliver emotionally convincing performances without looking uncanny or robotic.

Best: Kling V3 Omni Pro
Mid: Sora 2 Pro
Worst: Grok Imagine Video

Neon Rain Reverie

This prompt is exceptionally difficult because it combines complex fluid dynamics (rain, splashing, clinging wet fabric), advanced material simulation (flowing silk and hair in wind), and atmospheric lighting, three areas where even top 2026 models still frequently produce artifacts or unrealistic behavior.

Best: Seedance 2.0
Mid: Grok Imagine Video
Worst: Sora 2 Pro

The Will Smith Spaghetti Challenge

“Will Smith eating spaghetti” has become the unofficial benchmark of generative video for one reason: it is deceptively simple yet brutally revealing. This single prompt exposes weaknesses that flashy action scenes or beautiful landscapes often hide. A model can generate stunning visuals yet still fail spectacularly when asked to make a human being convincingly eat.

Best: Kling V3 Omni Pro
Mid: Wan 2.6
Worst: Sora 2 Pro