ARENA Leaderboard
See how AI image models stack up against each other.
Which model creates the best videos from text?
Ranked by blind votes in side-by-side matchups. Voters watch the videos, not the model names.
Best AI Models for Text To Video
| # | Model | Elo |
|---|---|---|
| 1 | Sora 2 Pro | 1187 |
| 2 | P-Video PrunaAI | 1115 |
| 3 | Grok Imagine Video | 1075 |
As of May 2026, OpenAI’s Sora 2 Pro leads the leaderboard with an Elo of 1187 and a 46.4% win rate, 72 points ahead of its closest rival. PrunaAI’s P-Video holds second place with an Elo of 1115, and xAI’s Grok Imagine Video ranks third with an Elo of 1075 and a lower 17.6% win rate. Although P-Video trails the leader on rating, it stands out on price-to-performance, at 1/15th the cost per generation.
Elo vs Cost
Elo vs Speed
Speed data is still warming up
We only have enough recent requests for Grok Imagine Video (44.6s average).
Challenges
Neon Rain Reverie Text-to-Video
The Soul Gauntlet Text-to-Video
The Rubik's Gauntlet Text-to-Video
FAQ
What is the best AI text to video model?
Based on blind community voting, Sora 2 Pro is currently the #1 ranked AI text to video model with an Elo rating of 1187. Rankings update in real time as new votes come in.
How are AI text to video models ranked on Lumenfall?
Lumenfall Arena ranks AI models through blind community voting. In each matchup, two models generate from the same prompt and voters pick the better result without seeing model names. Votes are processed using TrueSkill, a Bayesian rating algorithm developed by Microsoft Research, and the result is converted into a single Elo score reflecting each model's relative quality.
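To make the voting step concrete, here is a minimal sketch of the standard two-player TrueSkill update for a single vote with no draws. It is an illustration, not Lumenfall's actual implementation: the `Rating` class, the `update_1v1` function, and the starting values (mu = 25, sigma = 25/3, beta = 25/6, which are TrueSkill's conventional defaults) are assumptions, and the sketch omits the small per-match uncertainty increase that full TrueSkill applies.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist

# Assumed parameter: TrueSkill's conventional performance noise (beta).
BETA = 25 / 6
_STD_NORMAL = NormalDist()  # standard normal, mean 0 and stdev 1


@dataclass(frozen=True)
class Rating:
    """Hypothetical container for a model's skill estimate."""
    mu: float = 25.0        # estimated skill (assumed default)
    sigma: float = 25 / 3   # uncertainty (assumed default)


def update_1v1(winner: Rating, loser: Rating, beta: float = BETA):
    """Return updated (winner, loser) ratings after one decisive vote."""
    # Total spread of the performance difference between the two models.
    c = sqrt(2 * beta**2 + winner.sigma**2 + loser.sigma**2)
    t = (winner.mu - loser.mu) / c
    # v grows when the result is surprising; w controls how much sigma shrinks.
    # Note: cdf(t) can underflow for extremely lopsided matchups.
    v = _STD_NORMAL.pdf(t) / _STD_NORMAL.cdf(t)
    w = v * (v + t)
    new_winner = Rating(
        mu=winner.mu + winner.sigma**2 / c * v,
        sigma=winner.sigma * sqrt(1 - winner.sigma**2 / c**2 * w),
    )
    new_loser = Rating(
        mu=loser.mu - loser.sigma**2 / c * v,
        sigma=loser.sigma * sqrt(1 - loser.sigma**2 / c**2 * w),
    )
    return new_winner, new_loser


if __name__ == "__main__":
    a, b = update_1v1(Rating(), Rating())
    print(a, b)
```

In this sketch, an upset (a lower-rated model beating a higher-rated one) moves both estimates further than an expected result does, and every vote reduces the uncertainty of both models.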
What is an Elo rating for AI models?
An Elo rating is a numerical score representing a model's skill relative to other models. Under the hood, Lumenfall uses TrueSkill, which tracks two values per model: mu (estimated skill) and sigma (uncertainty). The displayed Elo is calculated as 1000 + 10 × (mu − 3 × sigma), a conservative lower bound. A model must prove itself consistently across many matchups to earn a high rating.
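The displayed-score formula can be sketched in a few lines. The function name `displayed_elo` and the example values are hypothetical; only the formula itself comes from the text above. The first example assumes a new model starts at TrueSkill's conventional defaults (mu = 25, sigma = 25/3), which may not match Lumenfall's configuration.

```python
def displayed_elo(mu: float, sigma: float) -> float:
    """Displayed score: 1000 + 10 * (mu - 3 * sigma), a conservative lower bound."""
    return 1000 + 10 * (mu - 3 * sigma)


if __name__ == "__main__":
    # Assumed starting rating: mu - 3 * sigma is 0, so the score sits at 1000.
    print(round(displayed_elo(25.0, 25 / 3), 6))
    # Hypothetical model: same mu, but less uncertainty gives a higher score.
    print(round(displayed_elo(25.0, 2.0), 6))
```

Because three sigmas are subtracted from mu, two models with the same estimated skill can show different scores: the one with more votes, and therefore lower sigma, ranks higher.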