ARENA Leaderboard
See how AI image models stack up against each other.
Which model makes the best edits?
Same source image, same instruction, blind community votes. See which models handle edits best.
Best AI Models for Image To Video
| # | Model | Elo |
|---|---|---|
| 1 | Wan 2.6 | 1042 |
| 2 | Grok Imagine Video | 991 |
As of June 2026, Alibaba’s Wan 2.6 leads the Image-to-Video leaderboard with an Elo of 1042, a 52-point gap over xAI’s Grok Imagine Video at 990. Wan 2.6 holds a superior 25.0% win rate, but it generates video 32% slower than Grok Imagine Video. Despite this speed deficit, the market favors Wan 2.6 for its cost-efficiency: it delivers top-tier results at a 40% lower price per generation than Grok.
Charts: Elo vs Cost, Elo vs Speed
Challenges
Celebrity Arrival Image-to-Video Cinematic
FAQ
What is the best AI image to video model?
Based on blind community voting, Wan 2.6 is currently the #1 ranked AI image to video model with an Elo rating of 1042. Rankings update in real time as new votes come in.
How are AI image to video models ranked on Lumenfall?
Lumenfall Arena ranks AI models through blind community voting. In each matchup, two models generate from the same prompt, and voters pick the better result without seeing the model names. Votes are processed with TrueSkill, a Bayesian rating algorithm developed by Microsoft Research, and the result is condensed into a single Elo score reflecting each model's relative quality.
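To make the voting step concrete, here is a minimal sketch of how one blind vote could update two models' ratings under the standard two-player TrueSkill equations. It is not Lumenfall's actual implementation: the starting values (mean 25, uncertainty 25/3), the `BETA` constant, and the names `Rating` and `record_vote` are assumptions taken from common TrueSkill defaults, and the sketch ignores draws and the small per-match noise term the full algorithm adds.

```python
import math
from dataclasses import dataclass


@dataclass
class Rating:
    mu: float = 25.0          # estimated skill (assumed default)
    sigma: float = 25.0 / 3   # uncertainty about that skill (assumed default)


BETA = 25.0 / 6  # per-match performance noise (assumed default)


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def record_vote(winner: Rating, loser: Rating) -> tuple[Rating, Rating]:
    """Return updated (winner, loser) ratings after one vote, with no draws."""
    c = math.sqrt(2 * BETA**2 + winner.sigma**2 + loser.sigma**2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # size of the mean shift; larger for upsets
    w = v * (v + t)         # how much the uncertainty shrinks

    new_winner = Rating(
        mu=winner.mu + winner.sigma**2 / c * v,
        sigma=winner.sigma * math.sqrt(1 - winner.sigma**2 / c**2 * w),
    )
    new_loser = Rating(
        mu=loser.mu - loser.sigma**2 / c * v,
        sigma=loser.sigma * math.sqrt(1 - loser.sigma**2 / c**2 * w),
    )
    return new_winner, new_loser


# Two unrated models meet; voters prefer the first one.
model_a, model_b = record_vote(Rating(), Rating())
```

After the vote, the winner's estimated skill rises, the loser's falls, and both uncertainties shrink because the system has learned something about each model.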
What is an Elo rating for AI models?
An Elo rating is a numerical score representing a model's skill relative to other models. Under the hood, Lumenfall uses TrueSkill, which tracks two values per model: mu (estimated skill) and sigma (uncertainty). The displayed Elo is calculated as 1000 + 10 × (mu − 3 × sigma), a conservative lower bound on the model's skill. Because the uncertainty is subtracted, a model must prove itself consistently across many matchups to earn a high rating.
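The formula above can be worked through in a few lines. This is a sketch of the stated calculation only: the function name `displayed_elo` and the sample mu and sigma values are hypothetical, chosen for illustration rather than taken from the leaderboard.

```python
def displayed_elo(mu: float, sigma: float) -> float:
    """Conservative displayed rating: 1000 + 10 * (mu - 3 * sigma)."""
    return 1000 + 10 * (mu - 3 * sigma)


# Hypothetical models with the same estimated skill but different uncertainty.
few_votes = displayed_elo(mu=28.0, sigma=4.0)    # 1000 + 10 * (28 - 12) = 1160
many_votes = displayed_elo(mu=28.0, sigma=1.0)   # 1000 + 10 * (28 - 3)  = 1250
```

Both hypothetical models have the same estimated skill, but the one with fewer votes (higher sigma) displays a lower score, which is the sense in which the rating is a conservative lower bound.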