Text-to-Image leaderboard
Models ranked by blind head-to-head votes. Scores are Elo ratings and update as new matchups complete.
Generation highlights
How the top models compare
Best Text-to-Image Models by Price
1 model without pricing omitted.
Best Text-to-Image Models by Speed
8 models waiting for enough speed data.
Best AI Models for Text-to-Image
| # | Model | Elo |
|---|---|---|
| 1 | Nano Banana 2 | 1285 |
| 2 | GPT Image 2 | 1284 |
| 3 | Nano Banana Pro | 1282 |
| 4 | FLUX.2 [dev] Turbo | 1272 |
| 5 | | 1270 |
| 6 | | 1270 |
| 7 | | 1268 |
| 8 | | 1263 |
| 9 | | 1263 |
| 10 | | 1259 |
| 11 | | 1259 |
| 12 | | 1252 |
| 13 | | 1251 |
| 14 | | 1249 |
| 15 | | 1248 |
| 16 | | 1247 |
| 17 | | 1246 |
| 18 | | 1245 |
| 19 | | 1244 |
| 20 | | 1242 |
| 21 | | 1237 |
| 22 | | 1236 |
| 23 | | 1234 |
| 24 | | 1231 |
| 25 | | 1229 |
| 26 | | 1226 |
| 27 | | 1223 |
| 28 | | 1223 |
| 29 | | 1222 |
| 30 | | 1221 |
As of July 2026, Google's Nano Banana 2 leads the arena at 1285 Elo, 1 point ahead of OpenAI's GPT Image 2, even though GPT Image 2 has the higher win rate at 85.7%. The top of the board is tight: only 3 Elo points separate the top three models, with Google's Nano Banana Pro third at 1282. Price matters as well: the budget-friendly FLUX.2 [dev] Turbo ($0.008/img) holds the #4 spot, ahead of several premium-priced models.
Rivalries
Aggregate head-to-head across the arena
Highlighted challenges
The Reversed Rodeo
This competition tests how well AI image models truly understand language versus how much they rely on visual habits from their training data. The prompt is deliberately simple on the surface but devilishly hard in practice. Most models default to the familiar trope of an astronaut riding a horse. By forcing the reversal, we measure three critical capabilities that separate good models from great ones:
- Strict instruction following (including negations)
- Accurate subject-object relationships and spatial hierarchy
- Resistance to strong dataset biases
Geometric Composition
A spatial-reasoning test. Each object has a precise relationship (inside, on top, behind, seen through the glass), so it measures whether a model follows explicit placement instructions and handles transparency and refraction rather than just approximating the scene.
Recent SOTA shifts in Text-to-Image
Full history
FAQ
What is the best AI text-to-image model?
Based on blind community voting, Nano Banana 2 is currently the #1 ranked AI text-to-image model with an Elo rating of 1285. Rankings update in real time as new votes come in.
How are AI text-to-image models ranked on Lumenfall?
Lumenfall Arena ranks AI models through blind community voting. In each matchup, two models generate from the same prompt and voters pick the better result without seeing model names. Votes are processed with TrueSkill, a Bayesian rating algorithm developed by Microsoft Research, and the result is shown as a single Elo score reflecting each model's relative quality.
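As a rough illustration of how one blind vote could move two ratings, here is a minimal sketch of the standard two-player TrueSkill update with no draws. It assumes the published TrueSkill defaults (mu = 25, sigma = 25/3, beta = 25/6) and omits the dynamics factor; Lumenfall's actual parameters and implementation are not stated here, and the `Rating` and `rate_1vs1` names are illustrative only.

```python
import math
from typing import NamedTuple, Tuple


class Rating(NamedTuple):
    mu: float = 25.0           # estimated skill (TrueSkill default, assumed)
    sigma: float = 25.0 / 3    # uncertainty (TrueSkill default, assumed)


BETA = 25.0 / 6  # performance noise (TrueSkill default, assumed)


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def rate_1vs1(winner: Rating, loser: Rating) -> Tuple[Rating, Rating]:
    """Update both ratings after one vote (no draws, no dynamics factor)."""
    c2 = 2 * BETA ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)   # larger when the result was a surprise
    w = v * (v + t)         # how much the uncertainty shrinks
    new_winner = Rating(
        winner.mu + winner.sigma ** 2 / c * v,
        winner.sigma * math.sqrt(1 - winner.sigma ** 2 / c2 * w),
    )
    new_loser = Rating(
        loser.mu - loser.sigma ** 2 / c * v,
        loser.sigma * math.sqrt(1 - loser.sigma ** 2 / c2 * w),
    )
    return new_winner, new_loser


# Two hypothetical, previously unrated models meet in one matchup.
model_a, model_b = rate_1vs1(Rating(), Rating())
```

After the vote, the winner's mu rises, the loser's mu falls by the same amount (both started with equal sigma), and both sigmas shrink because the system has learned something about each model.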
What is an Elo rating for AI models?
An Elo rating is a numerical score representing a model's skill relative to other models. Under the hood, Lumenfall uses TrueSkill, which tracks two values per model: mu (estimated skill) and sigma (uncertainty). The displayed Elo is calculated as 1000 + 10 × (mu − 3 × sigma), a conservative lower bound on skill. Because the uncertainty term is subtracted, a model must prove itself consistently across many matchups to earn a high rating.
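The displayed-score formula above can be written out directly. The sketch below uses hypothetical mu and sigma values, not real arena data, to show how two models with the same estimated skill can display different scores depending on uncertainty; the function name `displayed_elo` is illustrative.

```python
def displayed_elo(mu: float, sigma: float) -> float:
    """Conservative displayed score: 1000 + 10 * (mu - 3 * sigma)."""
    return 1000 + 10 * (mu - 3 * sigma)


# Hypothetical values: same estimated skill, different uncertainty.
newcomer = displayed_elo(mu=30.0, sigma=4.0)  # few matchups, high sigma
veteran = displayed_elo(mu=30.0, sigma=1.0)   # many matchups, low sigma

print(newcomer, veteran)  # prints 1180.0 1270.0
```

The 90-point gap comes entirely from sigma, which is why a newly added model tends to sit lower on the board until it has accumulated enough votes.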
Keep the arena honest
Cast your vote
Pick winners in blind matchups. Every vote nudges the Elo and shapes these rankings.
Suggest a prompt
Got an idea worth testing? Submit a prompt and watch the models battle it out.