State of AI Media Generation

Which AI Makes the Best Images? We Had Humans Vote 3,500 Times to Find Out

Till

State of AI Image Generation, Q1 2026
Blind rankings from 3,500+ human votes across 29 models

There's no shortage of opinions about which AI image model is "best", but there's a severe shortage of data. We built the Lumenfall Arena to fix that. Since February 2026, human voters have judged 3,509 blind, head-to-head matchups between 29 AI image models from 12 organizations.

Street photography, fantasy warriors, logo design, floral mandalas, isometric dioramas. Same prompt, two images, hidden model names. Users picked the one they liked better. Here's what we found.

The rankings: Image generation

| Rank | Model | Creator | Elo | Win Rate | Battles |
|------|-------|---------|-----|----------|---------|
| 1 | Gemini 3.1 Flash Image Preview (Nano Banana 2) | Google | 1299 | 79.7% | 74 |
| 2 | GPT Image 1.5 | OpenAI | 1283 | 69.9% | 216 |
| 3 | Gemini 3 Pro Image Preview (Nano Banana Pro) | Google | 1277 | 64.6% | 175 |
| 4 | FLUX.2 [dev] Turbo | fal | 1269 | 56.4% | 181 |
| 5 | FLUX.2 [max] | Black Forest Labs | 1266 | 58.7% | 167 |
| 6 | FLUX.2 [dev] Flash | fal | 1262 | 57.9% | 126 |
| 7 | Seedream 4.5 | ByteDance | 1261 | 58.9% | 158 |
| 8 | ImagineArt 1.5 (Preview) | Vyro AI | 1260 | 57.1% | 170 |
| 9 | FLUX.2 [pro] | Black Forest Labs | 1255 | 53.8% | 145 |
| 10 | Z-Image Turbo | Alibaba | 1252 | 46.5% | 198 |

Full rankings for all 29 models at https://lumenfall.ai/arena

Five things that stood out for us

  1. Google swept the top spots with their Nano Banana models. Gemini 3.1 Flash Image Preview (Nano Banana 2) is #1 with a 79.7% win rate. Its sibling, Gemini 3 Pro Image Preview (Nano Banana Pro), is #3. The Nano Banana family is clearly leading Google's image generation performance right now. One caveat: Nano Banana 2 has only 74 battles so far, since it's a relatively new model.
  2. The FLUX.2 variants are remarkably close and the cheap ones are winning. Black Forest Labs and fal's distilled versions together hold five of the top nine spots, but the spread between them is razor-thin: just 15 Elo points separate FLUX.2 [dev] Turbo at #4 from FLUX.2 [pro] at #9. The more interesting story is the ordering. fal's budget-friendly [dev] Turbo (0.8¢/image) and [dev] Flash (0.5¢/image) rank higher than the pricier FLUX.2 [max] (3¢) and FLUX.2 [pro] (1.5¢). The cheaper distilled variants aren't just keeping up with the flagship models — they're outperforming them. If you're evaluating FLUX.2, you may not need the premium tier.
  3. GPT Image 1.5 is quietly one of the strongest models in the arena. It sits at #2 overall with a 69.9% win rate across 216 matchups — the most battles of any model in the top three. Despite that, it gets a fraction of the attention that Google's Nano Banana models receive. The Nano Banana launch dominated social media and tech coverage; GPT Image 1.5 just kept winning matchups. It's only 16 Elo points behind the top-ranked Nano Banana 2, and on the cost side it starts at 0.9¢ per image on the low-quality setting — though higher quality tiers cost more. If you're picking a model based on data rather than hype, this one deserves a closer look.
  4. The biggest surprises aren't from the biggest companies. ByteDance's Seedream 4.5 sits at #7 with a 58.9% win rate across 158 battles, wedged between FLUX.2 variants in one of the most competitive parts of the table. Vyro AI's ImagineArt 1.5 holds #8 with 57.1% across 170 battles. Neither company dominates the Western AI image conversation, but both are outperforming brands with much more mindshare. ByteDance's newer Seedream 5.0 Lite has also entered the arena (currently ~#13 with fewer battles so far) and shows early promise, especially in editing.
  5. Image editing rankings look nothing like generation rankings. Our editing leaderboard (1,517 matchups across 16 models) reshuffles the deck:

| Rank | Model | Creator | Elo | Battles |
|------|-------|---------|-----|---------|
| 1 | Gemini 3 Pro Image Preview (Nano Banana Pro) | Google | 1245 | 243 |
| 2 | Qwen Image Edit 2511 | Alibaba | 1230 | 546 |
| 3 | GPT Image 1.5 | OpenAI | 1230 | 190 |
| 4 | FLUX.2 [flex] | Black Forest Labs | 1227 | 102 |
| 5 | Gemini 2.5 Flash Image (Nano Banana) | Google | 1227 | 215 |

FLUX.2 [flex] jumps from 16th in generation to 4th in editing. Alibaba's Qwen Image Edit 2511 is the editing workhorse with a 56.2% win rate; notable given that the corresponding generation model, Qwen Image 2512, sits all the way down at #18 on the text-to-image leaderboard. Generation skill doesn't predict editing skill. Treat them as separate decisions.
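For intuition about how small the Elo gaps in these tables are, the standard logistic Elo formula converts a rating difference into an expected win probability. This sketch assumes the conventional 400-point scale; the post doesn't say what scale the arena uses internally, so treat the numbers as illustrative:

```python
# Convert an Elo gap into an expected win probability using the
# conventional logistic curve with a 400-point scale. Illustrative only:
# the arena's internal scale isn't documented here.

def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# The small generation gap between FLUX.2 [dev] Turbo (1269) and
# FLUX.2 [pro] (1255) implies roughly a 52/48 matchup:
p_turbo = elo_win_prob(1269, 1255)   # ~0.52
```

On this scale, the 14-16 point spreads near the top of the tables translate to near coin-flip matchups, which is why battle counts matter so much for the tighter rankings.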

Organization rankings

Which company has the strongest portfolio overall?

| Organization | Models | Avg Elo | Best Model |
|--------------|--------|---------|------------|
| fal | 2 | 1266 | FLUX.2 [dev] Turbo |
| OpenAI | 2 | 1265 | GPT Image 1.5 |
| Vyro AI | 1 | 1260 | ImagineArt 1.5 |
| Black Forest Labs | 4 | 1249 | FLUX.2 [max] |
| ByteDance | 3 | 1248 | Seedream 4.5 |
| xAI | 2 | 1236 | Grok Imagine Image Pro |
| Google | 6 | 1228 | Nano Banana 2 |

How the arena works

We use TrueSkill, Microsoft's Bayesian rating system, which updates both a model's estimated skill and the system's confidence in that estimate after every matchup. We display the results as Elo scores (starting at 1000) because most people know the scale from chess. As in Elo, beating a higher-rated model earns more points than beating a lower-rated one, but TrueSkill converges faster and handles models with fewer battles more carefully.
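The arena's exact implementation isn't public, but the "beating a higher-rated model earns more points" behavior can be sketched with a plain Elo update. K = 32 is an arbitrary illustrative constant, not Lumenfall's, and real TrueSkill additionally tracks a per-model uncertainty:

```python
# Minimal Elo-style rating update, for intuition only. The real system
# is TrueSkill, which also maintains an uncertainty estimate per model;
# K = 32 here is an illustrative choice.

def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B on the standard 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_winner: float, r_loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return new (winner, loser) ratings after one decisive matchup."""
    gain = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + gain, r_loser - gain

# An upset transfers far more points than a routine win:
upset_gain = update(1000, 1200)[0] - 1000     # ~24 points
routine_gain = update(1200, 1000)[0] - 1200   # ~8 points
```

Note that the update is zero-sum: whatever the winner gains, the loser gives up, so the population average stays put while individual ratings separate.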

Voters see two images generated from the same prompt. They don't know which model made which image. They pick the better one, or call it a tie.

A few numbers on the integrity of the data:

  • Left-side images won 51.0% of decisive (non-tie) votes, close to the 50% you'd expect if position didn't matter.
  • Only 7.8% of matchups were ties, meaning voters could tell the difference most of the time.
  • 20 competitions cover prompts ranging from "Adorable Baby Animals in Sunny Meadow" to "Apollo 11: Journey to Tranquility" to "Vintage Cafe Logo."
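As a rough sanity check on the left-side number, a normal approximation to the binomial shows that 51.0% is well within noise at this sample size. The decisive-vote count below is approximated from the totals stated above (3,509 votes, 7.8% ties):

```python
import math

# Is 51.0% left-side wins a real position bias? Under the null hypothesis
# (position doesn't matter, p = 0.5), compute a z-score for the observed
# proportion. Decisive-vote count is approximated from the post's totals.

total_votes = 3509
decisive = round(total_votes * (1 - 0.078))   # ~3,235 non-tie votes
observed = 0.510

std_err = math.sqrt(0.25 / decisive)          # std. error of a proportion at p = 0.5
z = (observed - 0.5) / std_err                # ~1.14, inside the +/-1.96 band at the 5% level
```

A z-score near 1.1 means the observed left-side edge is the kind of wobble you'd expect from chance alone at this vote count.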

Rankings update in real time at lumenfall.ai/leaderboard.

What to pick if you're building something

If you need the best output and cost isn't the constraint: Gemini 3.1 Flash Image Preview (Nano Banana 2) or GPT Image 1.5. Both are available through lumenfall.ai with a single integration.

If you're watching costs: the FLUX.2 Turbo and Flash variants through fal are cheaper and still land in the top six. Alibaba's Z-Image Turbo is another strong budget pick: it sits at #10 at just 0.5¢ per image, making it one of the best value-for-quality options on the board.

If you need image editing: look at Gemini 3 Pro Image Preview (Nano Banana Pro) and FLUX.2 [flex]. A model that generates well doesn't necessarily edit well. The rankings are different enough that you should treat these as separate decisions.

What's next

This is the first edition. We'll publish a Q2 update in July with more models and more votes. The arena is live and the rankings shift as new votes come in. 3,509 votes from 268 participants is a real start, not a definitive verdict. We're being transparent about sample sizes because we think that makes the data more useful, not less. As the vote count grows, so will our confidence in the tighter matchups.

If you want a say in the rankings: https://lumenfall.ai/arena/vote
If you want to run these models: lumenfall.ai