The Capybara Taxi Driver

Text-to-Image Photorealism

19 models were given the same prompt, and the community voted blind on which outputs looked best.

This challenge seems to be difficult for models because it mixes reality with fiction. Most models struggle to keep the taxi realistic or lose track of instructions, for example by placing the passenger somewhere other than the back seat.

#1 — Seedream 5.0 Lite

Prompt

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

Voters were asked to judge by: photorealism; realistic NYC taxi interior and night atmosphere; positioning of the passenger; capybara wearing a taxi driver cap.

Challenge Rankings

19 models

#	Model	Provider	Price	¢/img	Elo
1	Seedream 5.0 Lite	ByteDance	$$	3.5¢	1225
2	Z-Image Turbo	Alibaba	$	0.5¢	1214
3	GPT Image 2	OpenAI	$$	3.9¢	1210
4	Nano Banana Pro	Google	$$$	6.7¢	1206
5	Wan 2.6	Alibaba	$$	3¢	1204
6	Seedream 4.5	ByteDance	$$	4¢	1158
7	GPT Image 1 Mini	OpenAI	$	0.5¢	1146
8	FLUX.2 [dev] Flash	fal	$	0.5¢	1135
9	Nano Banana 2	Google	$$	2.2¢	1133
10	Wan 2.7 Pro	Alibaba	$$$	7.5¢	1131
11	Qwen Image 2.0 Pro	Alibaba	$$$	7.5¢	1126
12	Recraft V4	Recraft AI	$$	4¢	1115
13	Nano Banana	Google	$$	3.9¢	1113
14	FLUX.2 [dev] Turbo	fal	$	0.8¢	1112
15	Wan 2.7	Alibaba	$$	3¢	1103
16	Grok Imagine Image Pro	xAI	$$$	7¢	1098
17	Recraft V4 Pro	Recraft AI	$$$$	25¢	1087
18	DALL-E 2	OpenAI	$$	1.6¢	991
19	DALL-E 3	OpenAI	$$	4¢	990
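
One way to read the table is to ask which models are not beaten by anything cheaper. The sketch below copies the ¢/img and Elo columns from the rankings above and keeps only the models for which no equally priced or cheaper model has a higher Elo. The names `RANKINGS` and `pareto_frontier` are illustrative, not part of the leaderboard itself.

```python
# (model, cents per image, Elo) copied from the rankings table above.
RANKINGS = [
    ("Seedream 5.0 Lite", 3.5, 1225),
    ("Z-Image Turbo", 0.5, 1214),
    ("GPT Image 2", 3.9, 1210),
    ("Nano Banana Pro", 6.7, 1206),
    ("Wan 2.6", 3.0, 1204),
    ("Seedream 4.5", 4.0, 1158),
    ("GPT Image 1 Mini", 0.5, 1146),
    ("FLUX.2 [dev] Flash", 0.5, 1135),
    ("Nano Banana 2", 2.2, 1133),
    ("Wan 2.7 Pro", 7.5, 1131),
    ("Qwen Image 2.0 Pro", 7.5, 1126),
    ("Recraft V4", 4.0, 1115),
    ("Nano Banana", 3.9, 1113),
    ("FLUX.2 [dev] Turbo", 0.8, 1112),
    ("Wan 2.7", 3.0, 1103),
    ("Grok Imagine Image Pro", 7.0, 1098),
    ("Recraft V4 Pro", 25.0, 1087),
    ("DALL-E 2", 1.6, 991),
    ("DALL-E 3", 4.0, 990),
]


def pareto_frontier(rows):
    """Return models that no cheaper-or-equally-priced model beats on Elo."""
    frontier = []
    best_elo = float("-inf")
    # Walk from cheapest to most expensive; within a price, highest Elo first.
    for name, cents, elo in sorted(rows, key=lambda r: (r[1], -r[2])):
        if elo > best_elo:
            frontier.append((name, cents, elo))
            best_elo = elo
    return frontier


for name, cents, elo in pareto_frontier(RANKINGS):
    print(f"{name}: {cents}¢/img, Elo {elo}")
```

On this challenge only two models survive the filter: Z-Image Turbo at 0.5¢ and Seedream 5.0 Lite at 3.5¢. Every other entry costs at least as much as one of them while scoring lower.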

Seedream 5.0 Lite leads the photorealism challenge with a 100% win rate and 1225 Elo, an 11-point lead over the budget-friendly Z-Image Turbo. Despite the complexity of the scene composition, Z-Image Turbo, at $0.005 per image, outperforms premium models like Nano Banana Pro and Wan 2.7 Pro in both Elo and generation speed.
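
To put the 11-point gap in perspective, the sketch below converts an Elo difference into an expected head-to-head score. The page does not state which rating formula it uses, so the standard logistic Elo curve with a 400-point scale is an assumption here, and `expected_score` is an illustrative helper.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of A against B under the standard Elo logistic curve.

    Assumption: the leaderboard uses the conventional 400-point scale;
    the source page does not confirm this.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Seedream 5.0 Lite (1225) vs. Z-Image Turbo (1214), ratings from the table.
p = expected_score(1225, 1214)
print(f"Expected score for Seedream 5.0 Lite: {p:.3f}")  # about 0.516
```

Under that assumption an 11-point lead corresponds to an expected score of roughly 0.516, so the top two models are close to evenly matched on this prompt.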