Google's latest Imagen 4.0 text-to-image generation model with significantly better text rendering and overall image quality
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
Imagen 4.0 Generate 001
#40 of 44 in Text-to-Image
Not enough comparable category data
The chart appears once both models have ratings across at least three shared arena categories.
Z-Image Turbo
#15 of 44 in Text-to-Image
Where the votes landed
Imagen 4.0 Generate 001
0.0%
win rate
Ties
0.0%
Z-Image Turbo
100.0%
win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Adorable Baby Animals in Sunny Meadow
Text-to-Image
“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Imagen 4.0 Generate 001
- + Excellent depiction of the requested lighting including specific 'god rays' and 'dew sparkles'.
- + Stronger visual composition with all four animals clearly distinct and well-rendered.
- + Higher artistic detail in the fur textures and petal clarity.
- − The style leans slightly towards a '3D render' or digital illustration rather than true 'hyper-photorealistic'.
- − The butterflies look somewhat repetitive in design.
Z-Image Turbo
- + Better captures a natural photographic feel and authentic 'tumbling' motion.
- + Excellent interaction between the animals, particularly the puppy's paw on the bunny.
- − The fox kit's face is slightly distorted and lacks the level of detail seen in the other animals.
- − Missing the specific 'god rays' environmental detail requested in the prompt.
- − The background bokeh is a bit blotchy and less refined than in the Imagen 4.0 Generate 001 image.
Verdict: Imagen 4.0 Generate 001 followed the environmental lighting cues much better, specifically capturing the 'god rays' and 'dew sparkles' while maintaining very high clarity across all four animals. Z-Image Turbo captured the 'playful tumbling' energy more effectively, but it fell short on technical rendering quality and the specific atmospheric requirements of the prompt.
Explore each model
Tongyi-MAI's 6-billion-parameter distilled text-to-image model, optimized for speed: it achieves high-quality generation in 8 steps or fewer and supports bilingual text rendering