OpenAI's previous-generation image model, with higher quality than DALL-E 2 and support for larger resolutions
Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.
DALL-E 3
#35 of 44 in Text-to-Image
Not enough comparable category data
The chart appears once both models have ratings across at least three shared arena categories.
Imagen 4.0 Generate 001
#40 of 44 in Text-to-Image
Where the votes landed
DALL-E 3: 0% win rate
Ties: 0%
Imagen 4.0 Generate 001: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
DALL-E 3
- + Strong magical atmosphere with prominent god rays and a soft, warm glow.
- + High levels of fluffiness and expressive, large eyes that convey a wholesome vibe.
- + Excellent use of lighting to create depth and focus on the characters.
- − Anatomical surrealism where butterflies have animal faces.
- − The kitten is very small and lacks distinct tabby markings compared to the prompt.
- − Fails the 'photorealistic' requirement in favor of a stylized Pixar-like aesthetic.
Imagen 4.0 Generate 001
- + Successfully includes all four animals with accurate markings, including a clear tabby kitten.
- + Better adherence to 'photorealistic' textures while maintaining the playful action requested.
- + Highly detailed meadow with clear dew sparkles on the grass and flowers.
- − The fox kit's proportions and pose feel slightly stiff or taxidermy-like.
- − The lighting is a bit flat compared to the dramatic 'god rays' seen in the other model.
- − Composition is slightly cluttered with large flowers in the foreground blocking the scene.
Verdict: While DALL-E 3 creates a more emotionally evocative and 'magical' image, it fails on technical details by giving the butterflies animal heads and ignoring the tabby pattern. Imagen 4.0 Generate 001 provides a much more accurate interpretation of the prompt's specific subjects and the requested photorealistic style, despite having slightly less dramatic lighting.
Explore each model
Google's latest Imagen 4.0 text-to-image generation model with significantly better text rendering and overall image quality