Imagen 4.0 Ultra Generate 001 vs Stable Diffusion 3.5 Large
Head-to-head across 7 challenges
Imagen 4.0 Ultra Generate 001: 33.3% win rate
Ties: 5.6%
Stable Diffusion 3.5 Large: 61.1% win rate
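The three percentages above sum to 100%, but none of them is a multiple of 1/7, so they presumably count individual judgments rather than the seven challenges. The source does not give the underlying tally. As a minimal sketch of the arithmetic, the hypothetical helper `win_rates` below turns win/tie counts into percentages; the 6 / 1 / 11 split over 18 judgments is an assumption chosen only because it reproduces the published figures.

```python
def win_rates(wins_a: int, ties: int, wins_b: int) -> tuple[float, float, float]:
    """Return (A win %, tie %, B win %), each rounded to one decimal place."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no judgments to score")
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))


# Hypothetical tally (NOT stated in the source): 6 Imagen wins, 1 tie,
# 11 Stable Diffusion wins out of 18 judgments.
imagen_pct, tie_pct, sd_pct = win_rates(6, 1, 11)
print(imagen_pct, tie_pct, sd_pct)
```

Other tallies with the same proportions (for example 12 / 2 / 22 over 36) would give identical percentages, so the figures alone do not fix the number of judgments.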
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Perfect adherence to spatial instructions with the book clearly sitting on top of the cube.
- + Very high visual quality with realistic lighting, textures on the book, and wood grain.
- + Correct interpretation of the plant being behind the cube and visible through the glass refraction.
- − The blue sphere appears to be floating inside the cube without a physical support, which looks slightly surreal.
- − The glass cube is rendered more like a solid block of glass/crystal than a hollow cube.
Stable Diffusion 3.5 Large
- + Clean, modern aesthetic with realistic surface imperfections like dust on the glass.
- + Strong lighting and shadow work that grounds the objects onto the table.
- − Failed the spatial requirement: the book is inside the cube rather than on top of it.
- − The glass cube's edges and corners are slightly inconsistent in how they overlap the book.
- − The plant is mostly to the side/front-left rather than behind the cube as requested.
Verdict: Imagen 4.0 Ultra followed every spatial instruction, placing the red book on top of the glass cube and the plant behind it. Stable Diffusion 3.5 Large missed the primary requirements: it put the book inside the cube and the plant to the side and front of the cube rather than behind it.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent photographic realism with natural skin textures and believable lighting.
- + Superior rendering of fine details like the beads of rain on the jacket and the mechanical parts of the bike.
- + Successfully achieves the requested shallow depth of field and 50mm look.
- − Misses the 'motion blur from passing cars' requirement as the car in the background is sharp.
- − The framing feels a bit too composed despite the 'imperfect framing' request.
Stable Diffusion 3.5 Large
- + Captures the rainy atmosphere well with visible rain streaks and strong ground reflections.
- + Better adherence to the 'imperfect framing' and 'motion blur' aspect of the prompt with the background vehicles.
- − Anatomical issues with the man's hands and arms which look distorted.
- − The image has a visible painterly/CG sheen that contradicts the 'no stylization' request.
- − The light rain looks more like heavy vertical lines rather than realistic precipitation.
Verdict: Imagen 4.0 Ultra produces a much more realistic and high-quality photograph with stunning detail in the skin and clothing textures. Stable Diffusion 3.5 Large follows the 'motion blur' prompt instruction more closely but fails on a technical level with distorted anatomy and a stylized, less realistic finish.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent adherence to the grid layout for food photos.
- + Clean, professional typography that is highly legible as a menu.
- + Appropriate categorization of sections (Appetizers, Pizza, Mains) following the prompt.
- − Nonsense text for dish names and descriptions.
- − Small visual artifacts in some of the smaller food thumbnails.
Stable Diffusion 3.5 Large
- + Bold, stylish 'Menu' title and high-quality food photography.
- + Creative use of a vertical center column with side grid elements.
- + High-resolution textures in the food images.
- − Poor adherence to the requested layout; sections are cramped and illegible.
- − Significant spelling errors in headers (e.g., 'MAIMAES' for Mains, 'APPETIZRS').
- − The grid layout feels less like a functional menu and more like a collage.
Verdict: Imagen 4.0 Ultra successfully captures the essence of a modern minimalist menu design with a practical, clean layout and distinct sections for different food types as requested. While Stable Diffusion 3.5 Large has very high-quality food imagery, its layout is chaotic, the text is cluttered, and it fails to create a professional, usable menu structure.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Perfect adherence to text placement and formatting instructions.
- + Excellent clean, low-poly isometric aesthetic with high clarity.
- + Very accurate 45° top-down perspective and centered diorama.
- − The textures are more plastic-like than 'realistic PBR' materials.
- − The lighting is a bit flat compared to the requested soft refined look.
Stable Diffusion 3.5 Large
- + Good material textures on the salmon and base, showing more depth.
- + Dynamic 3D representation of the flags and vegetation.
- + Vibrant color palette and appealing soft lighting.
- − Failed to place text at 'top-center' as specified, instead attaching it to a flag.
- − The 'Japan' and 'Sushi' text is cluttered and not 'large bold' across the top.
- − Includes excessive garnish/decoration, disregarding the 'minimal' request.
Verdict: Imagen 4.0 Ultra is the clear winner due to its superior adherence to the layout and typography instructions. It correctly placed the text at the top-center of the frame and followed the minimal garnish prompt, whereas Stable Diffusion 3.5 Large integrated the text into the scene on a flag and added much more visual noise than requested.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent depiction of the four specific animal species requested.
- + Strong adherence to lighting effects like god rays and dew sparkles.
- + High clarity and vibrant colors throughout the composition.
- − The style leans more toward a digital illustration than the requested hyper-photorealistic look.
- − The butterfly sizes and placement feel a bit artificial/arranged.
Stable Diffusion 3.5 Large
- + Successfully captures a more photorealistic look with natural motion blur.
- + Dynamic composition showing the animals actually running/chasing.
- + Soft, realistic fur textures and beautiful bokeh.
- − The fox and the kitten look very similar in facial structure, losing some species distinctness.
- − Does not show the 'dew sparkles' or 'god rays' as clearly as the other model.
- − The kitten's ears are somewhat overly pointed, resembling a lynx or fox more than a tabby kitten.
Verdict: Imagen 4.0 Ultra wins on prompt adherence: it rendered the god rays, the dew sparkles, and four distinct animal species, though its style is closer to digital illustration than to photography. Stable Diffusion 3.5 Large achieved a more realistic photographic look and a better sense of motion, but its lighting effects were less clearly visible and its kitten and fox lost some species distinctness.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Perfect adherence to typography with correct spelling and accent.
- + Clean, professional vector emblem execution.
- + Sophisticated use of subtle background texture and warm tones.
- − The steam is very small and lacks visual impact.
Stable Diffusion 3.5 Large
- + Stronger visual emphasis on the steam concept.
- + Creative use of decorative corner elements and ornaments.
- − Spelling error in the main brand name ('Cafféé' instead of 'Caffè').
- − The cloche illustration appears disjointed and lacks the requested minimalism.
- − The banner composition is cluttered with multiple competing elements.
Verdict: Imagen 4.0 Ultra is the clear winner for its professional execution and accurate text rendering. While Stable Diffusion 3.5 Large attempted a more elaborate design, it failed on basic spelling and the minimalist requirement, resulting in a cluttered logo. Imagen 4.0 Ultra successfully captured the vintage minimalist aesthetic with high-quality vector-style lines.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent adherence to the infographic structure and layout requested.
- + Clear legibility of English heading text.
- + Strong thematic color palette and consistent iconographic style.
Stable Diffusion 3.5 Large
- + Highly detailed vector-style illustration.
- + Creative interpretation of space schematics.
- + Beautiful texture and lighting effects on the lunar surface.
- − Failed to provide the requested step-by-step infographic structure.
- − Included a Space Shuttle-style vehicle instead of a Saturn V rocket.
- − Text is largely gibberish and placement is cluttered.
Verdict: Imagen 4.0 Ultra successfully followed the complex prompt instructions, creating a logical six-step infographic with readable headers and a clear visual flow. In contrast, Stable Diffusion 3.5 Large produced a visually stunning poster but failed on almost every technical requirement, including the specific rocket type and the sequential layout.
Imagen 4.0 Ultra Generate 001
Google's Imagen 4.0 Ultra model offering the highest fidelity and resolution for professional-grade image generation
Stable Diffusion 3.5 Large
Stability AI's 8.1-billion-parameter Multimodal Diffusion Transformer (MMDiT) text-to-image model, featuring improved image quality, typography, complex prompt understanding, and resource efficiency