xAI's premium image generation model offering higher fidelity output and stronger performance on single-image editing benchmarks compared to the standard Grok Imagine model
Settled by community votes across 7 shared challenges, with an AI judge weighing in on each.
Grok Imagine Image Pro
#14 of 44 in Text-to-Image
Imagen 4.0 Ultra Generate 001
#28 of 44 in Text-to-Image
Where the votes landed
Grok Imagine Image Pro
33.3%
win rate
Ties
0.0%
Imagen 4.0 Ultra Generate 001
66.7%
win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent adherence to the glass physics, correctly showing the plant behind the cube.
- + Highly realistic texture on the wooden table and sphere.
- + Text on the book is clear and relevant to the prompt.
- − The glass cube looks more like an open glass case rather than a solid or fully enclosed cube.
Imagen 4.0 Ultra Generate 001
- + Strong cinematic lighting and composition.
- + Clean, solid glass cube geometry.
- + The floating sphere adds a creative, surreal touch.
- − Fails to show the green plant through the glass of the cube as requested.
- − The sphere appears to be floating unnaturally compared to the gravity-based arrangement of other items.
Verdict: Grok Imagine Image Pro adhered much better to the specific spatial instructions, successfully rendering the green plant behind the cube and visible through the glass. While Imagen 4.0 Ultra Generate 001 produced a more stylized and polished image, it missed the key prompt requirement of showing the plant through the glass and opted for a less realistic floating sphere.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent composition that captures the 'candid street' atmosphere perfectly.
- + Natural integration of the bicycle and tools with the man's hands.
- + Realistic motion blur and reflections on the wet pavement.
- − The man's skin texture is a bit smooth compared to the requested 'natural skin texture'.
Imagen 4.0 Ultra Generate 001
- + Outstanding natural skin texture and facial detail.
- + Accurately represents the 'light rain' through visible droplets on the jacket and ground.
- + Close-up framing feels very personal and cinematic.
- − The man's hands are interacting awkwardly with the tools, which appear to be floating or disconnected.
- − The motion blur on the background car is less realistic than in Grok Imagine Image Pro's image.
Verdict: Both models followed the prompt well, but Grok Imagine Image Pro produced a more coherent scene with better physical interactions between the subject and the bicycle. While Imagen 4.0 Ultra provided superior skin texture and rain details, the anatomical and mechanical errors in the hands and tools make it less successful as a realistic photograph.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent adherence to sections for appetizers, pizza, and mains.
- + Highly legible headers and structured item descriptions.
- + High-quality, distinct food photography that matches the labels.
- − Repeats the same 'Prosciutto Arugula' description for several items.
- − Text font for descriptions is slightly stylized/jagged rather than a clean modern sans-serif.
Imagen 4.0 Ultra Generate 001
- + Effective use of a modern bold sans-serif font for headers.
- + Clean, minimalist aesthetic with nice color swatch accents.
- − Failed to organize the grid by the requested categories, placing pizza under appetizers and mixed items everywhere.
- − Text is largely gibberish for item names and descriptions.
- − Layout is unbalanced with significant empty space at the bottom right.
Verdict: Grok Imagine Image Pro successfully followed the structural requirements of the prompt, creating distinct and logical sections for the different types of food. While its text descriptions contain repetitive errors, the overall layout is professional and functional. Imagen 4.0 Ultra Generate 001 provides a nice aesthetic but completely fails at logical organization and includes illegible text and a disorganized grid.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent 3D material rendering with realistic wood grain and subsurface scattering on the fish.
- + Clean, professional typography that feels integrated into the scene.
- + Perfectly follows the 45-degree isometric perspective requested.
- − The 'diorama base' is interpreted as a simple wooden plate rather than a distinct architectural base.
- − Texture on the rice grains is slightly repetitive.
Imagen 4.0 Ultra Generate 001
- + Stronger 'diorama' aesthetic with a distinct square raised base as requested.
- + High level of variety in sushi types (Tamago, Gunkan, Nigiri) which adds visual interest.
- + Great clarity and vibrant cartoonish color palette.
- − Black text on blue lacks the premium feel and lighting integration of the other model.
- − The flag icon is a bit large and disjointed from the text layout.
Verdict: Both models followed the prompt exceptionally well, but Grok Imagine Image Pro produces a more polished, high-end 3D render with superior lighting and material realism. Imagen 4.0 Ultra Generate 001 captures the 'diorama' aspect better with its square base, but the overall composition and text styling in Grok's output feel more cohesive and professional.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent depiction of god rays and morning atmospheric haze
- + Includes two kittens which adds to the playfulness
- + Superior integration of animals into the ground plane with realistic shadows
- − The fox kit has slightly strange, human-like paw positioning while on its back
- − Background trees are a bit blurry compared to the foreground sharpness
Imagen 4.0 Ultra Generate 001
- + Beautiful dew sparkles on the blades of grass
- + Very expressive facial features on all animals
- + High contrast and vibrant color palette
- − Missing the 'golden retriever' specific look, appearing more like a generic yellow pup
- − The 'god rays' effects are more stylized and less photorealistic than in Grok Imagine Image Pro's image
- − Composition feels slightly flattened with the animals arranged in a single row
Verdict: Grok Imagine Image Pro wins by providing a more grounded and photorealistic interpretation of the scene, particularly with the lighting and the natural placement of the animals in the meadow. Imagen 4.0 Ultra Generate 001 provides excellent detail on the animals and dew drops, but the lighting and overall composition feel more like a digital illustration than the requested 8K masterpiece photograph.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent typography with correct accents
- + Clean vector-style execution
- + Strong use of requested warm brown and cream tones
- − The 'Est. 1720' is on a geometric shape rather than a banner
- − The cloche is silver/grey which clashes slightly with the warm palette
Imagen 4.0 Ultra Generate 001
- + Perfect adherence to the 'banner' requirement for the date
- + Elegant minimalist linework consistent with a vintage emblem
- + Subtle hachure shading on the cloche adds to the vintage feel
- − The accent on 'CAFÈ' is a bit small and low compared to Grok Imagine Image Pro's rendering
- − Composition feels slightly empty with the large amount of negative space
Verdict: Both models followed the prompt accurately, but Imagen 4.0 Ultra Generate 001 is the winner for its superior interpretation of the 'banner' and 'vintage minimalist' style, using fine-line engraving details. Grok Imagine Image Pro produced a very clean logo, but failed to include an actual banner and used a high-contrast silver cloche that felt less 'vintage' than the monochromatic approach of the other model.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
Grok Imagine Image Pro
- + Perfect adherence to all six requested steps and their specific iconography.
- + Excellent text legibility and spelling for both primary and supporting details.
- + Clean, professional vertical layout that aligns with the 'modern vector infographic' style.
- − The spacing between the top title and the first icon is slightly tight.
- − Minor rendering artifact on the 'Descent' lunar module icon where the capsule meets the legs.
Imagen 4.0 Ultra Generate 001
- + Accurately follows the requested NASA color palette.
- + The central NASA-style logo is well-rendered and visually pleasing.
- − Completely fails to follow the logical 6-step timeline requested.
- − Text is largely gibberish (e.g., 'MIOLLO 11', 'MASED') and repetitive.
- − The layout is cluttered and confusing, not meeting the 'clean' and 'crisp' requirement.
Verdict: Grok Imagine Image Pro successfully followed the complex prompt, accurately depicting all six specific mission steps with correct spelling and iconography. In contrast, Imagen 4.0 Ultra Generate 001 produced a generic, cluttered layout filled with nonsensical text and failed to follow the sequential numbering or specific step descriptions.
Explore each model
Google's Imagen 4.0 Ultra model offering the highest fidelity and resolution for professional-grade image generation