Nano Banana Pro vs Grok Imagine Image
Head-to-head across 17 challenges
Nano Banana Pro: 62.9% win rate
Ties: 11.4%
Grok Imagine Image: 25.7% win rate
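The three percentages do not map onto whole challenges out of 17 (a single challenge would be worth about 5.9%), so they are presumably computed over a larger pool of individual judgments. A minimal Python sketch of the arithmetic follows. The tallies 22, 4, and 9 (35 judgments in total) are an assumption, chosen only because they reproduce the published figures; the page does not state the underlying counts, and other totals could give the same rounded values.

```python
def win_rates(wins_a: int, ties: int, wins_b: int) -> tuple[float, ...]:
    """Return (A win %, tie %, B win %), each rounded to one decimal place."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("need at least one judgment")
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))


# Hypothetical tallies (not given on the page): 22 wins for Nano Banana Pro,
# 4 ties, 9 wins for Grok Imagine Image.
rates = win_rates(22, 4, 9)
print(rates)  # (62.9, 11.4, 25.7)
```

The same function works for any tally, which makes it easy to check whether a different assumed total is also consistent with the three headline numbers.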
Challenge Results
Geometric Composition
Text-to-Image
“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Nano Banana Pro
- + Excellent photorealistic textures on the wooden table and the weathered red book.
- + Perfect adherence to the lighting instruction with realistic shadows and highlights from the left window.
- + Highly realistic rendering of glass refraction and reflections on the sphere and cube surfaces.
- − The plant's pot is technically next to the cube rather than directly behind it, though leaves do overlap.
Grok Imagine Image
- + Correct placement of all objects according to the prompt.
- + Clean, vibrant colors and a high-quality digital look.
- − The blue sphere is levitating unrealistically in the center of the cube.
- − The glass cube lacks realistic thickness and refraction at the edges compared to Nano Banana Pro's output.
- − The plant and pot appear slightly blurred/unfocused in a way that feels less natural.
Verdict: Gemini 3 Pro Image Preview produces a significantly more realistic and tactile image, with naturalistic lighting and weathered textures that make the scene feel authentic. While Grok Imagine followed the spatial instructions well, it suffered from physical inaccuracies like a floating sphere and less convincing glass rendering. Gemini 3 Pro is the clear winner for its superior visual quality and expert handling of light and materials.
Candid Street Photography
Text-to-Image
“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Nano Banana Pro
- + Expertly captures the 'candid street photo' aesthetic with a highly realistic, weathered skin texture on the man.
- + The bicycle mechanics and the man's posture are much more coherent and believable.
- + Excellent environmental details, including realistic raindrop ripples on pavement and authentic Japanese street elements.
- − Lacks the specific 'motion blur from passing cars' requested in the prompt, as the cars appear relatively static.
- − The framing is slightly more balanced than the requested 'imperfect framing'.
Grok Imagine Image
- + Successfully captures the motion blur of a passing car as requested.
- + Good implementation of 'imperfect framing' with a lower, more candid-style angle.
- + Effectively depicts wet pavement reflections and light rain atmospheric effects.
- − The man's face is obscured, and his anatomy and posture feel awkward and less detailed than in Nano Banana Pro's image.
- − The bicycle frame has some structural AI artifacts where it meets the man's hands.
- − The lighting is a bit flat compared to the cinematic depth of the first image.
Verdict: Gemini 3 Pro Image Preview produces a noticeably higher-quality, more realistic image, with incredible detail in the subject's face and the mechanical parts of the bike. While Grok Imagine followed the 'motion blur' instruction more literally, it suffered from poorer subject clarity and internal coherence. Gemini 3 Pro is the winner for its superior photographic realism and convincing street atmosphere.
Modern Clean Menu
Text-to-Image
“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Nano Banana Pro
- + Perfectly adheres to the grid layout request for food photos.
- + Excellent font clarity and professional typographic hierarchy.
- + Coherent and logical sections for Appetizers, Pizza, and Mains.
- − Repeats the same menu items (e.g., Bruschetta) multiple times in the list.
- − Included burgers in the food photos which weren't explicitly requested.
Grok Imagine Image
- + More varied and artistic placement of food images.
- + Includes more unique dish names than Nano Banana Pro.
- − Failed the 'grid' layout requirement for food photos.
- − The fonts are less legible and more prone to AI hallucination artifacts.
- − The layout feels cluttered and less like a 'minimalist' design.
Verdict: Gemini 3 Pro shows a much better grasp of graphic design principles, delivering a clean, professional minimalist layout with the clear grid system the prompt requested. While Grok Imagine has more variety in its menu items, its failure to produce a grid and its messy typography make it less effective for a professional menu design.
Fantasy Warrior
Text-to-Image
“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
Nano Banana Pro
- + Excellent adherence to 'battle-worn' with realistic dirt, grime, and skin texture.
- + The engraving on the armor is intricate and feels part of the metal surface.
- + Highly realistic lighting and sparks that integrate naturally with the subject.
- − The hair braids are a bit thicker than the typical 'small beads' request, though still present.
- − The armor plating on the left shoulder (viewer's right) has slightly inconsistent rivet placement.
Grok Imagine Image
- + Exquisite and fine engraving detail across the entire plate armor set.
- + Successful implementation of shallow depth of field and bokeh elements.
- + Follows the bead and braid request accurately.
- − The skin appears too clean and airbrushed for a 'battle-worn' paladin, despite the superficial scratches.
- − The torches look somewhat artificial, with a flat digital glow rather than naturalistic light.
- − The hair strands floating on the left side (viewer's left) look like digital artifacts.
Verdict: Gemini 3 Pro Image Preview delivers a more convincing 'battle-worn' character with grit and realistic skin texture that matches the paladin archetype better than the more polished, 'fashion-model' look of Grok Imagine. While Grok Imagine has beautiful armor engravings, it fails to capture the weathered storytelling requested in the prompt as effectively as Gemini.
Man and Car in California
Editing
“Make a photo of the man driving the car down the California coastline”
AI Judge Analysis
Nano Banana Pro
- + Excellent source preservation of the man's facial features and specific hairstyle.
- + Faithfully maintained the man's clothing including the plaid coat and scarf.
- + Correctly merged the car model from the first source image into a new environment.
Grok Imagine Image
- + Great scenic composition of the California coastline.
- + Good sense of motion with the blurred road and wheels.
- − Completely failed to use the man from the source image, replacing him with a generic figure.
- − The car model has been significantly altered compared to the source image.
Verdict: Gemini 3 Pro Image Preview successfully completed the image editing task by preserving the identifiable characteristics of both the man and the car from the source images. Grok Imagine failed the core editing requirement by generating a completely different man and a modified car, effectively ignoring the provided source material in favor of a new generation.
Bald man challenge
Editing
“Give the person a full, thick head of natural hair with realistic texture, density, and a natural hairline. Preserve facial features and lighting.”
AI Judge Analysis
Nano Banana Pro
- + Excellent preservation of the source image pixels outside the edit area
- + Realistic wavy texture and volume
- + Matches the lighting and grain of the original photo perfectly
- − The hairline is a bit too straight and high, making it look slightly like a hairpiece
Grok Imagine Image
- + Very natural integration of the hair with the existing sideburns
- + Realistic hair density and natural-looking hairline with fine strands
- + Perfect preservation of facial features and environment
- − None notable
Verdict: Both models performed exceptionally well at local editing, preserving the original person, clothing, and background exactly. Grok Imagine (Model B) is the winner because the hairstyle it generated looks more organic and better integrated with the existing sideburns, whereas Gemini 3 Pro (Model A) produced a hairline that feels slightly detached from the scalp.
Isometric Miniature Diorama Scenes
Text-to-Image
“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Nano Banana Pro
- + Excellent adherence to the 'PBR materials' and 'soft refined textures' request, with realistic rice and fish textures.
- + Higher complexity in the diorama base, including moss and blossom details that enhance the miniature aesthetic.
- + Perfect text rendering and layout that matches the 'bold text' and 'flag icon' instructions.
- − The text 'JAPAN' is slightly off-center compared to the rest of the composition.
Grok Imagine Image
- + Very clean, high-contrast isometric design that strictly follows the 45-degree angle.
- + Bold and clear text rendering that is perfectly centered.
- + Good use of vibrant colors that pop against the light blue background.
- − Materials look more like plastic/clay than the requested 'realistic PBR' textures.
- − The lighting is a bit harsh, lacking the 'gentle' quality requested in the prompt.
- − The sushi variety is less detailed and realistic than the competing image.
Verdict: Gemini 3 Pro Image Preview is the winner because it captured the 'realistic PBR materials' and 'soft refined textures' requested in the prompt far more effectively, resulting in a premium miniature look. While Grok Imagine produced a clean and functional isometric graphic, its textures appear flat and plastic-like compared to the intricate detail of the rice and fish in Gemini's output.
Night Sky Transformation
Editing
“Change the scene to night: a deep, dark sky with subtle, glistening stars visible behind the mountain.”
AI Judge Analysis
Nano Banana Pro
- + Perfectly preserves all town geometry and terrain details from the source image
- + Accurately represents the request for a deep, dark sky with glistening stars
- + Maintains the lighting on the landscape consistent with a moonless night
- − Minimal loss of detail in the darkest shadows on the left hillside
Grok Imagine Image
- + Excellent sky detail with realistic star density
- + Good preservation of the town layout
- − Substantially alters the lighting on the Matterhorn peak relative to the source image
- − Slightly shifts some of the lower terrain features compared to the original
Verdict: Gemini 3 Pro is the clear winner as it perfectly preserves the structural integrity and fine details of the town and landscape from the source image while seamlessly swapping the golden hour sky for a night sky. Grok Imagine does a good job with the night atmosphere but makes more changes to the light distribution on the mountain faces and slightly alters some foreground elements.
Over-the-top cartoon caricature
Editing
“Create a caricature of me and my job. Make it exaggerated and humorous, incorporating my profession as a tv show anchor and my love for dogs and hockey.”
AI Judge Analysis
Nano Banana Pro
- + Excellent adherence to the 'exaggerated and humorous' aspect of a caricature.
- + Effectively combines all three requested themes: TV anchor, dogs, and hockey in a busy, fun scene.
- + Maintains character likeness while translating it into a bold, consistent illustrative style.
- − The denim jacket and hockey jersey under-layer feel a bit cluttered compared to a traditional news anchor look.
- − The character's expression is slightly more manic than the original friendly smile.
Grok Imagine Image
- + Captures the 'big head' caricature style perfectly while keeping the face very close to the source photo.
- + High visual quality with clean rendering and professional TV studio lighting.
- + Includes subtle, witty details like 'Sports Scoop' on the notes and a dog on skates.
- − Fewer dogs and less hockey-themed clutter make it feel slightly less 'exaggerated' than Nano Banana Pro's version.
- − The background hockey rink screen is a bit generic.
Verdict: Gemini 3 Pro Image Preview and Grok Imagine both handled the request well, but with different approaches to the 'caricature' style. Gemini 3 Pro Image Preview went for a full illustrative transformation that is highly creative and dense with humorous details, whereas Grok Imagine used a 'big head' 3D-render style that preserved the facial identity of the subject much more accurately while still delivering a professional news anchor aesthetic.
Adorable Baby Animals in Sunny Meadow
Text-to-Image
“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Nano Banana Pro
- + Captures the 'chasing' and 'tumbling' action better with dynamic poses.
- + Excellent fur texture and lighting integration on all animals.
- + Superior background composition with distinct god rays and a lush meadow.
- − The butterfly is slightly too large relative to the animals.
- − The puppy's front right paw is a bit anatomically blurry.
Grok Imagine Image
- + Very cute, stylized character designs with expressive eyes.
- + Strong golden hour lighting effect with high contrast.
- + Clean, vibrant colors in the wildflowers.
- − Animals are mostly static and sitting, failing the 'chasing and tumbling' part of the prompt.
- − The 'butterflies' look more like small indistinct insects or bees.
- − The fox kit's anatomy is a bit unusual, and the kitten's eyes are unnaturally vibrant blue.
Verdict: Gemini 3 Pro Image Preview is the winner as it successfully captured the motion and energy requested in the prompt, showing the animals actually playing and chasing. While Grok Imagine produced a very cute 'family portrait' style image, it missed the core action of the prompt and the animal features appear more AI-generated and less realistic.
Victorian Greenhouse Oasis
Text-to-Image
“Hyper-photorealistic interior of a lush Victorian glass greenhouse filled with exotic tropical plants, vibrant blooming orchids, tall ferns, colorful butterflies in flight, sunlight filtering through ornate glass roof creating realistic caustics and dew on leaves, intricate iron framework visible, misty atmosphere, 8K masterpiece.”
AI Judge Analysis
Nano Banana Pro
- + Excellent depiction of the intricate Victorian ironwork and glass roof architecture.
- + Highly realistic density of tropical foliage, orchids, and ferns.
- + Subtle and professional lighting with a convincing misty atmosphere.
- − The scale of some butterflies feels slightly inconsistent with the depth of the room.
Grok Imagine Image
- + Strong volumetric lighting and god rays that emphasize the sunlight filtering through the roof.
- + Clear, vibrant colors on the butterflies and foreground orchids.
- − The butterflies appear flat and pasted-on rather than integrated into the 3D space.
- − The overall image looks more like a digital illustration than the requested hyper-photorealistic masterpiece.
- − Several butterflies have anatomically strange or merging wings.
Verdict: Gemini 3 Pro Image Preview perfectly captures the 'hyper-photorealistic' and 'Victorian' aspects of the prompt, resulting in a cohesive, atmospheric scene with sophisticated architectural detail. Grok Imagine creates a more vibrant, illustrative look, but it fails on realism, particularly with the butterflies and the lighting integration. Gemini's superior handling of texture, mist, and botanical variety makes it the clear winner.
Heroic Super Hero Portrait
Text-to-Image
“Hyper-photorealistic full-body portrait of a female superhero standing triumphantly on a New York skyscraper rooftop at golden sunset, wearing a classic modest superhero costume with flowing cape, chest emblem, gloves, and boots in red and blue colors, practical design, short hair, strong determined heroic expression looking into the distance, powerful confident stance with hands on hips and cape billowing dramatically in the wind, detailed urban cityscape background, warm natural sunlight with sharp shadows and fabric highlights, ultra-sharp textures on suit, hair, and concrete, 8K masterpiece, empowering family-friendly style.”
AI Judge Analysis
Nano Banana Pro
- + Excellent photographic quality with natural skin textures and cloth physics.
- + Exceptional background detail showing a realistic New York skyline with depth.
- + Subtle and sophisticated lighting that realistically integrates the character into the scene.
- − Missed the 'hands on hips' instruction, opting for a relaxed arm posture.
- − The character's expression is more contemplative than 'strong and determined'.
Grok Imagine Image
- + Successfully captured the 'hands on hips' pose requested in the prompt.
- + Dramatic, high-contrast lighting that emphasizes the triumphant theme.
- + Strong adherence to the 'billowing cape' and profile-view heroic expression.
- − The cityscape looks somewhat generic and flat compared to Nano Banana Pro's.
- − The costume emblem is a derivative of the Superman 'S', which feels less original.
- − The lighting on the character feels a bit like a studio light rather than natural sun.
Verdict: Gemini 3 Pro Image Preview provides a much more realistic and detailed environment, with superior texture and lighting that feels truly photorealistic. However, Grok Imagine followed the specific posing instructions (hands on hips) much better and captured a more conventionally 'heroic' silhouette despite the simpler background.
Studio Ghibli Anime Style
Editing
“Transform this photo into a Studio Ghibli–inspired illustration. Use soft pastel colors, hand-painted textures, gentle lighting, dreamy backgrounds, and a warm, nostalgic mood”
AI Judge Analysis
Nano Banana Pro
- + Captures the distinctive 'Ghibli' line art style more accurately, especially in the hair and facial features.
- + Successfully interprets the background into a charming, hand-painted European street with a tram.
- + Preserves the original poses and relative positions of the characters with high fidelity.
- − The colors are slightly more saturated than the requested 'soft pastel' palette.
- − The facial expressions are a bit softened, losing some of the sharp 'indignant' energy of the woman on the right.
Grok Imagine Image
- + Excellent use of soft pastel colors and painterly textures, particularly in the sky and background buildings.
- + Maintains a strong sense of the original image's character expressions while translating them to anime style.
- + Achieves a warm, nostalgic mood through lighting and color grading.
- − The hand and hair of the woman on the right are somewhat simplified and less defined than the source.
- − The background elements are a bit more cluttered compared to the clean layout of the source image.
Verdict: Both models performed exceptionally well at translating a famous meme into a specific art style while maintaining character recognition. Gemini 3 Pro Image Preview captures the specific Ghibli character design (line weight and eye shape) more effectively, whereas Grok Imagine excels at the requested pastel color palette and hand-painted texture. Gemini 3 Pro is preferred for a more polished and structurally accurate transformation.
Neutral Expression to Genuine Smile
Editing
{
"action": "image_edit",
"reference": "uploaded neutral portrait",
"change": "Warm genuine Duchenne smile: lips curved up, slight natural teeth, soft eye crinkles, subtle cheek raise",
"details": "Realistic smiling skin (dimples if present, soft cheek shadows), slightly brighter eyes; keep exact eye shape/color/iris",
"preserve_exact": "Face identity/structure, eyes/nose/lips/eyebrows, hair, skin texture/pores/freckles, makeup, clothing, head pose, background, lighting, shadows, framing",
"no_changes": "No face shape change, no new features, no gaze shift, no hair/clothing/lighting/background edits",
"style": "Ultra-photorealistic 8K portrait, sharp face focus, natural soft lighting, realistic skin glow"
}
AI Judge Analysis
Nano Banana Pro
- + Perfect preservation of original hair, clothing, and background.
- + Highly accurate eye crinkles and facial muscle movement for a genuine Duchenne smile.
- + Maintains original skin texture, freckles, and moles with zero degradation.
Grok Imagine Image
- + Successfully applies a realistic smile with natural tooth exposure.
- + Preserves the overall identity and composition of the subject well.
- − Slightly softens the skin texture compared to the source, losing some fine detail.
- − The eye crinkles are less pronounced than requested for a full Duchenne smile.
- − Minor smoothing of the hair texture compared to the original.
Verdict: Gemini 3 Pro Image Preview is the winner because it achieved a perfect 1:1 preservation of all non-edited pixels (hair, background, skin freckles) while flawlessly executing the facial expression change. Grok Imagine Image produced a high-quality edit, but it introduced subtle smoothing to the skin and hair, so it followed the 'preserve_exact' instructions less precisely than Gemini.
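The prompt in this challenge is a JSON object rather than free text. As a rough sketch of how such a prompt could be assembled and sanity-checked before being handed to a model, the Python below builds an abbreviated version of the same object. The helper name `build_edit_prompt` and the required-key check are illustrative assumptions, not part of either model's API; the field names come from the prompt above, with values shortened.

```python
import json

# Keys used by the structured prompt in this challenge.
REQUIRED_KEYS = ("action", "reference", "change", "preserve_exact", "no_changes")


def build_edit_prompt(fields: dict[str, str]) -> str:
    """Serialize an edit request to JSON text (hypothetical helper).

    Raises ValueError if any of the expected keys is missing.
    """
    missing = [key for key in REQUIRED_KEYS if key not in fields]
    if missing:
        raise ValueError(f"missing prompt fields: {missing}")
    return json.dumps(fields, indent=2, ensure_ascii=False)


# Values are abbreviated from the challenge prompt for readability.
prompt_text = build_edit_prompt({
    "action": "image_edit",
    "reference": "uploaded neutral portrait",
    "change": "Warm genuine Duchenne smile",
    "preserve_exact": "Face identity/structure, hair, background, lighting",
    "no_changes": "No face shape change, no gaze shift",
})
print(prompt_text)
```

Keeping the "preserve" and "no changes" constraints in separate fields makes it straightforward to compare a model's output against each constraint, which is how the verdict above distinguishes the two edits.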
Golden Hour Stroll
Editing
“Add dynamic motion to this photo: make hair blow in the wind, add leaves flying, energetic and lively feel.”
AI Judge Analysis
Nano Banana Pro
- + Excellent hair physics that realistically depicts hair blowing in the wind.
- + Very high source preservation, maintaining the original face, dog, and background details exactly.
- + The leaves are varied in color and size, adding to the dynamic feel.
- − Some of the added leaves appear a bit sharp and lack motion blur.
Grok Imagine Image
- + Successfully added hair movement and many flying leaves.
- + Good overall preservation of the main subjects and layout.
- − The hair edit is less natural than Nano Banana Pro's, with some strands appearing stiff.
- − The leaves are very repetitive in shape and color, looking like a digital overlay.
- − Subtle changes to the dog's face and fur texture compared to the source.
Verdict: Gemini 3 Pro Image Preview is the winner because it provides a more realistic interpretation of 'hair blowing in the wind' and maintains better source preservation. While Grok Imagine Image adds more leaves, they feel like repetitive clip-art overlays, whereas Gemini's edits feel integrated into the physics of the scene.
Vintage Cafe Logo
Text-to-Image
“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Nano Banana Pro
- + Perfect text rendering for both the name and the banner
- + Beautiful hand-drawn vintage engraving style with cross-hatching
- + Single, clean banner placement following the prompt instructions
- − The 'steam' swirls are slightly more illustrative than a classic minimalist logo
Grok Imagine Image
- + Strong minimalist vector aesthetic
- + Clean, high-contrast color palette
- − Repeats 'Est. 1720' twice, creating redundant information
- − The cloche is oddly merged with a spoon and cup handle, creating a confused silhouette
- − The 'steam' is stylistically inconsistent with the rest of the vector art
Verdict: Gemini 3 Pro Image Preview delivered a much more coherent design that strictly followed all prompt instructions without adding redundant information. It captured the 'vintage' and 'banner' elements perfectly, whereas Grok Imagine had logical issues with the graphic elements (merging the cloche into other kitchenware) and showed the establishment date twice.
Apollo 11: Journey to Tranquility
Text-to-Image
“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
Nano Banana Pro
- + Excellent typography with nearly perfect spelling across all labels.
- + Cohesive and logical visual flow using a continuous trajectory line.
- + Professional vector aesthetic that matches the 'modern infographic' request perfectly.
- − Small gibberish text on the trajectory line itself.
- − The lunar module in 'Descent' is missing its descent stage (the gold base).
Grok Imagine Image
- + Accurately depicts the NASA logo and color palette.
- + Includes distinct icons for all requested steps.
- + Good use of flat vector style with clean shapes.
- − Numerous spelling errors including '3rajoory', 'Transluioory', and 'Moom'.
- − Disjointed composition with icons scattered rather than showing a procedural flow.
- − The trajectory arc for 'Translunar' is a literal archway shape rather than a space flight path.
Verdict: Gemini 3 Pro Image Preview provides a far superior infographic with a logical visual flow and high-quality typography that makes it actually functional. Grok Imagine captures the flat vector style and the NASA palette well, but it suffers from severe spelling errors and a disjointed layout that fails to convey the mission steps as a sequence.
Nano Banana Pro
Gemini 3 Pro with image generation capabilities, referred to as Gemini 3 Pro Image Preview in the verdicts above. Combines advanced reasoning with the ability to generate and edit images.
Grok Imagine Image
An image generation model by xAI designed to generate highly aesthetic images from text descriptions.