Nano Banana 2 vs Grok Imagine Image Pro

Head-to-head across 16 challenges

Nano Banana 2

70.6%

win rate

Ties

5.9%

Grok Imagine Image Pro

23.5%

win rate
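
The page does not say how these rates are derived. As a minimal sketch, assuming each rate is simply one outcome's share of all recorded comparisons, the arithmetic could look like the following. The function name `rates` and the tally (12 wins, 1 tie, 4 losses) are hypothetical: that tally happens to reproduce the published figures, but it is an inference rather than something the page states, and it does not match the 16 challenges listed, so the real basis may differ (for example, individual votes rather than challenges).

```python
def rates(wins_a, ties, wins_b):
    """Return (A win %, tie %, B win %) rounded to one decimal place."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no comparisons recorded")
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))

# Hypothetical tally, not taken from the page.
print(rates(12, 1, 4))  # (70.6, 5.9, 23.5)
```

Because each share is rounded independently, the three percentages need not sum to exactly 100 for other tallies.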


Challenge Results

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Excellent text rendering on the book spine.
  • + Highly realistic glass transparency and perspective.
  • + Matches all prompt elements including the specific lighting direction.
  • The glass cube is more of a hollow frame/aquarium shape than a solid cube.

Grok Imagine Image Pro

  • + Captures the 'glass cube' aesthetic well with thick glass walls.
  • + Clean, minimalistic composition.
  • + Good depth and blur on the background plant.
  • Includes a strange duplicate reflection/half-sphere on the right side of the cube.
  • The sphere has a matte texture rather than the more common glass/marble look.
  • Perspective of the book on the cube is slightly warped at the front edge.

Verdict: Nano Banana 2 produces a significantly more realistic image with impressive text rendering and natural-looking lighting. Grok Imagine Image Pro suffers from a strange visual artifact where a second partial blue sphere appears inside the glass, detracting from its overall quality.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Excellent photorealistic texture and lighting that feels like a real film photograph.
  • + Very strong adherence to the 'reflections on wet pavement' and 'candid street photo' feel with vibrant neon signs.
  • + High technical accuracy in showing the bike chain and tools on the ground.
  • The man's right hand holding the wrench is anatomically jumbled and merged with the tool.

Grok Imagine Image Pro

  • + Good inclusion of light rain streaks and motion blur on the background car lights.
  • + Clear composition with a nice shallow depth of field.
  • The background and pavement look a bit more CGI/rendered compared to the grit of Nano Banana 2.
  • The bike's kickstand and parts of the frame appear somewhat unnatural or floating.
  • The skin texture on the man's face is slightly too smooth and lacks 'natural skin texture' detail.

Verdict: Nano Banana 2 produces a significantly more realistic and cinematic image that stays true to the gritty, detailed aesthetic of Japanese street photography. While it has a slight anatomical error in the hand, its environmental textures and lighting are far superior to Grok Imagine Image Pro, which looks somewhat cleaner and more digital.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

Nano Banana 2
Grok Imagine Image Pro
0% wins 0% ties 100% wins

AI Judge Analysis

Nano Banana 2

  • + Excellent adherence to the 'battle-worn' descriptor with heavy dirt and sweat texture on the face.
  • + Highly detailed engraving and rune work on the plate armor.
  • + The inclusion of the sword hilt and hand adds to the paladin warrior narrative.
  • The hand gripping the sword has some anatomical awkwardness in the finger proportions and lighting.
  • The background torchlight is a bit blown out compared to the rest of the scene.

Grok Imagine Image Pro

  • + Superb text rendering on the gorget ('Lux in tenebris') which fits the paladin theme perfectly.
  • + Very clean and symmetrical braid/bead work that clearly follows the prompt.
  • + Exceptional skin texture and lifelike eyes with realistic catchlights.
  • The character looks more 'gritty fashion' than truly 'battle-worn' compared to the heavier weathering in the other image.
  • The composition is slightly more static and centered.

Verdict: Both models followed the prompt exceptionally well, but Grok Imagine Image Pro wins due to the superior clarity of its textures and the impressive inclusion of legible thematic text on the armor. While Nano Banana 2 captures a more intense 'battle-worn' atmosphere with grittier skin textures, Grok Imagine Image Pro offers better overall visual coherence and more precise detail in the hair beads and engraving.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

Nano Banana 2
Grok Imagine Image Pro
80% wins 0% ties 20% wins

AI Judge Analysis

Nano Banana 2

  • + Complete menu functional design with text, prices, and descriptions.
  • + Excellent adherence to the requested sections (Appetizers, Pizza, Mains).
  • + Highly professional layout that looks like a real-world restaurant asset.
  • Small text descriptions contain some minor spelling artifacts.
  • The grid of food photos is slightly less uniform in lighting than Grok Imagine Image Pro's.

Grok Imagine Image Pro

  • + Extremely clean and consistent food photography in a perfect 3x3 grid.
  • + High visual clarity and vibrant colors.
  • + Strictly minimalist aesthetic.
  • Fails to include actual menu content like item names, descriptions, or prices.
  • The layout is more of a mood board or category header than a functional restaurant menu.

Verdict: Nano Banana 2 produces a fully realized, professional restaurant menu including typography, branding, and pricing, which makes it much more useful for the prompt's intent. Grok Imagine Image Pro creates a beautiful grid of food photos, but fails to provide the textual components of a menu design, leaving it looking like an incomplete template.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Excellent background atmosphere and depth of field
  • + Strong chalk texture realism with smudges and dust
  • + Perfect text accuracy and spelling
  • The mix of different script styles makes it look like it was written by multiple people rather than one consistent hand as requested

Grok Imagine Image Pro

  • + More consistent handwriting style across the whole board
  • + Excellent chalk texture on the individual letters
  • + Perfect text accuracy and spelling
  • Simple, flat composition compared to the other model
  • The board appears slightly crooked or distorted at the edges

Verdict: Both models followed the complex text instructions perfectly, which is impressive. Nano Banana 2 has superior visual quality and environmental storytelling with its cozy café background, while Grok Imagine Image Pro adhered better to the request for a consistent handwriting style throughout the board.

Pose & Character Mashup

Editing
Edit instruction

“Use Image 1 as the exact pose reference and Image 2 as the character reference. Recreate the person/character from Image 2 in the exact dynamic pose and body position from Image 1. Keep the exact face, hair, clothing style/details, and expression from Image 2. Match the lighting and environment of Image 1. The final image must show the character from Image 2 performing the precise action/pose from Image 1 with perfect anatomy and natural integration.”

Source
Nano Banana 2
Grok Imagine Image Pro
0% wins 100% ties 0% wins

AI Judge Analysis

Nano Banana 2

  • + Excellent character preservation including face, sunglasses, scarf, and clothing details
  • + Faithfully recreates the yellow background and red ottoman from Image 1
  • + Successfully adapts the character's attire to the dynamic movement
  • Anatomical failure with a foot protruding from the chest area
  • The hand placement on the ottoman does not match the original foot placement requested for the 'exact' pose
  • Physical weight and balance of the figure appear floating rather than grounded

Grok Imagine Image Pro

  • + Very high visual clarity and cleanliness
  • + Perfect anatomical structure and balance
  • + Maintains the exact pose, lighting, and composition of the source image
  • Complete failure to use the character from Image 2
  • Ignores the request for the sunglasses, scarf, and male character
  • Simply replaces the original woman with a slightly different woman instead of the requested subject

Verdict: Nano Banana 2 attempted the complex task of merging the character from Image 2 into the pose of Image 1, succeeding in character likeness but failing significantly on anatomy with a floating foot and broken proportions. Grok Imagine Image Pro completely ignored the character reference, essentially reproducing the source pose with a generic subject. Nano Banana 2 is the winner because it actually followed the core instruction of the multi-image edit, despite the anatomical artifacts.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Excellent adherence to the 'horse riding astronaut' concept with the horse physically positioned on the astronaut's back.
  • + High-quality textures on both the horse muscle definition and the detailed space suit.
  • + Creative addition of a cosmic saddle and reins held by the astronaut.
  • The horse's front hoof is clipping awkwardly into the astronaut's helmet visor.
  • The horse has an extra leg appearing near the rear.

Grok Imagine Image Pro

  • + Successfully positions the horse on top of the astronaut as requested.
  • + Vibrant color palette with a striking planetary background.
  • + Clean composition with good separation between the subjects and the nebulas.
  • The horse appears to be floating just above the astronaut rather than 'riding' him.
  • The horse's anatomy is slightly distorted, specifically where the hind legs meet the torso.

Verdict: Both models successfully followed the counter-intuitive instruction of placing the horse on top of the astronaut. Nano Banana 2 provided a more literal interpretation of 'riding' by including a saddle and reins, though it suffered from anatomical glitches like an extra leg. Grok Imagine Image Pro produced a cleaner, more cinematic aesthetic but the horse and astronaut feel less physically connected.

Outfit Transfer Challenge

Editing
Edit instruction

“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”

Source
Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Perfectly replicates the specific clothing, scarf, and accessories from Image 2.
  • + Maintains the subject's vitiligo skin patterns correctly across the hands and face.
  • + Successfully preserves the background and the subject's identity while integrating the new clothes naturally.
  • The sunglasses are added slightly awkwardly over the sand on the face.

Grok Imagine Image Pro

  • + Maintains the face and background of Image 1 accurately.
  • Completely fails to use the clothing from Image 2, instead generating a generic regal outfit.
  • Does not preserve the vitiligo pattern on the hands, which now appear as a single skin tone.
  • The lighting on the body does not match the harsh natural light of the beach background.

Verdict: Nano Banana 2 followed the instructions perfectly, successfully transplanting the specific outfit, scarf, and sunglasses from Image 2 onto the person in Image 1 while maintaining their unique skin patterns. Grok Imagine Image Pro completely ignored the visual reference of the clothing, generating an unrelated 'elaborate' costume and failing to preserve the subject's skin details on their hands.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Excellent photorealistic lighting and textures inside the taxi cab.
  • + Very detailed city backgrounds with recognizable locations like Radio City.
  • + High resolution rendering of the capybara's fur and clothing.
  • Failed to include the human businesswoman in the back seat as requested.
  • The capybara's paws are merged awkwardly with the steering wheel.

Grok Imagine Image Pro

  • + Successfully included all prompt elements including the bored businesswoman and the capybara driver.
  • + Great composition that clearly shows the dynamic between the driver and the passenger.
  • + Text on the hat is legible and contextually accurate (NYC TLC Medallion).
  • The capybara's paws look more like talons/claws than actual paws.
  • The lighting on the businesswoman is slightly flat compared to the driver.

Verdict: While Nano Banana 2 has slightly superior texture work and background detail, it failed a major part of the prompt by omitting the passenger. Grok Imagine Image Pro followed all instructions perfectly, creating a humorous and well-composed scene that captured the 'bored expression' of the businesswoman sitting behind the capybara.

Bald man challenge

Image Editing
Edit instruction

“Give the person a full, thick head of natural hair with realistic texture, density, and a natural hairline. Preserve facial features and lighting.”

Nano Banana 2
Grok Imagine Image Pro
75% wins 0% ties 25% wins

AI Judge Analysis

Nano Banana 2

  • + Seamless integration of hair with the existing sideburns and beard
  • + Excellent preservation of the original head shape and facial features
  • + Very realistic, rough texture that matches the character's aesthetic
  • The hairline is slightly high, though physically plausible

Grok Imagine Image Pro

  • + Natural-looking hair texture and logical growth direction
  • + Good preservation of the original environment and clothing
  • Noticeable distortion of the skull shape, making the forehead appear slightly indented
  • The hair placement feels a bit like a 'topper' rather than a natural extension of the scalp

Verdict: Both models did an excellent job of matching the lighting and texture of the original image. Nano Banana 2 is the winner because it maintained the correct anatomical structure of the subject's head, whereas Grok Imagine Image Pro slightly flattened the top of the forehead, creating a less natural transition.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

Nano Banana 2
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

Nano Banana 2

  • + Excellent text rendering with 'JAPAN' and 'SUSHI' clearly legible and well-positioned.
  • + Highly detailed and realistic sushi variety, including accurate textures for roe, eel, and sashimi.
  • + Beautiful diorama base featuring moss, rocks, and a secondary wooden platter that adds depth.
  • Leans more towards realism than the requested '3D cartoon' aesthetic.
  • The flag icon is placed to the left of 'SUSHI' rather than below it as implied by standard layout hierarchy.

Grok Imagine Image Pro

  • + Perfectly captures the '3D cartoon' style with soft, rounded, and playful textures.
  • + Minimalist and ultra-clean composition that adheres strictly to the 'minimal garnish' request.
  • + Center-aligned text layout including the flag icon is very balanced.
  • The sushi pieces are repetitive and lack the intricate variety seen in the other model.
  • The wooden base texture is somewhat simple compared to the detailed diorama requested.

Verdict: Nano Banana 2 produces a stunningly detailed diorama with high-quality PBR materials and perfect text, though it leans more towards realism. Grok Imagine Image Pro better captures the '3D cartoon' aesthetic with soft, clean shapes, but lacks the intricate detail and variety provided by the former. Nano Banana 2 is the preferred choice for its superior visual complexity and professional finish while still meeting all text and layout requirements.

Over-the-top cartoon caricature

Editing
Edit instruction

“Create a caricature of me and my job. Make it exaggerated and humorous, incorporating my profession as a tv show anchor and my love for dogs and hockey.”

Source
Nano Banana 2
Grok Imagine Image Pro

AI Judge Analysis

Nano Banana 2

  • + Excellent caricature style with hand-drawn colored pencil textures.
  • + Clever wordplay in the text including 'W-K9 NEWS' and 'Anchor's Puck Drop'.
  • + Captures the subject's likeness effectively within the stylized caricature.
  • The hand holding the physical card is an unnecessary meta-element.
  • A few minor floating microphone artifacts in the background.

Grok Imagine Image Pro

  • + Strong incorporation of all elements including a hockey trophy and multiple dogs.
  • + The facial caricature is very exaggerated and humorous as requested.
  • + Clean, professional digital illustration style with good text rendering.
  • The likeness is slightly more generic than Nano Banana 2's.
  • Some repetitive elements like the grid of identical golden retriever puppies.

Verdict: Both models followed the instructions perfectly, creating humorous caricatures that blend the subject's career and hobbies. Nano Banana 2 stands out for its creative wordplay and more traditional artistic texture, while Grok Imagine Image Pro provides a more polished digital look with a wider variety of background details.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

Nano Banana 2
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

Nano Banana 2

  • + Perfect adherence to the animal count, featuring exactly one of each requested species.
  • + Dynamic and realistic movement with a sense of playful chasing across the meadow.
  • + Excellent lighting effects with subtle god rays and morning dew consistent with the prompt.
  • The butterfly on the dog's tail is a bit static and looks slightly pasted on.

Grok Imagine Image Pro

  • + Very expressive and cute facial expressions on the puppy and fox.
  • + Vibrant colors with high contrast in the wildflower patches.
  • + Includes more butterflies to enhance the 'chasing' narrative.
  • Failed the prompt count by including two tabby kittens instead of one.
  • The fox's anatomy is slightly awkward in the 'tumbling' pose.
  • The golden retriever puppy appears much larger relative to the others than a newborn would be.

Verdict: Nano Banana 2 followed the prompt's specific animal list perfectly and captured a more natural sense of movement and '8K' clarity. Grok Imagine Image Pro produced a very charming image but failed on the prompt instructions by adding an extra kitten and had less realistic scaling between the animals.

Studio Ghibli Anime Style

Editing
Edit instruction

“Transform this photo into a Studio Ghibli–inspired illustration. Use soft pastel colors, hand-painted textures, gentle lighting, dreamy backgrounds, and a warm, nostalgic mood”

Source
Nano Banana 2
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

Nano Banana 2

  • + Perfectly captures the Studio Ghibli art style with clean line work and watercolor-esque textures.
  • + Maintains the composition and character poses of the original meme extremely well.
  • + Enhances the background with charming European-style details like wisteria and flower boxes typical of the requested aesthetic.
  • The man's facial expression is slightly more 'worried' than 'distracted' compared to the original.

Grok Imagine Image Pro

  • + Excellent preservation of the original characters' facial features and expressions.
  • + Provides a soft, painterly texture that aligns with the prompt.
  • The background is very blurry and lacks the detailed 'dreamy' world-building characteristic of Ghibli films.
  • The color palette feels slightly washed out compared to the vibrant but soft palette expected.

Verdict: Nano Banana 2 is the clear winner as it fully commits to the Studio Ghibli style, transforming the background into a lush, hand-painted environment while maintaining the meme's structure. Grok Imagine Image Pro applies a nice filter-like effect to the characters but fails to provide the rich background detail that defines the requested art style.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

Nano Banana 2
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

Nano Banana 2

  • + Perfect text rendering of the name and 'EST. 1720'
  • + Elegant woodcut-style illustration of the cloche and steam
  • + Strongest adherence to the 'banner' and 'vintage' aesthetic tags
  • The 'è' in Caffè is slightly disconnected and aggressively stylized
  • Border details are a bit busy for a truly 'minimalist' logo

Grok Imagine Image Pro

  • + Clean vector lines following the minimalist requirement
  • + Accurate text and date rendering
  • + Clear, simple composition
  • Lacks the requested 'banner' for the establishment date
  • Steam element looks like a single generic squiggle
  • Very plain compared to the requested vintage style

Verdict: Nano Banana 2 followed the prompt's specific details much better, particularly the inclusion of the 'EST. 1720' banner and the vintage texture. While Grok Imagine Image Pro is more 'minimalist', it failed to include the requested banner and the illustration lacks the historical character implied by the prompt, whereas Nano Banana 2 produced a high-quality, cohesive emblem.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

Nano Banana 2
Grok Imagine Image Pro
0% wins 0% ties 100% wins

AI Judge Analysis

Nano Banana 2

  • + Excellent graphic design and composition, balancing icons and text effectively.
  • + Perfect adherence to the requested NASA-inspired color palette and flat vector style.
  • + Accurate text rendering for the mission names and crew members.
  • Includes a fourth unidentified crew member icon, which is historically incorrect for Apollo 11.

Grok Imagine Image Pro

  • + Clean, vertical timeline structure that is logical for a mission infographic.
  • + Correct number of crew members listed with full names.
  • + Minimalist aesthetic that follows the flat vector prompt well.
  • The 'Descent' icon includes a fiery engine plume which contradicts the 'flat vector' and 'clean icon' aesthetic compared to the others.
  • Text is slightly small and harder to read at the bottom.

Verdict: Nano Banana 2 produces a more aesthetically pleasing and professional-looking infographic with superior icon design and layout. While Grok Imagine Image Pro is more historically accurate regarding the crew count, Nano Banana 2's visual clarity and faithful adherence to the modern vector style make it the more successful image for a poster design.

Nano Banana 2

Gemini 3.1 Flash with image generation capabilities. High-efficiency image generation model with support for text rendering, reference images, search grounding, and thinking mode. The efficient counterpart to Gemini 3 Pro Image.

Grok Imagine Image Pro

xAI's premium image generation model, offering higher-fidelity output and stronger performance on single-image editing benchmarks than the standard Grok Imagine model.