Nano Banana vs Grok Imagine Image Pro
Head-to-head across 16 challenges
Nano Banana: 41.2% win rate
Ties: 5.9%
Grok Imagine Image Pro: 52.9% win rate
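The three percentages above presumably express each outcome as a share of all judged comparisons. Below is a minimal Python sketch of that arithmetic. The page does not publish the underlying counts, so the tallies used here (7, 1, 9) are hypothetical, chosen only because they round to the figures shown. `win_rates` is an illustrative helper name, not part of any tool used for this comparison.

```python
def win_rates(wins_a: int, ties: int, wins_b: int) -> tuple[float, float, float]:
    """Return (model A %, tie %, model B %), each rounded to one decimal place."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("at least one judged comparison is required")
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))


# Hypothetical tallies (an assumption, not data from the page).
nano_banana, tie, grok = win_rates(wins_a=7, ties=1, wins_b=9)
print(f"Nano Banana {nano_banana}% | Ties {tie}% | Grok Imagine Image Pro {grok}%")
```

Rounding each share independently means the three values may not always sum to exactly 100.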
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Nano Banana
- + Expertly handles the 'partially visible through the glass' instruction with realistic refraction.
- + Superior lighting and atmosphere with dust motes and soft directional window light.
- + Very high visual quality with a photorealistic, shallow depth-of-field look.
- − The blue sphere is floating inexplicably in the center of the cube, which might seem physically improbable.
- − The book texture is slightly repetitive compared to a real book.
Grok Imagine Image Pro
- + The placement of the sphere at the bottom of the cube follows natural gravity.
- + Clear adherence to all spatial requirements of the prompt.
- + The wood texture of the table is very detailed and realistic.
- − The glass cube looks more like an open glass box or container rather than a solid or sealed cube.
- − The plant is behind the cube but doesn't show much distortion or 'visibility through the glass' as requested.
- − The lighting is somewhat flat compared to the requested soft window light.
Verdict: Nano Banana wins because it brilliantly captures the complex prompt requirement of the plant being partially visible through the glass with realistic refraction and beautiful atmospheric lighting. While Grok Imagine Image Pro follows the spatial instructions well, its glass rendering is less convincing and the lighting lacks the depth and mood found in the first image.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Nano Banana
- + Excellent atmosphere and depth of field in the café background.
- + Highly realistic chalk dust and smudge textures on the board surface.
- + Very legible and elegant cursive handwriting that fits the prompt.
- − The cursive lettering looks slightly digital and overly uniform despite the prompt's request for natural variation.
- − The bottom text line is very small and difficult to read.
Grok Imagine Image Pro
- + Perfect text accuracy, including completing the 'Brown Butter' item correctly.
- + Exceptional chalk texture on the strokes, showing the grain of the chalk.
- + Text follows the request for slight slant and natural variations in size.
- − The 'TODAY'S SPECIALS' title is in a blocky print style rather than the requested elegant cursive.
- − The background is less detailed and lacks the cozy café atmosphere seen in the other image.
Verdict: Grok Imagine Image Pro wins on prompt adherence by correctly completing the truncated menu item 'Brown Butter Chocolate Chip Cookies' and providing a more convincing hand-drawn chalk texture. Nano Banana creates a more visually stunning and atmospheric 'cozy café' scene, but the lettering feels more like a digital font overlay compared to the authentic chalk feel of Grok Imagine Image Pro.
Pose & Character Mashup
Editing: “Use Image 1 as the exact pose reference and Image 2 as the character reference. Recreate the person/character from Image 2 in the exact dynamic pose and body position from Image 1. Keep the exact face, hair, clothing style/details, and expression from Image 2. Match the lighting and environment of Image 1. The final image must show the character from Image 2 performing the precise action/pose from Image 1 with perfect anatomy and natural integration.”
AI Judge Analysis
Nano Banana
- + Successfully replicates the clothing and accessories from image 2 including the black sweatshirt, scarf, and sunglasses.
- + Maintains the exact pose and environment from image 1.
- + Seamlessly blends the character elements with the source lighting.
- − Failed to swap the gender/character identity, keeping the female subject's face, hair, and legs from Image 1.
- − The sunglasses are distorted and poorly fitted to the face.
Grok Imagine Image Pro
- + Perfectly maintains the background, pose, and red hoodie from Image 1.
- + Good anatomical consistency with the source image.
- − Completely ignored the request to use Image 2 as a character reference.
- − Did not change the face, hairstyle, or clothing to match the person in Image 2.
Verdict: Both models failed significantly on the core task of character replacement. Nano Banana attempted to incorporate the clothing and accessories from Image 2 (scarf, sunglasses, black shirt) onto the body in Image 1, but failed to replace the actual person. Grok Imagine Image Pro essentially provided a slightly higher quality recreation of Image 1 while ignoring the instructions to use Image 2's character details entirely. Nano Banana is the winner for at least attempting to follow the image-to-image style and accessory transfer.
Outfit Transfer Challenge
Editing: “Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”
AI Judge Analysis
Nano Banana
- + Perfectly followed the instruction to transfer the exact outfit from Image 2.
- + Flawlessly preserved the original person's face, hair, and the background.
- + Integrated the clothing with realistic lighting and correct proportions relative to the source person.
- − Slightly smoothed the skin texture on the man's neck where the collar begins.
Grok Imagine Image Pro
- + Preserved the original face and background accurately.
- − Completely failed to use the outfit from Image 2, instead generating an unrelated royal costume.
- − The skin tone and texture of the hands do not match the subject's vitiligo or natural tone.
- − The jewelry and belt details contain some AI-typical warping.
Verdict: Nano Banana followed the instructions with high precision, successfully transferring the specific navy coat, plaid scarf, jeans, and accessories from Image 2 onto the target person while maintaining the integrity of the original photo. In contrast, Grok Imagine Image Pro completely ignored the reference image for the clothing, substituting it with a generic 'elaborate' royal outfit and failed to maintain consistent skin characteristics for the hands. Nano Banana is the clear winner for its superior prompt adherence and technical execution.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Nano Banana
- + Successfully includes text descriptions and prices for menu items.
- + Features a clean, centered layout with vibrant colored borders for photo frames.
- + Includes a variety of food items that match the requested categories.
- − Contains significant spelling errors like 'APPEITIERS', 'Brusuechta', and 'Margheitte'.
- − The food images are quite small relative to the overall page size.
- − The text becomes nonsensical in descriptions, such as 'sanilla bean ice cream' for a 'NY Strip' steak.
Grok Imagine Image Pro
- + Excellent photographic quality with consistent lighting and high resolution.
- + Perfectly organized grid layout that aligns with the requested sections (Appetizers, Pizza, Mains).
- + Accurate spelling for all section headers and clean, bold sans-serif typography.
- − Missing item names and prices which were implied by 'restaurant menu design'.
- − The white background is very stark with no secondary accents or footer elements.
Verdict: Grok Imagine Image Pro produces a much more professional and visually appealing layout, with high-quality photography and perfect spelling of section headers. Nano Banana attempts a more complete menu with descriptions and prices, but fails significantly due to numerous typos and nonsensical text pairings. Grok's clean execution of the grid and food imagery makes it the superior design choice for a professional context.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
Nano Banana
- + Excellent cinematic lighting and texture on the capybara fur.
- + The side-profile composition feels very natural for a street photography style.
- + Captures the bored, disinterested expression of the passenger perfectly.
- − The passenger is clearly in the front seat next to the driver, failing the 'back seat' part of the prompt.
- − The steering wheel appears to be merging into the dashboard at an odd angle.
Grok Imagine Image Pro
- + Highly detailed text rendering on the taxi driver cap.
- + The capybara's forward-facing position and paws on the wheel are very clear.
- + Correct inclusion of a trench coat on the businesswoman.
- − The passenger appears to be sitting in the front passenger seat rather than the back seat as requested.
- − The interior layout of the taxi feels slightly cramped and distorted.
Verdict: Both models failed to place the passenger in the back seat, putting her in the front passenger seat instead. Grok Imagine Image Pro wins on the superior detail in the capybara's wardrobe (specifically the legible text on the cap) and its more literal handling of the 'professional expression' and steering wheel placement. Nano Banana has slightly more realistic photographic lighting but messy geometry on the car controls.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Nano Banana
- + Excellent cinematic lighting and atmosphere with a strong wet-pavement aesthetic.
- + Very high level of detail in the character's facial textures and clothing.
- + Bicycle mechanics and the newspaper/tool spread add a great sense of realism and narrative depth.
- − The bicycle components are slightly warped, particularly the alignment of the handlebars and the pedal placement.
- − The lighting feels a bit more 'curated' than the 'no stylization' request, though still realistic.
Grok Imagine Image Pro
- + Captures the 'candid' and 'imperfect framing' prompt effectively with a more documentary-style feel.
- + Natural looking motion blur on the background vehicles.
- + Good adherence to the 50mm shallow depth of field request.
- − There are significant anatomical issues with the man's hands handling the wrench.
- − The bicycle frame has structural inconsistencies where the rear stay connects to the seat post.
- − Overall image clarity is lower than Nano Banana's, appearing slightly soft.
Verdict: Nano Banana creates a more visually compelling and detailed image with superior atmosphere and skin textures, though it feels slightly more stylized than a raw candid. Grok Imagine Image Pro captures the documentary 'candid' framing better, but suffers from significant anatomical errors in the hands and technical inconsistencies in the bike's structure. Nano Banana is the winner due to its significantly higher overall image quality and better execution of the fine details.
Fantasy Warrior
Text-to-Image: “Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
Nano Banana
- + Excellent rendition of a 'battle-worn' appearance with realistic facial aging and weathered skin.
- + Highly ornate armor engravings with a beautiful eagle motif and realistic metal patina.
- + Very clean background bokeh with distinct torch flames and soft sparks.
- − Missing the specific 'small beads' in the hair braids requested in the prompt.
- − Side-profile gaze makes it feel more like a posed character portrait than a cinematic close-up.
Grok Imagine Image Pro
- + Perfect adherence to the 'small beads' prompt, including decorative bone-like beads throughout the hair.
- + Impressive text rendering on the gorget ('Lux in tenebris') which adds to the paladin theme.
- + Intense, lifelike eyes and more noticeable dirt/smudges on the face per the prompt.
- − The sparks in the air look a bit uniform and digitally added compared to the softer bokeh in Nano Banana's image.
- − The facial structure is very 'modern model' aesthetic, which slightly clashes with the 'battle-worn' description.
Verdict: Grok Imagine Image Pro is the winner for its superior prompt adherence, specifically incorporating the requested beads in the hair and adding appropriate thematic text to the armor. While Nano Banana captures a more convincing 'aged' warrior look, Grok Imagine Image Pro provides a more detailed and accurate interpretation of the specific textures and lighting requested.
Bald man challenge
Image Editing: “Give the person a full, thick head of natural hair with realistic texture, density, and a natural hairline. Preserve facial features and lighting.”
AI Judge Analysis
Nano Banana
- + Natural hair texture and realistic messy volume.
- + Excellent preservation of the facial features and original lighting.
- + Seamless integration of the hair with the forehead and background.
- − The glasses frame on the left side (viewer's left) becomes slightly distorted/blended into the hair.
Grok Imagine Image Pro
- + Very clean and realistic hairline.
- + Maintains the structural integrity of the glasses better than Nano Banana.
- + Perfectly preserves the original background and lighting.
- − The hairstyle is slightly more 'styled' and less 'thick/full' than requested, especially compared to Nano Banana's result.
Verdict: Both models performed exceptionally well, maintaining the original identity, lighting, and background of the source image. Nano Banana provided a thicker, more natural-looking head of hair that feels more organic to the character's rugged aesthetic, though it slightly muddied the glasses frame; Grok Imagine Image Pro provided a cleaner, more groomed look with a very precise hairline that perfectly preserved all accessories.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Nano Banana
- + Perfectly captures the requested isometric 3D perspective.
- + Clean and bold text rendering that matches the hierarchy requested.
- + High-quality 3D diorama aesthetic with consistent lighting and materials.
- − The flag icon is placed to the left of the text rather than at the top-center as requested.
Grok Imagine Image Pro
- + Excellent material textures, especially on the salmon and wooden base.
- + Highly detailed food models with realistic PBR-like sheen.
- + Accurate placement of the text and flag according to the layout prompt.
- − The perspective is more of a standard high-angle shot rather than a true 45° isometric view.
- − The white text on a light blue background is less legible than Nano Banana's text.
Verdict: Nano Banana followed the geometric requirements of the prompt better, delivering a true 45° isometric diorama that looks like a high-end 3D render. Grok Imagine Image Pro produced more realistic textures and food details, but failed the geometric 'isometric' requirement of the perspective. Nano Banana's bold black text is also much more readable against the solid blue background.
Over-the-top cartoon caricature
Editing: “Create a caricature of me and my job. Make it exaggerated and humorous, incorporating my profession as a tv show anchor and my love for dogs and hockey.”
AI Judge Analysis
Nano Banana
- + Excellent preservation of the subject's facial features and eye shape.
- + Sophisticated caricature style with clever details like the 'Breaking: Dog Scores Goal' text.
- + Clean, high-quality illustration style with consistent lighting and shadows.
- − The addition of a hyper-muscular arm feels a bit random and unrelated to the specific prompt instructions.
Grok Imagine Image Pro
- + Very humorous and literal interpretation of all prompt elements (hockey cup, stick, multiple dogs).
- + Strong, readable text elements that add to the 'TV anchor' theme.
- + Exaggerated facial expressions that fit the caricature brief well.
- − The subject's facial likeness to the source image is significantly less accurate than Nano Banana's.
- − Some visual clutter makes the composition feel a bit crowded.
Verdict: Nano Banana maintains an impressive likeness to the original subject while translating her into a stylized caricature. Grok Imagine Image Pro goes further with the 'humorous' and 'exaggerated' aspects of the prompt by including more props and dogs, but the facial likeness is lost in the process. Nano Banana is the preferred output for its superior balance of artistic transformation and source preservation.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Nano Banana
- + Excellent adherence to the count and species of animals requested.
- + Superior rendering of eye expressions and soft fur textures.
- + Dreamy, magical lighting with subtle 'god rays' that enhance the wholesome vibe.
- − The composition is a bit crowded with the animals very close together in the center.
- − The butterfly on the puppy's ear looks somewhat pasted on.
Grok Imagine Image Pro
- + Better sense of action and 'tumbling' as requested in the prompt.
- + Includes more varied floral elements and a wider landscape view.
- + Stronger 'god rays' and morning atmosphere.
- − Failed the animal count by including two kittens instead of one.
- − The puppy's anatomy looks slightly awkward in its leaping pose.
- − Some butterflies appear to have disjointed wing structures.
Verdict: Nano Banana followed the prompt instructions more accurately by providing the specific set of four animals requested, whereas Grok Imagine Image Pro added an extra kitten. While Grok Imagine Image Pro captured the 'tumbling' and 'chasing' action more dynamically, Nano Banana's superior character rendering and adherence to the prompt make it the better overall image.
Studio Ghibli Anime Style
Editing: “Transform this photo into a Studio Ghibli–inspired illustration. Use soft pastel colors, hand-painted textures, gentle lighting, dreamy backgrounds, and a warm, nostalgic mood”
AI Judge Analysis
Nano Banana
- + Excellent source preservation of the original subjects
- − Failed the primary instruction to transform the image into an illustration
- − Retains a purely photographic look with minimal filters
- − Lacks the requested pastel colors and hand-painted textures
Grok Imagine Image Pro
- + Perfectly captures the Studio Ghibli illustration style
- + Excellent use of hand-painted watercolor textures and soft pastel colors
- + Maintains the composition and character expressions of the source image while stylizing them
- − Minor loss of detail in some background areas
Verdict: Nano Banana failed to perform the requested edit, providing what looks like the original photo with very minor color adjustments. Grok Imagine Image Pro successfully transformed the meme into a beautiful Studio Ghibli-style watercolor illustration while perfectly maintaining the recognizable layout and character roles.
Golden Hour Stroll
Image Editing: “Add dynamic motion to this photo: make hair blow in the wind, add leaves flying, energetic and lively feel.”
AI Judge Analysis
Nano Banana
- + Successfully added wind movement to the hair.
- + Provided high-quality, autumnal leaves with varying detail and motion blur.
- + Preserved the woman's face and the dog's appearance perfectly.
- − One leaf is awkwardly layered or merged into the leash.
Grok Imagine Image Pro
- + Added a significant amount of leaves to create a more energetic feeling.
- + Achieved a natural wind-blown effect on the hair.
- + Maintained excellent source preservation of the subject and background.
- − Some leaves exhibit minor artifacts or appear as flat yellow shapes without detail.
- − The density of the leaves feels a bit artificial compared to the surrounding greenery.
Verdict: Both models followed the instructions very well, effectively adding wind-blown hair and flying leaves while preserving the original person and dog. Nano Banana produced more realistic-looking leaves with better textures, whereas Grok Imagine Image Pro provided a higher volume of leaves that arguably fits the 'energetic and lively' prompt better, though some of its leaves look like simple clip-art shapes.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Nano Banana
- + Excellent adherence to the 'warm brown and cream' color scheme.
- + Superior typography with graceful curves and correct accent usage.
- + The cloche and banner design feels more cohesive and 'vintage' in style.
- − Small artifacts in the steam lines where they meet the cloche dome.
Grok Imagine Image Pro
- + Clean, high-contrast vector lines.
- + Very legible 'EST. 1720' text.
- + Effective subtle texture on the background.
- − The cloche is grey, which violates the requested 'warm brown and cream' palette.
- − The typography for 'CAFFÈ' has a slightly awkward, blocky spacing.
- − The 'banner' for the date is just a solid block of color rather than a traditional banner style.
Verdict: Nano Banana is the clear winner as it perfectly captures the requested 'warm brown' color palette and vintage aesthetic, whereas Grok Imagine Image Pro introduced grey tones that were not requested. Nano Banana's typography is more elegant and its representation of the 'cloche dome with steam' and 'banner' feels much more like a professional restaurant logo.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
Nano Banana
- + Clean, professional vector aesthetic with consistent line weights.
- + Follows the color palette closely with a deep navy background.
- + Very readable text and clear iconography for the steps.
- − The iconography for Descent and Lunar Orbit is identical.
- − The composition has large empty spaces in the upper quadrant.
Grok Imagine Image Pro
- + Accurately renders different icons for every single requested step.
- + Excellent character headshots for the crew with correct names.
- + Better vertical composition that leads the eye through the timeline.
- − Small details like the flag on the landing module look a bit messy.
- − The background is a plain gray rather than the navy that leads the requested palette.
Verdict: Grok Imagine Image Pro followed the prompt instructions more accurately by providing unique icons for every step, including a distinct lunar module for the descent and landing phases, whereas Nano Banana repeated the same icon for stages 4 and 5. Grok Imagine Image Pro also exhibited better text rendering for the crew names, though Nano Banana had a sleeker, more professional graphic design feel.
Nano Banana
Gemini 2.5 Flash Image is optimized for image understanding and generation, offering a balance of price and performance with fast and efficient image generation and editing capabilities.
Grok Imagine Image Pro
xAI's premium image generation model, offering higher-fidelity output and stronger performance on single-image editing benchmarks than the standard Grok Imagine model.