Grok Imagine Image Pro: xAI's premium image generation model, offering higher-fidelity output and stronger performance on single-image editing benchmarks than the standard Grok Imagine model
Settled by community votes across 10 shared challenges, with an AI judge weighing in on all but one.
Grok Imagine Image Pro
#14 of 44 in Text-to-Image
Wan 2.6
#23 of 44 in Text-to-Image
Where the votes landed
Grok Imagine Image Pro
81.8%
win rate
Ties
0.0%
Wan 2.6
18.2%
win rate
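As a rough illustration of how percentages like these can be derived from raw vote tallies, here is a minimal Python sketch. The function name `win_rates` and the tallies (9 votes to 2, no ties) are hypothetical: the page does not publish vote counts, and these numbers were chosen only because they reproduce the rounded figures shown above.

```python
def win_rates(wins_a: int, wins_b: int, ties: int = 0) -> dict[str, float]:
    """Return each outcome's share of all votes, as a percentage rounded to one decimal."""
    total = wins_a + wins_b + ties
    if total == 0:
        raise ValueError("no votes recorded")
    return {
        "a": round(100 * wins_a / total, 1),
        "ties": round(100 * ties / total, 1),
        "b": round(100 * wins_b / total, 1),
    }


# Hypothetical tallies (not from the page): 9 votes for model A, 2 for model B, no ties.
rates = win_rates(wins_a=9, wins_b=2)
print(rates)
```

Other tallies with the same 9:2 ratio would give the same percentages, so the displayed win rates alone do not reveal how many votes were cast.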
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Grok Imagine Image Pro
- + Perfect text rendering for all section headers.
- + Exceptionally clean and consistent food photography with high resolution.
- + Logical categorization where the photos match the headers (e.g., pizzas under PIZZA).
- − Lacks actual menu item text/descriptions and prices.
- − Very literal interpretation of a 'grid' creates a more modular look than a full menu sheet.
Wan 2.6
- + Includes functional menu elements like prices and item descriptions.
- + Good use of vibrant accent colors as requested in the prompt.
- + Balanced layout that looks like a printed flyer or menu sheet.
- − Text rendering is poor with many gibberish words and spelling errors.
- − Categorization is flawed; the 'Appetizers' section contains multiple pizzas.
- − Inconsistent image styles and messy borders around some photos.
Verdict: Grok Imagine Image Pro produced a much cleaner and more professional-looking design with perfect text rendering for headers and high-quality food photography. While Wan 2.6 attempted a more complex layout with prices and descriptions, the text is largely illegible and the food items are incorrectly categorized, making Grok Imagine Image Pro the superior choice for a design template.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent text legibility and spelling accuracy for all menu items.
- + Consistent chalk texture across the entire layout.
- + Clean composition that follows the layout instructions precisely.
- − The 'cursive' request for the title was interpreted as a printed serif/script hybrid rather than true flowing cursive.
- − The bottom text line looks slightly more digital/artificial compared to the top sections.
Wan 2.6
- + Features a very realistic, messy chalkboard aesthetic with smudges and chalk dust.
- + Good handwriting style that feels authentic to a café setting.
- + Stronger sense of depth and environmental lighting.
- − Repetitive text error where price tags are printed twice for the first two items.
- − Minor spelling and punctuation issues in the third item and title.
- − The layout is a bit more cluttered compared to the clean layout of the prompt.
Verdict: Grok Imagine Image Pro was more successful because it followed the text and layout instructions with near-perfect accuracy, whereas Wan 2.6 printed the prices twice for the first two items and introduced typos. While Wan 2.6 had a more authentic 'messy' chalkboard texture, Grok Imagine Image Pro's clarity and adherence to the specific menu items make it the superior choice.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent photorealism in lighting and textures
- + Perfectly captures the 'bored' expression of the passenger
- + Interior taxi details like the meter and dashboard are very realistic
- − The capybara's paws look slightly more like human hands/claws than natural capybara anatomy
Wan 2.6
- + Atmospheric rain effect on the glass adds to the New York night aesthetic
- + Good cinematic composition from a slightly exterior angle
- − The passenger's expression looks concerned or confused rather than bored
- − The capybara's hat looks more like a police/security cap than a taxi driver cap
- − The interior perspective is slightly warped, with the passenger appearing to sit too close to the front
Verdict: Grok Imagine Image Pro followed the prompt much more accurately, specifically capturing the 'bored' expression of the passenger which is key to the surrealism of the scene. Wan 2.6 provided a nice rainy atmosphere, but the composition and expressions were less convincing than the polished, highly detailed interior of Grok.
Bald man challenge
Image Editing: “Give the person a full, thick head of natural hair with realistic texture, density, and a natural hairline. Preserve facial features and lighting.”
AI Judge Analysis
Grok Imagine Image Pro
- + Perfect preservation of original facial features and identity
- + Realistic hairline and hair texture
- + Impeccable source preservation of the background and clothing
Wan 2.6
- + Adds a very 'full, thick' head of hair as requested
- + Natural hair texture and volume
- − Significantly alters facial features, especially the eyes and nose
- − Changed the frame and bridge of the glasses
- − Added hair shadows that muddy the existing facial details
Verdict: Grok Imagine Image Pro successfully added natural-looking hair while maintaining 100% of the original subject's facial identity and the background. Wan 2.6 provided more volume but completely altered the man's face and glasses, failing the preservation aspect of the editing task.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent 3D modeling with soft, refined plastic-like textures.
- + Perfect text rendering with professional layout and drop shadows.
- + Detailed variety of sushi including nigiri and rolls.
- − Perspective is slightly flatter than the requested 45-degree isometric angle.
- − The base is a simple round wooden tray rather than a geometric diorama block.
Wan 2.6
- + Strong isometric composition with a clear diorama-style raised base.
- + Clean, bold text that adheres well to the layout request.
- + High-clarity lighting and shadows that enhance the 3D effect.
- − The sushi models are slightly simpler and less detailed than Grok Imagine Image Pro's.
- − Large flag icon is slightly out of scale compared to the text width.
Verdict: Both models followed the prompt exceptionally well. Grok Imagine Image Pro produces more visually sophisticated sushi models with better 'PBR' material feel, while Wan 2.6 better captures the 'isometric diorama' aspect of the prompt with its square raised base and sharper perspective.
Over-the-top cartoon caricature
Editing: “Create a caricature of me and my job. Make it exaggerated and humorous, incorporating my profession as a tv show anchor and my love for dogs and hockey.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent world-building with news tickers and dog-related graphics
- + Highly exaggerated caricature style that captures the subject's features
- + Clear inclusion of all requested elements: hockey, dogs, and TV anchor role
- − The facial exaggeration is slightly unsettling compared to the source
Wan 2.6
- + Clean, appealing cartoon style
- + Good preservation of the subject's hair and eye color
- + Well-composed studio environment
- − The caricature is less exaggerated and borders more on a standard avatar style
- − Less humor and visual interest compared to the 'Pups & Pucks' theme in Grok Imagine Image Pro's output
Verdict: Grok Imagine Image Pro interpreted the prompt much more creatively, creating a 'Breaking News' scenario titled 'Pups & Pucks' that perfectly integrates the humor and specific hobbies requested. While Wan 2.6 produced a high-quality illustration, it felt more like a generic avatar standing next to a dog, whereas Grok felt like a cohesive, humorous caricature.
Studio Ghibli Anime Style
Editing: “Transform this photo into a Studio Ghibli–inspired illustration. Use soft pastel colors, hand-painted textures, gentle lighting, dreamy backgrounds, and a warm, nostalgic mood”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent structural preservation of the original meme composition.
- + Captures the Ghibli facial features accurately, especially the wide-eyed expressions.
- + Strong watercolor/hand-painted texture consistent across the image.
- − The color palette is a bit flat compared to the requested 'dreamy' lighting.
Wan 2.6
- + Beautiful use of warm, nostalgic lighting and soft pastel tones.
- + High-quality artistic textures that feel more like a finished illustration.
- + Maintains the core concept and character likeness well.
- − The added light sparkles/flecks are a bit distracting and weren't specifically requested.
- − Slightly less accurate facial expression for the 'distracted' man compared to Grok Imagine Image Pro.
Verdict: Both models did an exceptional job of preserving the source image's layout while applying the Ghibli style. Grok Imagine Image Pro excels at capturing the specific character line art and expressions associated with Studio Ghibli, whereas Wan 2.6 provides a superior atmospheric quality with better lighting and color grading that truly feels 'nostalgic' and 'dreamy.'
Golden Hour Stroll
Image Editing: “Add dynamic motion to this photo: make hair blow in the wind, add leaves flying, energetic and lively feel.”
AI Judge Analysis
Grok Imagine Image Pro
- + Excellent adherence to the 'hair blowing' instruction
- + Large quantity of leaves adds significant energy and dynamic feel
- + Perfectly preserves the woman's face and the dog from the source image
- − The leaf style is a bit uniform and stylized compared to the background
Wan 2.6
- + Natural, flowing hair edit that feels very wind-blown
- + Subtle and photorealistic leaf integration
- + High fidelity preservation of the original subject
- − Fewer leaves results in a less 'energetic' feel compared to the request
- − The leaf color (bright green) clashes slightly with the darker background foliage
Verdict: Both models did an exceptional job of preserving the source image while applying the requested edits. Grok Imagine Image Pro interpreted the 'energetic and lively' instruction more effectively by adding a large volume of falling leaves, creating a clear sense of motion. Wan 2.6 provided a more subtle and natural hair flow, but the sparse leaves make the scene feel less dynamic overall.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI judge analysis unavailable for this challenge.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
Grok Imagine Image Pro
- + Perfectly follows the 6-step infographic structure with distinct icons for each stage.
- + Accurate and legible text rendering for all mission labels and crew names.
- + Strictly adheres to the requested NASA-inspired color palette and flat-vector style.
- − No real errors found: the 'Tranquility' label was initially flagged as a possible typo, but the one-L spelling is correct in American English.
Wan 2.6
- + Clean aesthetic with a strong navy-dominant palette.
- + Creative use of silhouettes for the crew members.
- − Completely failed to include the 6-step infographic timeline requested in the prompt.
- − Missing all requested icons (Saturn V, orbit rings, trajectory arc, lunar module).
- − Composition is mostly empty space and does not function as a poster of the mission's technical steps.
Verdict: Grok Imagine Image Pro followed the prompt instructions perfectly, creating a functional and aesthetically pleasing 6-step infographic with crisp icons and accurate text. In contrast, Wan 2.6 failed to generate any of the specific mission steps or technical icons, producing a simple title card instead of the requested infographic.
Explore each model
Wan 2.6: Alibaba's multimodal generation model from the Wan AI suite, supporting text-to-video, image-to-video, reference-to-video with audio, and text-to-image, in both Chinese and English