Head to head

Grok Imagine Image Pro (xAI) vs. Qwen Image 2512 (Alibaba)

Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.

Grok Imagine Image Pro

24.8 arena score

#14 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Qwen Image 2512

22.4 arena score

#26 of 44 in Text-to-Image

Vote tally

Where the votes landed

Grok Imagine Image Pro: 33.3% win rate
Ties: 16.7%
Qwen Image 2512: 50.0% win rate

Shared challenges: 8
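
The page does not publish raw vote counts, so the following is only a sketch of how shares like these could be derived. The function name `vote_shares` and the 2/1/3 split are hypothetical; that split is simply one set of counts consistent with the reported 33.3% / 16.7% / 50.0%.

```python
def vote_shares(wins_a: int, ties: int, wins_b: int) -> dict[str, float]:
    """Convert raw vote counts into percentage shares rounded to one decimal."""
    total = wins_a + ties + wins_b
    if total == 0:
        # A challenge with no votes has no meaningful share.
        raise ValueError("no votes recorded")
    return {
        "a_win_pct": round(100 * wins_a / total, 1),
        "tie_pct": round(100 * ties / total, 1),
        "b_win_pct": round(100 * wins_b / total, 1),
    }


# Hypothetical counts: 2 votes for model A, 1 tie, 3 votes for model B.
shares = vote_shares(2, 1, 3)
print(shares)
```

Because each share is rounded independently, the three values are not guaranteed to sum to exactly 100 for every possible split.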

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

Grok Imagine Image Pro
Qwen Image 2512
Votes: Grok Imagine Image Pro 100% wins, 0% ties, Qwen Image 2512 0% wins

AI Judge Analysis

Grok Imagine Image Pro

  • + Excellent depiction of thick glass with realistic optical distortion.
  • + Superior lighting consistency with clear soft window light from the left.
  • + Highly detailed textures on the wooden table and book cover.
  • The glass object is technically more of a hollow rectangular container than a perfect cube.

Qwen Image 2512

  • + Follows the 'cube' geometry more strictly than the other model.
  • + Correctly places the plant behind the object and visible through the glass.
  • + Accurate rendering of the red book's texture and placement.
  • The internal reflections of the blue sphere are spatially confusing and physically incorrect.
  • The glass appears more like thin panels or a semi-reflective mirror on the bottom rather than solid glass.

Verdict: Both models followed the prompt instructions very well, correctly placing all requested elements. Grok Imagine Image Pro stands out for its superior realism, particularly in how it handles the optical properties of thick glass and the textures of the table, whereas Qwen Image 2512 has a slightly more artificial look with confusing internal reflections.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

Grok Imagine Image Pro
Qwen Image 2512
Votes: Grok Imagine Image Pro 0% wins, 0% ties, Qwen Image 2512 100% wins

AI Judge Analysis

Grok Imagine Image Pro

  • + Excellent adherence to the 'repairing' action, showing the subject using a wrench.
  • + Superior pavement reflections and wet weather atmosphere.
  • + High visual clarity and realistic skin texture on the hands and face.
  • The bike's frame and kickstand are slightly clipping into the ground.
  • The wrench being used is floating slightly away from the actual nut.

Qwen Image 2512

  • + Strong 'candid' feel with the subject looking at the camera.
  • + Effective use of shallow depth of field and motion blur in the background cars.
  • + Good portrayal of natural skin texture and age.
  • The subject is posing with the bike rather than actively 'repairing' it as requested.
  • The bicycle's structure has significant AI artifacts, including a nonsensical double-tube frame near the seat.
  • The light rain is barely visible compared to Grok Imagine Image Pro.

Verdict: Grok Imagine Image Pro is the clear winner as it accurately depicts the man actively repairing the bicycle with a tool, whereas Qwen Image 2512 shows him simply crouching behind it. Grok Imagine Image Pro also provides much better environmental details like the rain droplets and pavement puddles, while Qwen Image 2512 suffers from structural inconsistencies in the bike's frame.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

Grok Imagine Image Pro
Qwen Image 2512
Votes: Grok Imagine Image Pro 0% wins, 0% ties, Qwen Image 2512 100% wins

AI Judge Analysis

Grok Imagine Image Pro

  • + Perfect text rendering for the requested section headers.
  • + Extremely high-quality, professional food photography with consistent lighting.
  • + Superior layout coherence that perfectly follows the requested grid for appetizers, pizza, and mains.
  • No item descriptions or pricing, though it follows the 'minimalist' prompt strictly.
  • Very literal interpretation of a grid, resulting in a square aspect ratio rather than a typical menu page.

Qwen Image 2512

  • + Includes more typical menu elements like headers, descriptions, and prices.
  • + Good use of vibrant color accents in the UI elements.
  • + Composition feels more like a physical paper menu.
  • Significant text hallucination and garbled characters across most of the text.
  • Failed to follow the specific categories (Appetizers/Pizza/Mains) accurately, using labels like 'Appetiizizers' and 'Means'.
  • The food photos in the grid are cluttered and lack the professional clarity of Grok Imagine Image Pro.

Verdict: Grok Imagine Image Pro produced a stunningly clean and professional layout with high-quality food photography and perfect text for the headers. While Qwen Image 2512 attempted a more comprehensive menu design with prices and descriptions, it suffered from severe text corruption and did not organize the categories according to the prompt instructions.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

Grok Imagine Image Pro
Qwen Image 2512

AI Judge Analysis

Grok Imagine Image Pro

  • + Perfectly rendered text for every requested menu item.
  • + Exceptional chalk texture and realistic handwriting variation.
  • + Excellent adherence to the specific formatting and year.
  • The bottom footer text looks slightly more digital than the main body text.

Qwen Image 2512

  • + Very consistent and elegant cursive handwriting style.
  • + Realistic background café atmosphere with good depth of field.
  • + Great chalk smudge details on the board texture.
  • Misspells 'Risotto' as 'Risitto'.
  • The title is split into two lines instead of a single line at the top.

Verdict: Grok Imagine Image Pro stands out for its perfect text accuracy and realistic chalk texture, even correctly completing the 'Brown Butter' item, which was cut off in the prompt. While Qwen Image 2512 produces a very beautiful visual and high-quality cursive, it fails on the spelling of 'Risotto' and takes more liberties with the layout.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

Grok Imagine Image Pro
Qwen Image 2512

AI Judge Analysis

Grok Imagine Image Pro

  • + Excellent photographic quality with realistic depth of field and lighting.
  • + The woman's expression and interaction with the phone are very natural.
  • + Accurate and detailed taxi interior, including a functional-looking taxi meter.
  • The capybara's front paws are slightly distorted and look more like human-monkey hybrids.

Qwen Image 2512

  • + Strong prompt adherence regarding the capybara's expression and position of the front paws.
  • + The capybara's hat is more reminiscent of a traditional chauffeur/taxi driver cap.
  • The woman in the background has a distorted, grumpy expression rather than a bored/neutral one.
  • Visual artifacts present on the woman's hands and the phone.
  • The background lighting is less realistic and appears muddier than in Grok Imagine Image Pro's image.

Verdict: Grok Imagine Image Pro produces a much higher quality, photorealistic image with better lighting and a more convincing 'normal' expression for the passenger. While Qwen Image 2512 followed some specific capybara details well, it suffered from notable distortion on the human subject and lower overall image clarity.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

Grok Imagine Image Pro
Qwen Image 2512

AI Judge Analysis

Grok Imagine Image Pro

  • + Excellent text rendering with clean, professional-looking typography and drop shadows.
  • + Very clean, minimalist aesthetic that perfectly matches the 'solid light blue background' request.
  • + High-quality PBR material rendering on the wood and ceramic plate.
  • The 'raised diorama base' is interpreted as a simple wooden tray rather than a themed environment.
  • The rice texture looks slightly like plastic beads rather than organic sushi rice.

Qwen Image 2512

  • + Captures the 'isometric miniature diorama' feel perfectly with the square base and small landscaping details.
  • + The 3D cartoon style is very cohesive between the food and the typography.
  • + Good use of color and lighting to create a tactile, toy-like appearance.
  • The text 'JAPAN' has a slight artifact/overlap on the letter 'N'.
  • The background has a slight gradient shadow at the bottom, making it less of a 'solid' background than in Grok Imagine Image Pro's image.

Verdict: Grok Imagine Image Pro produces a cleaner, more professional graphic design suitable for a menu or advertisement, featuring superior text clarity and realistic textures. However, Qwen Image 2512 better understands the 'diorama' part of the prompt, creating a charming miniature world on a raised square base. Both models followed the complex text instructions perfectly, but Grok Imagine Image Pro is the winner due to higher overall image clarity and more refined PBR materials.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

Grok Imagine Image Pro
Qwen Image 2512
Votes: Grok Imagine Image Pro 0% wins, 100% ties, Qwen Image 2512 0% wins

AI Judge Analysis

Grok Imagine Image Pro

  • + Captures the active 'chasing' and 'tumbling' motion requested in the prompt.
  • + Successfully includes all requested animals clearly: golden retriever, tabby kitten(s), bunny, and fox.
  • + Excellent rendition of god rays and the golden hour atmosphere.
  • Includes two kittens instead of the implied single kitten, though this adds to the scene.
  • The fox's anatomy and pose are a bit awkward with the paws in the air.

Qwen Image 2512

  • + Beautiful character portraits with highly expressive eyes and soft fur texture.
  • + Very clean composition with no major anatomical errors.
  • + The golden sunrise light is vibrantly rendered across the animals' fur.
  • The animals are posing for a static portrait rather than 'chasing' or 'tumbling' as requested.
  • The butterflies are static and feel tacked on rather than part of an active chase.

Verdict: Grok Imagine Image Pro followed the prompt's action requirements much better, depicting a dynamic scene of animals playing and tumbling in a misty meadow. While Qwen Image 2512 produced a cleaner and more endearing 'family portrait' style image, it failed to capture the sense of movement and playfulness requested by 'chasing' and 'tumbling'.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

Grok Imagine Image Pro
Qwen Image 2512

AI Judge Analysis

Grok Imagine Image Pro

  • + Clean, minimalist vector emblem style that looks like a modern-vintage logo.
  • + Accurate typography for both 'Caffè Florian' and the foundation date.
  • + Good use of subtle paper texture on the background.
  • The silver/grey cloche clashes with the requested 'warm brown and cream' color palette.
  • The 'steam' is overly simplified and looks like a single generic swirl.
  • The 'banner' for the date is just a geometric block rather than a classic ribbon or scroll banner.

Qwen Image 2512

  • + Excellent adherence to the 'warm brown and cream' color palette.
  • + Beautiful vintage illustration style with cross-hatching and a classic banner scroll.
  • + Dynamic and detailed steam effects that integrate well with the cloche.
  • Slightly less 'minimalist' than Grok Imagine Image Pro, leaning more towards a detailed illustration.
  • The text 'Caffè Florian' has a slightly inconsistent baseline.

Verdict: Qwen Image 2512 followed the color palette and stylistic prompts much better than Grok Imagine Image Pro, producing a sophisticated vintage logo with a classic banner and warm tones. While Grok Imagine Image Pro was more 'minimalist', it failed to use the requested brown/cream tones for the central icon and used a very basic shape instead of a banner.
