Head to head

GPT Image 2 (OpenAI) vs. Wan 2.6 (Alibaba)

Settled by community votes across 14 shared challenges, with an AI judge weighing in on each.

GPT Image 2

28.4 arena score

#2 of 48 in Text-to-Image

Top 2 in Text-to-Image · Top 3 in Image Editing

Wan 2.6

23.1 arena score

#24 of 48 in Text-to-Image

Top 2 in Image-to-Video
Vote tally

Where the votes landed

GPT Image 2

0%

win rate

Ties

0%

Wan 2.6

0%

win rate

Shared challenges 14

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent photographic quality with very clean glass reflections
  • + High-detail texture on the red book cover
  • + Strong adherence to the 'soft window light' requirement
  • The plant appears more next to/behind the cube than partially visible 'through' it as requested

Wan 2.6

  • + Successfully shows the plant distorted and visible through the glass cube
  • + Realistic weathered texture on the red book
  • + Strong lighting interaction with reflections and shadows on the table
  • The glass cube has some structural inconsistencies in its thickness and geometry
  • The blue sphere has a slightly unnatural glow/internal reflection

Verdict: Both models followed the prompt closely, including the spatial arrangement of the cube, sphere, and book. GPT Image 2 opted for a cleaner, modern aesthetic with sharp focus and consistent geometry, while Wan 2.6 provided a more naturalistic, 'lived-in' look with better refraction of the plant through the glass. GPT Image 2 is the likely winner for its superior technical clarity and cleaner execution of the glass material.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent photographic realism with very natural skin textures
  • + Perfectly captures the 'imperfect framing' and 'candid' street photography style requested
  • + Very high technical clarity on foreground objects like the toolbox and bike components
  • The rain is barely visible, appearing more like a damp day than light rain
  • Motion blur on the passing vehicle is present but feels a bit static compared to Wan 2.6

Wan 2.6

  • + Atmospheric depiction of rain with visible droplets on clothing and pavement
  • + Stronger cinematic feel with pronounced bokeh and neon reflections
  • + Better representation of motion blur from the passing car
  • Significant anatomy issues with multiple instances of extra fingers and mangled hand structure
  • The rain droplets on the jacket look like static glass beads rather than water
  • Slightly more stylized and lacks the raw 'candid' feel of GPT Image 2

Verdict: GPT Image 2 is the superior choice because it achieves a high level of realism and captures the specific 'candid' and 'imperfect framing' photography style requested without the severe anatomical errors found in Wan 2.6. While Wan 2.6 succeeds in creating a more dramatic rainy atmosphere, the failure to render the man's hands correctly breaks the immersion of an otherwise beautiful image.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent high-frequency texture on the skin and etched metal surfaces
  • + Natural blending of the braided hair with the head and shoulders
  • + Subtle and realistic application of dirt and faint scars
  • Lighting is a bit flat compared to the requested warm torchlight effect
  • Beads in hair are very small and less distinct than requested

Wan 2.6

  • + Stronger adherence to the lighting prompt with warm highlights and bokeh sparks
  • + Excellent rendering of large, distinct beads and leather straps
  • + Highly weathered and battle-worn feel with visible mud and scars
  • Some messy artifacts where the braid connects to the armor
  • Lower facial realism compared to the skin texture in GPT Image 2

Verdict: GPT Image 2 provides a more grounded and anatomically convincing portrait with superior skin and metal textures. However, Wan 2.6 captures the requested atmosphere better, including the specific details of the beads, leather straps, and warm torchlight ambiance that were central to the prompt. While GPT Image 2 is aesthetically cleaner, Wan 2.6 is the more successful interpretation of the prompt's specific descriptive elements.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent text rendering with clear, legible, and relevant menu items.
  • + Perfect execution of the requested grid layout with distinct sections for appetizers, pizza, and mains.
  • + Professional and realistic food photography that matches the descriptions.
  • The layout is quite dense, which slightly pushes the limits of 'minimalist'.

Wan 2.6

  • + Good use of vibrant accent colors in the border design as requested.
  • + Clean white background consistency throughout the layout.
  • Text is largely unintelligible gibberish with severe artifacts.
  • The food photos are repetitive and do not match the section headers (e.g., pizzas under the 'Appetizers' heading).
  • Poor section organization and inconsistent alignment of text elements.

Verdict: GPT Image 2 is significantly superior as it produces a fully functional, professional-grade menu with readable text, logical categorization, and high-quality photography. Wan 2.6 fails on basic legibility and logical consistency, placing pizza images under the wrong headers and displaying garbled text.

Magic Burger Explosion: Fiery Photorealism Challenge

Text-to-Image

“Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent text rendering with impressive fiery glow effects.
  • + Superior photorealistic detail in food textures like the patty and sesame seeds.
  • + Perfect adherence to the starburst and layout requirements.
  • The composition is a bit crowded on the left side due to text size.

Wan 2.6

  • + Atmospheric use of smoke and embers in the background.
  • + Good sense of vertical motion and depth in the burger explosion.
  • Failed to render the price text within a fiery glowing effect, opting for a flat graphic look.
  • The 'LIMITED TIME ONLY' text is placed at the very bottom and lacks the specified fiery impact.
  • The sauce stream connecting the bun and burger looks slightly unnatural.

Verdict: GPT Image 2 followed the prompt precisely, delivering high-quality, glowing fiery text and a detailed, appetizing burger. While Wan 2.6 has a nice atmospheric background, it failed to apply the requested fiery styling to the secondary text and price starburst, making it look less like a cohesive professional advertisement.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent text legibility and accuracy
  • + Realistic chalk texture with fine details
  • + Natural, consistent handwriting style across all items
  • Composition is a bit static and centered
  • The chalk holder on the ledge looks slightly blended into the wood

Wan 2.6

  • + Dynamic composition with a more authentic cafe atmosphere
  • + Great lighting and smudge effects on the chalkboard
  • + Higher visual interest with the 'messy' chalk dust
  • The letter 'P' in 'Specials' is poorly formed
  • Some inconsistencies in font weight between the title and the items

Verdict: GPT Image 2 provides a much cleaner and more accurate execution of the text, following the complex menu instructions perfectly. While Wan 2.6 has a more artistic and moody aesthetic with realistic chalk smudges, its letter formation is slightly more erratic and less uniform in style than GPT Image 2.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent adherence to the logic-defying prompt request of a horse on top of an astronaut.
  • + Highly detailed texture on the spacesuit and lunar surface.
  • + Creative interpretation of the harness and saddle setup for the horse.
  • The astronaut's hands/gloves have an anatomically incorrect number of fingers (6 or more).
  • The horse's front legs/hooves are awkwardly integrated into the harness handles.

Wan 2.6

  • + Beautiful cinematic lighting and vibrant nebula colors.
  • + Solid composition with dynamic movement and high visual appeal.
  • + Technically well-executed rendered details on the horse's mane and suit.
  • Completely failed the negative constraint/specific instruction for the horse to be on top.
  • Clichéd interpretation of the prompt that ignores the primary surreal requirement.

Verdict: GPT Image 2 successfully followed the difficult and specific instruction to place the horse on top of the astronaut, creating a truly surreal image. Wan 2.6 ignored the core logical inversion requested and produced a standard, albeit beautiful, astronaut-riding-horse image. Despite an extra finger on the glove, GPT Image 2 is the clear winner for prompt adherence.

Outfit Transfer Challenge

Editing
Edit instruction

“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”

Source
GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent preservation of the subject's face, hair, and vitiligo patterns
  • + High fidelity to the specific outfit from Image 2, including the plaid scarf pattern and watch
  • + Successfully expanded the frame downwards to accommodate the full outfit and pose
  • The lighting on the coat is slightly flatter than the original outdoor scene

Wan 2.6

  • + Successfully applied sunglasses found in several reference outfits
  • + Integrates the coat with realistic lighting and shadows
  • Significant loss of detail in the subject's face and eyes
  • Changed the scarf color/pattern significantly from the reference image
  • Cropped the bottom of the image, losing the jeans and hand-in-pocket detail requested

Verdict: GPT Image 2 is the clear winner as it perfectly preserves the identity of the person from Image 1—including the specific vitiligo patterns—while accurately transferring the exact plaid scarf and navy coat from Image 2. Wan 2.6 fails to maintain the correct scarf pattern and obscures the person's face with poorly integrated sunglasses, missing the requirement to keep the person's face unchanged.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent photorealism in textures, especially the capybara's fur and the jacket fabric.
  • + Cinematic lighting and composition that feels like a high-end film still.
  • + The scale of the capybara relative to the steering wheel and seat is very convincing.
  • The passenger's face is slightly blurry and lacks fine detail.
  • The perspective makes the capybara appear extremely large, almost filling the front of the car.

Wan 2.6

  • + Provides a clear view of both the driver and the passenger as requested.
  • + Captures the 'bored' expression of the businesswoman very effectively.
  • + Includes realistic details like rain on the windshield and a taxi roof light.
  • The passenger is sitting in the front passenger seat instead of the back seat.
  • The capybara's hands/paws on the wheel look a bit distorted and unnatural.
  • The lighting is flatter and less cinematic compared to the other model.

Verdict: GPT Image 2 is the superior image due to its exceptional photorealistic textures and atmosphere, creating a more immersive sense of realism. While Wan 2.6 captures the expressions well, it fails the spatial prompt instruction by placing the passenger in the front seat and lacks the fine detail seen in GPT Image 2.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent typography with perfect spelling in all requested text fields
  • + Superior adherence to the vintage parchment aesthetic
  • + Intricate border detail including skulls, thorns, and webs that feel naturally integrated
  • The color palette is very monochromatic compared to Wan 2.6

Wan 2.6

  • + Strong cinematic lighting with vibrant orange glow from the jack-o-lantern
  • + Good use of depth and contrast with the blue night sky
  • + Clear representation of twisted trees mentioned in the prompt
  • The typography is less 'elegant gothic' and more standard digital font
  • The border feels less like a cohesive vintage invitation and more like an overlay
  • Slightly less 'polished' look in the text rendering compared to GPT Image 2

Verdict: GPT Image 2 is the superior choice for a professional invitation as it handled the complex text requirements perfectly and maintained a consistent vintage gothic aesthetic throughout. While Wan 2.6 featured more vibrant, cinematic lighting and a better depiction of the sky, its overall execution felt less like a cohesive printed poster and more like a modern digital illustration.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent 3D text rendering with shadows and outlines.
  • + High-detail models for the sushi, base, and decorative elements.
  • + Rich adherence to all prompt elements including the flag icon and complex lighting.
  • Includes more garnish and background elements than the requested 'minimal' amount.

Wan 2.6

  • + Captures the 'minimal' aesthetic requested in the prompt perfectly.
  • + Accurate isometric layout and solid blue background.
  • + Clean, soft textures that match the 'miniature 3D cartoon' style.
  • Text rendering is flat and lacks the high-clarity 3D finish seen in the other model.
  • The sushi models look slightly more generic and less like realistic PBR materials.

Verdict: GPT Image 2 is the superior choice for its professional 3D graphic design quality, featuring excellent text integration and intricate material details. While Wan 2.6 adhered better to the 'minimal' keyword, its flat text and simpler modeling make it look less refined than the highly polished diorama presented by GPT Image 2.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent character coherence where all animals feel like they occupy the same space
  • + The fox and kitten exhibit highly detailed fur textures and realistic anatomy
  • + Subtle but effective use of god rays and backlighting that creates a warm atmosphere
  • The bunny is partially hidden behind the grass, losing some of the 'playful' impact
  • Some butterflies in the background are low-resolution or generic

Wan 2.6

  • + Fantastic dynamic composition with all animals clearly visible and interacting
  • + Beautiful 'dew sparkles' effect through bokeh and light refraction in the foreground
  • + Stronger god rays that perfectly match the requested sunrise light
  • The kitten's facial structure and eyes look slightly unnatural/uncanny
  • Floating dandelion seeds or artifacts appear a bit cluttered in the upper portion

Verdict: Both models followed the prompt exceptionally well, capturing all four specific animals and the requested lighting. GPT Image 2 has slightly better anatomical realism for the kitten and puppy, while Wan 2.6 captures the 'playful tumbling' vibe and 'dew sparkles' much more effectively with a more dynamic layout. Wan 2.6 is the winner for its superior adherence to the magical, atmospheric elements of the prompt like the sparkles and rays.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent typography with clean, professionally rendered text.
  • + High-quality etching detail on the cloche and banner.
  • + Superb layout with a cohesive frame and balanced composition.
  • The detail level is very high, pushing slightly away from the 'minimalist' request.

Wan 2.6

  • + Successfully adopts a simpler, more minimalist vector style.
  • + Accurate adherence to the warm brown and cream color scheme.
  • The banner is very small and awkwardly placed coming out of the cloche.
  • The steam icon is overly simplified and lacks the requested vintage feel.
  • The background texture looks like digital grunge brushes rather than subtle paper texture.

Verdict: GPT Image 2 is the superior design, offering a professional-grade vintage emblem with perfect typography and sophisticated illustration details. While Wan 2.6 is more 'minimalist' in its shapes, the composition remains clunky, specifically regarding the tiny, poorly placed banner and less refined vector work.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

GPT Image 2
Wan 2.6

AI Judge Analysis

GPT Image 2

  • + Excellent adherence to the multi-step infographic structure and specific iconography requests.
  • + Highly accurate text rendering for the title, steps, crew names, and landing site.
  • + Strong visual composition with a professional, clean vector aesthetic that fits the NASA theme.
  • The lunar module in step 5 contains slightly dense detail for a 'flat vector' style.

Wan 2.6

  • + Successfully applied the requested navy, white, and red color palette.
  • + Includes the requested crew names in a legible font.
  • Completely failed to generate the six-step infographic workflow requested in the prompt.
  • The background has a textured, fabric-like appearance instead of a clean digital vector finish.
  • Lacks the specific icons (Saturn V, Earth, Moon, etc.) requested.

Verdict: GPT Image 2 is the clear winner as it followed every complex instruction in the prompt, including specific infographic steps, iconography, and names. In contrast, Wan 2.6 produced a very simple poster that ignored the core requirement of a 6-step infographic process.
