GPT Image 2 vs Wan 2.7

Head-to-head across 5 challenges

GPT Image 2: 50.0% win rate

Ties: 0.0%

Wan 2.7: 50.0% win rate

Challenge Results

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

GPT Image 2

Wan 2.7

AI Judge Analysis

GPT Image 2

+ Features highly realistic chalk texture with dusty, varied strokes
+ Successfully renders all text accurately as requested
+ Achieves an authentic handwritten cursive style for the title

− The lighting in the background is slightly dark compared to the subject

Wan 2.7

+ Perfect text accuracy and alignment
+ Clean, high-contrast composition

− Text appears as a clean digital font rather than natural chalk handwriting
− Fails to provide the 'elegant cursive' requested for the title
− Lacks the gritty, dusty realism of actual chalk on a board

Verdict: GPT Image 2 captured the handwritten chalk aesthetic perfectly, showing realistic variations in pressure and texture that look authentic to a cafe chalkboard. Wan 2.7 produced very legible text but failed the prompt's requirement for a realistic handwriting style, resulting in a 'digital font' look that lacks character.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

GPT Image 2

Wan 2.7

AI Judge Analysis

GPT Image 2

+ Perfectly follows the specific role-reversal instruction of the horse on top of the astronaut.
+ High level of texture detail in the astronaut suit and horse fur.
+ Strong surrealist composition that aligns with the prompt's tone.

Wan 2.7

+ Excellent cinematic lighting and galactic background details.
+ High clarity and clean rendering of the horse and astronaut.

− Failed the core prompt instruction to have the horse on top of the astronaut (role reversal).
− The horse and astronaut are not wearing any breathing apparatus despite being in space.

Verdict: GPT Image 2 successfully captured the difficult role-reversal request, showing a horse literally riding/sitting on an astronaut, whereas Wan 2.7 followed a generic 'astronaut on a horse' trope. While Wan 2.7 has a more vibrant background, GPT Image 2 is the superior response for its strict adherence to the surreal instruction 'horse on top, not vice versa'.

Outfit Transfer Challenge

Editing

Edit instruction

“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”

Source

GPT Image 2

Wan 2.7

AI Judge Analysis

GPT Image 2

+ Perfectly replicates the coat, scarf, jeans, and accessories from Image 2.
+ Maintains the subject's face, hair, and vitiligo patterns with high accuracy.
+ Provides high-quality realistic fabric texture and lighting integration.

− The scarf's positioning is slightly stiff and doesn't fully account for the lean against the post.

Wan 2.7

+ Maintains the background and the subject's facial features and skin patterns well.

− Completely failed to use the outfit from Image 2, substituting it with a generic fantasy costume.
− The feet and shoes are poorly rendered and appear distorted into the sand.
− The proportions of the body feel elongated and unnatural compared to the source.

Verdict: GPT Image 2 followed the instructions perfectly, accurately transferring the specific clothing, scarf, and watch from Image 2 onto the subject while preserving his identity. Wan 2.7 ignored the second source image entirely, generating a completely unrelated outfit and introducing anatomical distortions in the legs and feet.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

GPT Image 2

Wan 2.7

GPT Image 2: 50% wins · Ties: 0% · Wan 2.7: 50% wins

AI Judge Analysis

GPT Image 2

+ Excellent photorealistic texture on the capybara's fur
+ Perfectly captures the 'bored' expression of the passenger in the back seat
+ Cinematic lighting and authentic NYC night atmosphere

− The capybara's right paw is rendered somewhat ambiguously on the wheel

Wan 2.7

+ Strong composition showing both the taxi roof sign and interior
+ Good details on the capybara's paws holding the wheel

− The human passenger is sitting in the front passenger seat instead of the 'back seat' as requested
− The lighting on the capybara feels a bit flat compared to the background

Verdict: GPT Image 2 is the superior choice because it correctly places the passenger in the back seat and captures more realistic, cinematic lighting. Wan 2.7 fails on the spatial layout of the prompt by putting the passenger in the front seat, though it does a decent job with the capybara's paws.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

GPT Image 2

Wan 2.7

AI Judge Analysis

GPT Image 2

+ Excellent atmospheric lighting and cinematic mood.
+ Flawless rendering of all requested text.
+ Highly intricate gothic border with visible thorns and skulls.

− The parchment texture is slightly less literal than Wan 2.7's.

Wan 2.7

+ Clean, readable layout with a clear parchment aesthetic.
+ Accurate rendering of the banner and all requested text.
+ Follows the framing of the prompt well with twisted trees and bats.

− The art style is more like a modern illustration than a 'vintage gothic' poster.
− Lighting feels flat compared to the requested 'cinematic' style.

Verdict: GPT Image 2 perfectly captures the 'vintage gothic' atmosphere with moody cinematic lighting and high-quality details that make it feel like a professional poster. While Wan 2.7 rendered all requested elements and text accurately, its illustration style is too clean and cartoonish for the requested spooky, vintage gothic theme. GPT Image 2's sophisticated textures and depth make it the clear winner.

GPT Image 2

OpenAI's state-of-the-art image generation model with arbitrary resolution up to 4K and strong instruction following

Wan 2.7

Alibaba's Wan 2.7 image generation and editing model for text-to-image, reference-guided generation, and instruction-based image edits
