Alibaba's Wan 2.7 image generation and editing model for text-to-image, reference-guided generation, and instruction-based image edits
Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.
Wan 2.7
#34 of 44 in Text-to-Image
Wan 2.7 Pro
#29 of 44 in Text-to-Image
Where the votes landed
Wan 2.7: 0.0% win rate
Ties: 50.0%
Wan 2.7 Pro: 50.0% win rate
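As a rough check on how these shares could arise, here is a minimal sketch that assumes each of the 4 shared challenges counts as exactly one outcome. The page does not state how votes are aggregated, so the tally below is hypothetical, as is the helper name `outcome_shares`.

```python
from collections import Counter


def outcome_shares(outcomes):
    """Return each outcome label's share of all outcomes, in percent (one decimal)."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical tally (assumption): two ties and two wins for Wan 2.7 Pro.
# The page does not say which challenges ended which way.
outcomes = ["tie", "tie", "Wan 2.7 Pro", "Wan 2.7 Pro"]

shares = outcome_shares(outcomes)
# A model with no wins never appears in the tally, so add it explicitly.
shares.setdefault("Wan 2.7", 0.0)

print(shares)
```

Under that assumption, two ties and two Wan 2.7 Pro wins out of four outcomes reproduce the 0.0% / 50.0% / 50.0% split shown above.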
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Chalkboard Menu
Text-to-Image
“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Wan 2.7
- + Excellent spelling accuracy for all menu items and dates.
- + Text has a slight volumetric shadow that makes it pop from the board.
- + Good chalkboard texture with realistic smudges.
- − The font style feels somewhat digital and uniform despite the prompt's request for natural variations.
- − Underline decorative elements are a bit messy and inconsistent.
Wan 2.7 Pro
- + Perfect spelling and layout of the requested text.
- + Handwriting style looks slightly more organic with varied character heights.
- + Clean composition with well-spaced items.
- − The 'chalk' lacks internal texture, looking more like a solid white digital brush.
- − Title cursive is very basic and lacks the 'elegant' flair requested.
Verdict: Both models performed exceptionally well on the difficult task of rendering specific, long-form text with perfect spelling. Wan 2.7 is preferred because the text has a more realistic 'chalk on board' depth and better integration with the background smudges, whereas Wan 2.7 Pro's text looks more like a flat digital overlay.
Outfit Transfer Challenge
Editing
“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”
AI Judge Analysis
Wan 2.7
- + Excellent preservation of the person's face, hair, and specific skin markings (vitiligo).
- + Successfully added complex layers including jewelry, a waistcoat, and a full-length coat.
- − Completely failed to use the specific outfit from Image 2 (peacoat, plaid scarf, jeans).
- − Generated a high-fantasy/baroque outfit instead of the requested street style.
Wan 2.7 Pro
- + Preserved the identity and background of Image 1 very effectively.
- + Includes realistic shadow and lighting integration for the new clothing.
- − Failed to dress the person in the exact outfit from Image 2, providing a gold patterned blazer instead.
- − Ignored key accessories from the source like the plaid scarf and sunglasses.
Verdict: Both models failed the core instruction of dressing the subject in the *exact* outfit from Image 2. While both Wan 2.7 and Wan 2.7 Pro successfully preserved the identity of the person in Image 1, they both hallucinated new elaborate outfits (fantasy-style for Wan 2.7 and a patterned blazer for Wan 2.7 Pro) instead of using the peacoat and scarf provided in the source. Wan 2.7 is slightly better in terms of visual richness and layering, but both are unsuccessful in prompt adherence regarding the specific clothing transferred.
The Capybara Taxi Driver
Text-to-Image
“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
Wan 2.7
- + Excellent fur texture on the capybara.
- + Captures the bored, oblivious expression of the passenger perfectly.
- + High level of photographic realism in the lighting and reflections.
- − The passenger is sitting in the front passenger seat, failing the 'back seat' part of the prompt.
- − The paw placement on the steering wheel looks slightly detached/less integrated.
Wan 2.7 Pro
- + Natural integration of the capybara's paws on the steering wheel.
- + The passenger is correctly positioned further back in the vehicle layout.
- + Vibrant and clear background street lights that enhance the New York setting.
- − The passenger is still physically in the front seat area despite being deeper in the frame, failing the 'back seat' instruction.
- − The capybara's hands look somewhat more primate-like than capybara-like.
Verdict: Both models struggle to place the businesswoman specifically in the back seat, placing her in the front passenger seat instead. Wan 2.7 Pro offers slightly better compositional depth and more realistic interaction with the steering wheel, while Wan 2.7 has superior fur detailing on the capybara.
The Halloween Invitation
Text-to-Image
“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
Wan 2.7
- + Excellent text rendering with no spelling errors in any section.
- + The border design with thorns and skulls is very detailed and fits the gothic theme well.
- + Clearer, more readable event details at the bottom.
- − The transition between the central illustration and the cream parchment is very abrupt.
- − The stars/points around the event details are a bit simplistic.
- − Central jack-o-lantern has more cartoonish lighting compared to the surrounding scene.
Wan 2.7 Pro
- + Slightly more integrated parchment texture and color palette.
- + Atmospheric use of negative space in the border design.
- + Stronger aesthetic cohesion between the central illustration and the outer frame.
- − One bat at the bottom is just a generic shape without wings.
- − The 'Location' text is slightly less crisp than in Wan 2.7's output.
- − Text alignment in the banner is slightly off-center.
Verdict: Both Wan 2.7 and Wan 2.7 Pro handled this complex text-heavy prompt exceptionally well, following all instructions including specific dates and slogans. Wan 2.7 is slightly better for a finished product due to the superior clarity and alignment of its typography, whereas Wan 2.7 Pro has a more unified aesthetic but suffers from minor inconsistencies in decorative elements.
Explore each model
Alibaba's Wan 2.7 Pro image generation and editing model with higher-quality outputs and support for 4K image generation