OpenAI's state-of-the-art image generation model with better instruction following and adherence to prompts
Settled by community votes across 6 shared challenges, with an AI judge weighing in on each.
GPT Image 1.5
#7 of 44 in Text-to-Image
Wan 2.7 Pro
#29 of 44 in Text-to-Image
Where the votes landed
GPT Image 1.5: 100.0% win rate
Ties: 0.0%
Wan 2.7 Pro: 0.0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Magic Burger Explosion: Fiery Photorealism Challenge
Text-to-Image: “Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”
AI Judge Analysis
GPT Image 1.5
- + Excellent typography with a fiery, glowing effect that perfectly matches the prompt.
- + Superior textures on the meat and bun create a highly realistic and appetizing look.
- + Dynamic composition with a strong sense of heat and energy.
- − The 'exploded' effect is slightly more compressed than Wan 2.7 Pro's.
- − The bottom bun looks a bit over-charred.
Wan 2.7 Pro
- + Clearer separation of individual burger components in the 'exploded' view.
- + Clean, minimalist background that makes the ingredients pop.
- − Failed to include 'LIMITED TIME ONLY' and '€6.99' in a starburst as requested.
- − The 'MAGIC BURGER' text is plain orange and lacks the requested fiery, glowing effect.
- − The lighting on the ingredients feels a bit artificial compared to the environment.
Verdict: GPT Image 1.5 is the clear winner as it followed every part of the complex text prompt, including the secondary taglines and the specific starburst price element. While Wan 2.7 Pro provided a nice exploded view of the ingredients, it failed to render most of the required text and lacked the gritty, photorealistic texture of the first image.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
GPT Image 1.5
- + Excellent chalk texture and smudging effects for a realistic feel.
- + Perfect spelling and completion of the unfinished prompt text.
- + Consistent and authentic handwriting style that looks truly manual.
- − Limited background context compared to the cozy café requested.
Wan 2.7 Pro
- + Beautiful composition including the café interior and framing.
- + Clean, legible text rendering with high contrast.
- − Text appears more like a digital font than natural chalk handwriting.
- − Severe repetition error with 'Grilled Octopus' line appearing twice.
- − Failed to produce the 'elegant cursive' style for the title.
Verdict: GPT Image 1.5 is the clear winner as it followed the technical requirements of the handwriting prompt perfectly, including the realistic chalk texture and the correct completion of the final menu item. Wan 2.7 Pro produced a more aesthetically pleasing background but failed significantly on the text by repeating a line and using characters that look like a digital font rather than natural handwriting.
Pose & Character Mashup
Editing: “Use Image 1 as the exact pose reference and Image 2 as the character reference. Recreate the person/character from Image 2 in the exact dynamic pose and body position from Image 1. Keep the exact face, hair, clothing style/details, and expression from Image 2. Match the lighting and environment of Image 1. The final image must show the character from Image 2 performing the precise action/pose from Image 1 with perfect anatomy and natural integration.”
AI Judge Analysis
GPT Image 1.5
- + Successfully integrated the man from Image 2 into the scene.
- + Maintained character attributes including sunglasses, scarf, and specific clothing details.
- + Accurately replicated the yellow background and red ottoman from Image 1.
- − Failed to match the extreme 'bent over' torso angle and head tilt of the reference pose.
- − The anatomy of the feet is significantly distorted with too many toes.
- − The right hand contains an extra finger and lacks anatomical consistency.
Wan 2.7 Pro
- + Maintained the exact original image composition and pose.
- − Completely failed the edit instruction by returning a slightly blurred version of Image 1.
- − Failed to include any elements from the character reference (Image 2).
Verdict: GPT Image 1.5 successfully followed the complex instruction of combining a character and a pose, though it struggled with the fine details of hands, feet, and the extreme flexibility of the original pose. Wan 2.7 Pro failed the task entirely, providing an output that was essentially a duplicate of the pose reference image with no character changes.
Outfit Transfer Challenge
Editing: “Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”
AI Judge Analysis
GPT Image 1.5
- + Excellent transfer of the specific outfit components including the coat, scarf, glasses, and watch.
- + Maintains the subject's unique vitiligo patterns and facial features accurately.
- + Perfectly aligns the pose and lighting of the clothing with the original subject.
- − Crop is tighter than the original source image, losing some background context.
- − The skin under the sunglasses doesn't perfectly match the original eye area's texture.
Wan 2.7 Pro
- + Preserves the full composition and framing of the original scene.
- + Maintains the full body pose including the placement of the feet on the sand.
- − Completely failed to use the correct outfit from Image 2, generating a generic gold-patterned blazer instead.
- − Artifacts on the hands such as merging fingers and inconsistent skin patterns.
- − The face is distorted and loses the specific likeness of the person in the source image.
Verdict: GPT Image 1.5 is the clear winner as it followed the complex instruction to transfer a specific outfit from one image to another with high fidelity. Wan 2.7 Pro completely ignored the visual reference of the outfit, providing a random suit instead, and also suffered from significant anatomical artifacts and loss of subject likeness.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
GPT Image 1.5
- + Excellent texture on the capybara's fur and the taxi dashboard.
- + Perfect adherence to the requested composition with the passenger in the background.
- + Highly realistic lighting and cinematic bokeh.
- − The capybara's paws look slightly more like badger or otter paws than true capybara feet.
Wan 2.7 Pro
- + Natural-looking hands on the steering wheel.
- + Clear, high-resolution rendering of the urban background.
- − Failed to place the passenger in the back seat, putting her in the passenger seat instead.
- − The capybara's head is not naturally attached to the body, appearing like a mask or a floating element.
- − The composition feels less like a real interior shot due to the wide side-angle.
Verdict: GPT Image 1.5 is the clear winner as it correctly followed the spatial instructions to place the businesswoman in the backseat, creating a more cohesive and humorous scene. Wan 2.7 Pro failed on the composition by placing the passenger in the front seat and produced a disjointed image where the capybara's head does not convincingly align with the driver's body.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
GPT Image 1.5
- + Excellent adherence to the 'vintage dark parchment' aesthetic.
- + Text is perfectly rendered and integrated into the design.
- + Cinematic lighting and moody atmosphere are very effective.
- − The thorns and webs border is a bit messy and cluttered.
Wan 2.7 Pro
- + Clean, illustrative style with good composition.
- + Accurate text rendering for all requested fields.
- + Creative border elements like skulls and roses.
- − Lacks the requested 'dark parchment' and vintage gothic feel, appearing more like a modern digital illustration.
- − The central pumpkin has a strange candle-nose artifact.
Verdict: GPT Image 1.5 perfectly captures the requested vintage gothic mood and cinematic lighting, creating a cohesive and professional-looking invitation. While Wan 2.7 Pro followed all text instructions correctly, its bright, clean illustrative style misses the 'dark parchment' and 'spooky' atmosphere requested in the prompt.
Explore each model
Alibaba's Wan 2.7 Pro image generation and editing model with higher-quality outputs and support for 4K image generation