OpenAI's state-of-the-art image generation model with better instruction following and adherence to prompts
Settled by community votes across 3 shared challenges, with an AI judge weighing in on each.
GPT Image 1.5
#7 of 44 in Text-to-Image
Qwen Image 2.0 Pro
#27 of 44 in Text-to-Image
Where the votes landed
GPT Image 1.5: 66.7% win rate
Ties: 0.0%
Qwen Image 2.0 Pro: 33.3% win rate
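The percentages above are consistent with a simple share-of-challenges calculation over the three shared challenges. The sketch below is illustrative only: the page does not state how win rates are computed, and the per-challenge winners used here are taken from the AI judge verdicts further down, on the assumption that the community votes landed the same way. The `win_rates` helper and the `"tie"` label are hypothetical names.

```python
from collections import Counter

# Assumed per-challenge winners (taken from the AI judge verdicts on this page;
# the actual community vote tallies are not shown).
outcomes = {
    "Chalkboard Menu": "Qwen Image 2.0 Pro",
    "The Capybara Taxi Driver": "GPT Image 1.5",
    "The Halloween Invitation": "GPT Image 1.5",
}


def win_rates(outcomes, models):
    """Return each model's share of challenges won, as a percentage (1 decimal)."""
    counts = Counter(outcomes.values())
    total = len(outcomes)
    rates = {m: round(100 * counts.get(m, 0) / total, 1) for m in models}
    # A challenge with no winner would be recorded under the hypothetical "tie" label.
    rates["Ties"] = round(100 * counts.get("tie", 0) / total, 1)
    return rates


rates = win_rates(outcomes, ["GPT Image 1.5", "Qwen Image 2.0 Pro"])
print(rates)
```

With two wins out of three for GPT Image 1.5 and one for Qwen Image 2.0 Pro, this reproduces the 66.7% / 0.0% / 33.3% split shown above.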
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
GPT Image 1.5
- + Excellent chalk texture with natural-looking smudges and dust.
- + Very consistent handwriting style across the whole board.
- + High fidelity to the requested text and date.
- − The handwriting appears slightly more like a digital brush than a physical person holding chalk.
- − The composition is a bit flat with no background context of the 'cozy café'.
Qwen Image 2.0 Pro
- + Stronger sense of place with the cafe background and lighting.
- + The handwriting looks very authentic to a physical person writing with chalk, including natural variations in pressure.
- + Excellent composition with text wrapping that feels realistic for a small board.
- − The lettering is slightly less sharp and crisp than GPT Image 1.5's.
- − There is a small amount of digital artifacting around some of the thinner lines.
Verdict: Both models followed the complex text prompt perfectly, including the specific date and menu items. GPT Image 1.5 excels in texture and clarity, providing a very clean, professional aesthetic, while Qwen Image 2.0 Pro provides a more convincing 'handwritten' feel and better environmental context of a café. Qwen is the winner because its handwriting feels more organic and the overall composition is more engaging.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
GPT Image 1.5
- + Excellent photorealistic texture on the capybara's fur.
- + Stronger cinematic lighting and depth of field that makes the night scene feel authentic.
- + Perfectly captures the 'bored' expression of the passenger in the background.
- − The capybara's paws look slightly more like humanoid fingers than natural capybara feet.
Qwen Image 2.0 Pro
- + Successfully places both the driver and passenger in a clear, wide composition.
- + Accurately represents the requested yellow taxi driver cap and dark jacket.
- − The passenger appears to be in the front passenger seat rather than the back seat as requested.
- − The capybara's hand on the wheel has a greenish, sickly skin tone and odd anatomy.
- − Overall lighting feels a bit flat and less 'photorealistic' compared to the competition.
Verdict: GPT Image 1.5 is the clear winner as it masterfully handles the depth of field and texture, creating a truly cinematic and photorealistic result. While Qwen Image 2.0 Pro captures the elements of the prompt, it fails the spatial requirement of putting the passenger in the back seat and suffers from jarring anatomical errors on the driver's hands.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
GPT Image 1.5
- + Excellent adherence to the 'vintage dark parchment' texture and sepia-toned gothic aesthetic.
- + Flawless text rendering for both the large title and the small event details.
- + Superior composition with a more natural integration of the thorny border and misty graveyard background.
- − The bats are somewhat simplified silhouettes compared to those from Qwen Image 2.0 Pro.
Qwen Image 2.0 Pro
- + Features detailed, characterful bats and a unique green internal glow for the pumpkin.
- + Clear typography and includes all requested text elements correctly.
- − The digital art style feels more modern and clean than the requested 'vintage' and 'dark parchment' aesthetic.
- − The scroll banner looks slightly overlaid and less integrated into the environment.
Verdict: GPT Image 1.5 followed the stylistic cues of the prompt much more effectively, producing a cohesive 'vintage gothic' piece with beautiful textures and a gloomy atmosphere. While Qwen Image 2.0 Pro handled all text perfectly and had more detailed animal figures, it felt too polished and digital, lacking the cinematic grit found in GPT Image 1.5.
Explore each model
Alibaba's Qwen Image 2.0 Pro model offering higher quality image generation with enhanced detail and accuracy