GPT Image 1.5 vs Qwen Image 2512

Head-to-head across 5 challenges

GPT Image 1.5: 81.8% win rate

Ties: 0.0%

Qwen Image 2512: 18.2% win rate
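
The page does not say how the headline win rates are derived. A minimal sketch, assuming they are pooled over all votes cast across the challenges: the `Tally` class, the `win_rates` function, and every vote count below are hypothetical, chosen only because they reproduce the published percentages (9 of 11 votes is 81.8%, 2 of 11 is 18.2%).

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Vote counts for one challenge (hypothetical structure)."""
    wins_a: int  # votes for GPT Image 1.5
    ties: int
    wins_b: int  # votes for Qwen Image 2512


def win_rates(tallies):
    """Pool votes across challenges and return (a%, ties%, b%), one decimal each."""
    a = sum(t.wins_a for t in tallies)
    ties = sum(t.ties for t in tallies)
    b = sum(t.wins_b for t in tallies)
    total = a + ties + b
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(round(100 * x / total, 1) for x in (a, ties, b))


# Hypothetical counts, NOT taken from the page: picked to match the
# per-challenge splits (75/0/25, 100/0/0, 100/0/0, 75/0/25, no votes shown).
tallies = [
    Tally(3, 0, 1),  # Geometric Composition
    Tally(2, 0, 0),  # Candid Street Photography
    Tally(1, 0, 0),  # Modern Clean Menu
    Tally(3, 0, 1),  # Adorable Baby Animals in Sunny Meadow
    Tally(0, 0, 0),  # Vintage Cafe Logo
]

rates = win_rates(tallies)
print(rates)  # (81.8, 0.0, 18.2)
```

Other vote distributions would give the same percentages, so this only illustrates the arithmetic, not the site's actual data.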


Challenge Results

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

GPT Image 1.5: 75% wins | Ties: 0% | Qwen Image 2512: 25% wins

AI Judge Analysis

GPT Image 1.5

  • + Perfect adherence to the prompt's spatial instructions.
  • + Excellent glass refraction showing the plant through the cube.
  • + High-quality texture on the red book and wooden table.
  • - The sphere is quite large relative to the cube, so it arguably does not fit the 'small' descriptor.
  • - The base of the cube looks slightly mirror-like rather than like simple glass.

Qwen Image 2512

  • + Good lighting and realistic material textures.
  • + Follows the prompt for all required elements.
  • - Significant optical inconsistency where a second blue sphere appears to be inside the glass wall.
  • - The sphere is off-center, making the composition feel less balanced.
  • - The glass has a heavy teal tint compared to the clear glass in GPT Image 1.5's image.

Verdict: GPT Image 1.5 is the clear winner as it correctly handles the transparency and refractions of the glass cube, showing the plant behind it naturally. Qwen Image 2512 suffers from a major hallucination where a duplicate blue sphere is embedded within the glass pane on the right, which ruins the realism of the scene.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

GPT Image 1.5: 100% wins | Ties: 0% | Qwen Image 2512: 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent adherence to the 'repairing' action with tools visible on the ground.
  • + Highly realistic lighting and reflections on the wet pavement.
  • + Perfectly captured the candid feel with the subject engrossed in work.
  • - The car in the background is sharp rather than showing the requested motion blur.
  • - The rain is very subtle, almost difficult to see.

Qwen Image 2512

  • + Strong bokeh and shallow depth of field as requested.
  • + Effective motion blur on the background vehicles.
  • + High facial detail and natural skin texture.
  • - The subject is posing for a portrait rather than 'repairing' the bicycle as requested.
  • - The bicycle anatomy is slightly warped, particularly the handlebars and frame connection.
  • - Minimal visual evidence of rain beyond wet ground.

Verdict: GPT Image 1.5 is the winner because it correctly depicts the 'repairing' action with a candid, storytelling atmosphere, whereas Qwen Image 2512 produces a static portrait that ignores the core activity of the prompt. While Qwen better follows instructions for motion blur, the fundamental failure to show the man working makes it a less successful interpretation of the scene.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

GPT Image 1.5: 100% wins | Ties: 0% | Qwen Image 2512: 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent text rendering with no spelling errors.
  • + Includes specific logical categories and descriptions for items.
  • + High-quality, realistic food photography that matches the text labels.
  • - The 'grid' layout for photos is a bit irregular.
  • - Image is cropped tightly at the bottom.

Qwen Image 2512

  • + Strong minimalist aesthetic with a clean grid layout.
  • + Effective use of white space and bold typography.
  • - Contains significant gibberish text and spelling errors.
  • - Combines pizza and mains into one incoherent section.
  • - The food images are repetitive and low-detail.

Verdict: GPT Image 1.5 is the clear winner as it produces a fully functional, professional-grade menu with perfect text and logically categorized items. Qwen Image 2512 follows the requested grid layout more strictly but fails significantly on text legibility and content accuracy.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

GPT Image 1.5: 75% wins | Ties: 0% | Qwen Image 2512: 25% wins

AI Judge Analysis

GPT Image 1.5

  • + Captures the 'tumbling' and 'playful' aspect of the prompt much better than Qwen Image 2512.
  • + Excellent rendering of golden hour light, god rays, and atmospheric dew sparkles.
  • + Highly expressive and varied facial expressions on each animal.
  • - Anatomical issues with the kitten, including an extra paw and a confusing orientation of limbs.
  • - The butterfly's scale is a bit large compared to the animals.

Qwen Image 2512

  • + Clean and anatomically correct subjects with very clear, high-resolution textures.
  • + Well-organized composition that ensures all four animals are clearly visible and front-facing.
  • + Beautifully detailed fur and sharp focus on the faces.
  • - The animals are posing for a portrait rather than 'playfully chasing and tumbling' as requested.
  • - The lighting and environment feel more static and less magical than the god-ray lighting in GPT Image 1.5's image.
  • - Butterflies feel somewhat pasted onto the scene rather than part of the action.

Verdict: GPT Image 1.5 does a much better job of capturing the joyful, dynamic energy of animals tumbling and chasing butterflies in a magical atmosphere, though it suffers from significant AI artifacts in the kitten's anatomy. Qwen Image 2512 produces a much cleaner, higher-quality technical image with perfect anatomy, but it ignores the action-oriented parts of the prompt in favor of a static group portrait. GPT Image 1.5 is the preferred choice for its superior interpretation of the 'wholesome vibe' and interaction requested.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

GPT Image 1.5
Qwen Image 2512

AI Judge Analysis

GPT Image 1.5

  • + Perfect adherence to all text requirements including 'Caffè Florian' and 'Est. 1720'.
  • + Clean, minimalist vector aesthetic that works well as a logo.
  • + Professional layout with a well-integrated banner.
  • - Slightly generic steam effect compared to the illustrative quality of the other model.

Qwen Image 2512

  • + Beautiful hand-drawn illustrative style with great woodcut texture.
  • + High-quality rendering of the smoke/steam.
  • + Accurate text rendering for both terms.
  • - The composition is a bit more crowded than a typical 'minimalist' logo.
  • - Small artifacts in the crossbar of the 'F' in Florian.

Verdict: Both models followed the prompt exceptionally well, producing accurate text and thematic elements. GPT Image 1.5 is preferred for its cleaner, more 'minimalist' vector logo aesthetic, whereas Qwen Image 2512 leaned more into a complex illustration style that is slightly less practical for a versatile logo.

GPT Image 1.5

OpenAI's state-of-the-art image generation model with better instruction following and adherence to prompts.

Qwen Image 2512

Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.