GPT Image 1 Mini vs Qwen Image 2512

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

GPT Image 1 Mini

Qwen Image 2512

AI Judge Analysis

GPT Image 1 Mini

+ Excellent photorealistic texture on the book and sphere
+ Sophisticated soft lighting that feels natural
+ Very clean composition and high resolution

− The sphere appears to be floating inside the cube rather than resting on the bottom
− The plant is behind the cube but doesn't show significant refraction or visibility through the glass panes themselves

Qwen Image 2512

+ Better adherence to the 'visible through the glass' instruction with clear refraction of the plant
+ The sphere correctly rests on the bottom surface of the cube
+ Includes a clear window in the background to justify the lighting

− The cube's physics are slightly confusing, appearing more like a mirror box in the reflections
− Visual quality is slightly lower and grainier than GPT Image 1 Mini's

Verdict: Both models rendered the requested scene, but with different trade-offs. GPT Image 1 Mini produced a more aesthetically pleasing, high-quality image with better textures, though the sphere appears to be levitating and the plant is barely visible through the glass. Qwen Image 2512 followed the spatial instructions more literally, showing the plant through the glass and placing the sphere on the floor of the cube, but the rendering is less polished.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

GPT Image 1 Mini

Qwen Image 2512

GPT Image 1 Mini: 100% wins · Ties: 0% · Qwen Image 2512: 0% wins

AI Judge Analysis

GPT Image 1 Mini

+ Excellent skin texture and realistic facial details
+ Strong atmosphere with convincing water droplets and wet pavement
+ Captures the 'repairing' action more authentically

− Anatomical issues with the man's lower body and hand merging with the wheel
− The bicycle geometry is distorted and physically impossible in several places

Qwen Image 2512

+ Stronger adherence to the 'motion blur from passing cars' prompt
+ Bicycle structure is more coherent and recognizable
+ Better execution of the 'shallow depth of field' and bokeh

− The man is posing for the camera rather than being 'candid' or 'repairing' the bike
− Hands are poorly rendered with merged fingers
− The bicycle seat and its attachment are physically nonsensical

Verdict: Both models struggle with the complex anatomy of the bicycle and the man's interaction with it. GPT Image 1 Mini has superior skin textures and feels more like a candid moment of repair. Qwen Image 2512 better captures the specific technical requirements for motion blur and background depth, but it fails the 'candid' and 'repairing' aspects by having the subject look directly at the lens. GPT Image 1 Mini is slightly preferred for its more convincing cinematic atmosphere and closer adherence to the subject matter.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

GPT Image 1 Mini

Qwen Image 2512

AI Judge Analysis

GPT Image 1 Mini

+ Excellent chalk texture within the letters
+ Perfect spelling throughout all menu items
+ Clean and legible layout

− Failed the request for 'elegant cursive' in the title, using block letters instead
− The handwriting looks somewhat like a digital font due to being too uniform

Qwen Image 2512

+ Successfully followed the instruction for elegant cursive title and cursive menu items
+ Greater variety in line weight and stroke mimics real chalk better
+ Included a more realistic café background context

− Spelling error in 'Risitto' instead of 'Risotto'
− The 'T' in 'TODAY'S' is slightly disconnected and stylized oddly

Verdict: Qwen Image 2512 followed the stylistic instructions much more closely, providing the requested elegant cursive handwriting, whereas GPT Image 1 Mini used block letters for the title. While GPT Image 1 Mini had perfect spelling, Qwen Image 2512 captured the authentic 'handwritten' aesthetic and café atmosphere better despite misspelling 'Risotto'.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

GPT Image 1 Mini

Qwen Image 2512

AI Judge Analysis

GPT Image 1 Mini

+ Excellent shallow depth of field and soft lighting that creates a cinematic nocturnal atmosphere.
+ Seamless integration of the capybara's fur with the clothing and hat.
+ The woman's expression perfectly captures the 'bored' instruction from the prompt.

− Only one paw is clearly visible on the steering wheel, whereas the prompt asked for both.
− The composition is quite tight, making it slightly harder to see the taxi exterior context.

Qwen Image 2512

+ Successfully places both paws on the steering wheel as requested.
+ The wide-angle perspective through the windshield provides a better view of the taxi's identity and the street.
+ High level of detail on the capybara's facial features and the driver cap.

− The woman's expression looks more like an exaggerated frown than a 'completely normal, bored' look.
− The paws have a slightly distorted, hand-like appearance that looks less natural for a capybara.
− The lighting is a bit flat compared to the moody, realistic shadows in the other image.

Verdict: GPT Image 1 Mini produces a much more photorealistic and atmospheric image with superior lighting and a more accurate human expression. While Qwen Image 2512 followed the technical instruction of having both paws on the wheel, GPT Image 1 Mini's overall execution feels more like a professional film still and better captures the 'normalcy' of the bizarre situation.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

GPT Image 1 Mini

Qwen Image 2512

GPT Image 1 Mini: 100% wins · Ties: 0% · Qwen Image 2512: 0% wins

AI Judge Analysis

GPT Image 1 Mini

+ Excellent sense of motion and 'tumbling' as requested by the prompt.
+ Superior lighting effects with soft god rays and naturalistic warm tones.
+ Dynamic composition that creates a playful, energetic scene.

− The cat's anatomy is slightly awkward in its mid-air pose.
− Butterflies are a bit simplistic in design.

Qwen Image 2512

+ Extremely detailed fur textures and sharp ocular highlights.
+ Excellent butterfly anatomical detail and varied positioning.
+ Perfect centered symmetry and high-resolution clarity.

− Static composition that does not capture the 'chasing' or 'tumbling' action requested.
− The puppy's paws overlap the other animals in a way that looks slightly AI-mushed and unrealistic.
− The fox's facial structure looks a bit too much like a domestic dog.

Verdict: GPT Image 1 Mini captured the spirit of the prompt much better by depicting the animals in motion ('tumbling' and 'chasing') within a beautifully lit atmosphere. While Qwen Image 2512 has higher technical sharpness and better butterfly details, its static 'posed' composition ignores the active verbs in the prompt, resulting in a more generic family-portrait style image.
