GPT Image 1 Mini vs Qwen Image 2512
Head-to-head across 5 challenges
GPT Image 1 Mini: 100.0% win rate
Ties: 0.0%
Qwen Image 2512: 0.0% win rate
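The page does not say how these percentages are derived. A minimal sketch, assuming each of the 5 challenges produces exactly one outcome label and each rate is that label's share of all challenges, could look like the following. The function name `summarize` and the labels `"a"`, `"tie"`, and `"b"` are hypothetical, and the outcome list is only the one implied by the reported 100.0% / 0.0% / 0.0% figures, not data taken from the judge.

```python
from collections import Counter


def summarize(outcomes):
    """Return each outcome label's share of all challenges, in percent.

    `outcomes` holds one label per challenge: "a" (first model wins),
    "b" (second model wins), or "tie". These labels are illustrative only.
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("need at least one challenge outcome")
    counts = Counter(outcomes)  # labels that never occur count as 0
    return {label: round(100.0 * counts[label] / total, 1)
            for label in ("a", "tie", "b")}


# Assumed outcomes: the only 5-challenge split consistent with the header.
outcomes = ["a"] * 5
print(summarize(outcomes))  # {'a': 100.0, 'tie': 0.0, 'b': 0.0}
```

Under this assumption a single tie or loss in 5 challenges would move a rate by 20 percentage points, so the headline numbers are coarse and best read alongside the per-challenge verdicts.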
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
GPT Image 1 Mini
- + Excellent photorealistic texture on the book and sphere
- + Sophisticated soft lighting that feels natural
- + Very clean composition and high resolution
- − The sphere appears to be floating inside the cube rather than resting on the bottom
- − The plant is behind the cube but is not clearly visible or refracted through the glass panes
Qwen Image 2512
- + Better adherence to the 'visible through the glass' instruction with clear refraction of the plant
- + The sphere correctly rests on the bottom surface of the cube
- + Includes a clear window in the background to justify the lighting
- − The cube's optics are slightly confusing; its reflections make it look more like a mirror box
- − Visual quality is slightly lower and grainier than GPT Image 1 Mini's
Verdict: Both models followed the prompt closely, with different trade-offs. GPT Image 1 Mini produced a more aesthetically pleasing, high-quality image with better textures, though the sphere appears to be levitating. Qwen Image 2512 followed the spatial instructions more literally, showing the plant through the glass and placing the sphere on the floor of the cube, but the rendering is less polished.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
GPT Image 1 Mini
- + Excellent skin texture and realistic facial details
- + Strong atmosphere with convincing water droplets and wet pavement
- + Captures the 'repairing' action more authentically
- − Anatomical issues with the man's lower body and hand merging with the wheel
- − The bicycle geometry is distorted and physically impossible in several places
Qwen Image 2512
- + Stronger adherence to the 'motion blur from passing cars' prompt
- + Bicycle structure is more coherent and recognizable
- + Better execution of the 'shallow depth of field' and bokeh
- − The man is posing for the camera rather than being 'candid' or 'repairing' the bike
- − Hands are poorly rendered with merged fingers
- − The bicycle seat and its attachment are physically nonsensical
Verdict: Both models struggle with the complex anatomy of the bicycle and the man's interaction with it. While GPT Image 1 Mini has superior skin textures and feels more like a candid moment of repair, Qwen Image 2512 better captures the specific technical requirements for motion blur and background depth, though it fails the 'candid' and 'repairing' aspect by having the subject look directly at the lens. GPT Image 1 Mini is slightly preferred for its more convincing cinematic atmosphere and subject matter adherence.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
GPT Image 1 Mini
- + Excellent chalk texture within the letters
- + Perfect spelling throughout all menu items
- + Clean and legible layout
- − Failed the request for 'elegant cursive' in the title, using block letters instead
- − The handwriting looks somewhat like a digital font due to being too uniform
Qwen Image 2512
- + Successfully followed the instruction for elegant cursive title and cursive menu items
- + Greater variety in line weight and stroke mimics real chalk better
- + Included a more realistic café background context
- − Spelling error: 'Risitto' instead of 'Risotto'
- − The 'T' in 'TODAY'S' is slightly disconnected and stylized oddly
Verdict: Qwen Image 2512 followed the stylistic instructions much more closely, providing the requested elegant cursive handwriting, whereas GPT Image 1 Mini used block letters for the title. While GPT Image 1 Mini had perfect spelling, Qwen Image 2512 captured the authentic 'handwritten' aesthetic and café atmosphere better despite a minor misspelling of 'Risotto'.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
GPT Image 1 Mini
- + Excellent shallow depth of field and soft lighting that creates a cinematic nocturnal atmosphere.
- + Seamless integration of the capybara's fur with the clothing and hat.
- + The woman's expression perfectly captures the 'bored' instruction from the prompt.
- − Only one paw is clearly visible on the steering wheel, whereas the prompt asked for both.
- − The composition is quite tight, making it slightly harder to see the taxi exterior context.
Qwen Image 2512
- + Successfully places both paws on the steering wheel as requested.
- + The wide-angle perspective through the windshield provides a better view of the taxi's identity and the street.
- + High level of detail on the capybara's facial features and the driver cap.
- − The woman's expression looks more like an exaggerated frown than a 'completely normal, bored' look.
- − The paws have a slightly distorted, hand-like appearance that looks less natural for a capybara.
- − The lighting is a bit flat compared to the moody, realistic shadows in GPT Image 1 Mini's image.
Verdict: GPT Image 1 Mini produces a much more photorealistic and atmospheric image with superior lighting and a more accurate human expression. While Qwen Image 2512 followed the technical instruction of having both paws on the wheel, the overall execution in GPT Image 1 Mini feels more like a professional film still and captures the 'normalcy' of the bizarre situation better.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
GPT Image 1 Mini
- + Excellent sense of motion and 'tumbling' as requested by the prompt.
- + Superior lighting effects with soft god rays and naturalistic warm tones.
- + Dynamic composition that creates a playful, energetic scene.
- − The cat's anatomy is slightly awkward in its mid-air pose.
- − Butterflies are a bit simplistic in design.
Qwen Image 2512
- + Extremely detailed fur textures and sharp ocular highlights.
- + Excellent butterfly anatomical detail and varied positioning.
- + Perfect centered symmetry and high-resolution clarity.
- − Static composition that does not capture the 'chasing' or 'tumbling' action requested.
- − The puppy's paws overlap the other animals in a way that looks slightly AI-mushed and unrealistic.
- − The fox's facial structure looks a bit too much like a domestic dog.
Verdict: GPT Image 1 Mini captured the spirit of the prompt much better by depicting the animals in motion ('tumbling' and 'chasing') within a beautifully lit atmosphere. While Qwen Image 2512 has higher technical sharpness and better butterfly details, its static 'posed' composition ignores the active verbs in the prompt, resulting in a more generic family-portrait style image.
GPT Image 1 Mini
OpenAI's cost-effective image generation model for cases where image quality isn't the top priority.
Qwen Image 2512
Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.