Qwen Image 2512 vs Z-Image Turbo
Head-to-head across 8 challenges
Qwen Image 2512: 25.0% win rate
Ties: 0.0%
Z-Image Turbo: 75.0% win rate
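The percentages above come down to simple counting. The page does not state its formula, so the following is a minimal sketch that assumes each rate is the share of challenges with that outcome. The function name `win_rates` and the 2/0/6 split are hypothetical, chosen only because they reproduce the reported figures for 8 challenges.

```python
def win_rates(wins_a: int, ties: int, wins_b: int) -> dict[str, float]:
    """Return each outcome's share of all challenges, as a percentage."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("at least one challenge is required")
    return {
        "a": 100.0 * wins_a / total,
        "ties": 100.0 * ties / total,
        "b": 100.0 * wins_b / total,
    }


# Hypothetical split inferred from the reported percentages over 8 challenges:
# 2 wins for the first model, 0 ties, 6 wins for the second model.
rates = win_rates(wins_a=2, ties=0, wins_b=6)
print(rates)  # {'a': 25.0, 'ties': 0.0, 'b': 75.0}
```

Because ties are counted in the denominator, the three percentages always sum to 100.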
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Qwen Image 2512
- + Excellent photographic realism with high-quality textures on the book and table.
- + Accurate and complex reflections within the glass cube.
- + Strong adherence to the lighting requested with realistic highlights.
- − Internal reflections are so strong that the plant 'behind' the cube is hard to distinguish from reflections.
Z-Image Turbo
- + Perfectly follows all spatial instructions, including the plant being visible through the glass.
- + Handles the transparency of the blue sphere nicely.
- + Clean and balanced composition with realistic depth of field.
- − The glass cube edges appear slightly distorted where they meet the reflective bottom surface.
- − The blue sphere has a slightly floating or disconnected look at its base.
Verdict: Both models followed the prompt exceptionally well. Qwen Image 2512 produces a superior photographic finish with beautiful texture and light, though its reflections are very busy. Z-Image Turbo provides a clearer view of the plant behind the glass as requested, making it slightly better for spatial clarity, but Qwen Image 2512 wins on overall visual quality and professional aesthetic.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Qwen Image 2512
- + Excellent photographic quality with realistic skin textures and lighting.
- + Captured the requested 'candid street photo' look with imperfect framing successfully.
- + Accurate motion blur on the passing cars and beautiful wet pavement reflections.
- − The subject is looking directly at the camera, making it feel less like a 'candid' moment and more like a portrait.
- − The bicycle's rear structure is slightly messy in terms of mechanical logic.
Z-Image Turbo
- + The action of repairing or handling the bike feels more candid and natural.
- + Included the requested light rain streaks and wet ground reflections.
- − Failed to include motion blur on the cars as requested.
- − The man's feet are poorly rendered, merging into the pavement and pedals.
- − Visible AI artifacts on the car wheels and the bicycle chain area.
Verdict: Qwen Image 2512 produces a much more convincing and high-quality photograph, successfully capturing the cinematic lighting, motion blur, and skin textures requested. Z-Image Turbo captures the candid atmosphere well, but fails on the motion blur requirement and contains several significant anatomical and structural artifacts.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Qwen Image 2512
- + Excellent professional layout with balanced use of color and white space.
- + Logical sectioning that mimics a real-world multi-page or large-format menu.
- + High-quality, appetizing food photography with consistent lighting.
- − Text rendering is mostly gibberish, though it maintains the look of a font.
- − The header 'RESSAGRENT' is a non-existent word.
Z-Image Turbo
- + Successfully renders legible English words like 'APPETIZERS' and 'PIZZA'.
- + Clean, high-contrast typography that is easy to read.
- + Accurate grid layout following the prompt's request for sections.
- − Garbled secondary text such as 'PIZZA MANS' and 'SE TIIION'.
- − The composition feels a bit cramped with the large central text block.
- − The food images are slightly more generic than the professionally styled ones from Qwen Image 2512.
Verdict: Qwen Image 2512 produces a more aesthetically pleasing and professional-looking menu layout with superior food photography, though its text is mostly nonsensical. Z-Image Turbo renders its primary headers more legibly but has garbled secondary text and a slightly less sophisticated design balance. Qwen Image 2512 is preferred for its better overall design and visual appeal, which feel closer to a real high-end casual dining menu.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Qwen Image 2512
- + Excellent chalk texture with realistic smudges and pressure variations.
- + Elegant cursive handwriting that matches the requested aesthetic.
- + Perfect spelling on all menu items including complex words like Risotto.
- − The word 'Risotto' is slightly separated from the first line, though it remains legible.
Z-Image Turbo
- + Natural and realistic handwriting style.
- + Good layout and spacing of text on the board.
- + Excellent rendering of the date and title.
- − Spelling error in the first item ('Mustroom' instead of 'Mushroom').
- − The handwriting is a bit more 'neat' and lacks the artistic cursive flair requested for the title.
Verdict: Qwen Image 2512 is the clear winner because it followed the stylistic request for elegant cursive and maintained perfect spelling throughout. Z-Image Turbo produced a high-quality image but failed on the spelling of 'Mushroom' and lacked the specific 'chalk texture' depth seen in the smudges and strokes of the Qwen image.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
Qwen Image 2512
- + Excellent photorealistic texture on the capybara's fur.
- + Accurately depicts the businesswoman looking at her phone with a bored expression.
- + Strong cinematic composition through the windshield.
- − The capybara's paws look more like primate hands, which is biologically incorrect for the species.
Z-Image Turbo
- + Good depth of field and color saturation.
- + Logical placement of the steering wheel and seatbelt for a driving scene.
- + Accurate capybara facial anatomy.
- − The capybara is not holding the steering wheel with both paws as requested.
- − The businesswoman's face is slightly blurry and lacks the 'bored' detail requested.
Verdict: Qwen Image 2512 followed the prompt's specific requirements more closely, particularly the businesswoman's facial expression and the requirement for both paws on the wheel. While Z-Image Turbo has a cleaner interior layout, it fails to execute the specific driving pose and the character's emotional state.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Qwen Image 2512
- + Excellent adherence to the flag icon request with a correct Japanese flag.
- + Highly detailed and appealing 3D modeling of the sushi, wasabi, and ginger.
- + Perfectly executed top-down isometric perspective on a diorama base.
- − The text 'SUSHI' is slightly off-center compared to 'JAPAN'.
Z-Image Turbo
- + Clean, minimalist aesthetic that follows the 'soft refined textures' prompt well.
- + Very clear and bold typography.
- + Accurate isometric perspective.
- − Included the wrong flag (Chinese flag instead of Japanese flag).
- − The sushi composition is very basic compared to the requested variety.
- − The diorama base is very simple and lacks the 'miniature scene' feel.
Verdict: Qwen Image 2512 followed the instructions closely, including the requested text and the correct national flag, with only a slightly off-center 'SUSHI' label. Z-Image Turbo failed a key cultural detail of the prompt by including the flag of China in a Japan-themed image, and it provided a much simpler scene than requested.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Qwen Image 2512
- + Excellent fur texture and detail across all four animals
- + Clearly defined God rays and sunset lighting
- + Sophisticated composition with high-quality rendering of flora
- − Static, posed composition fails to capture the 'playfully chasing' and 'tumbling' aspect of the prompt
- − Anatomical oddity where the kitten's paw blends into the puppy's leg
Z-Image Turbo
- + More dynamic and active composition that matches the 'playfully chasing' and 'tumbling' prompt better
- + Good adherence to the specific requested animal types in action poses
- + Effective use of dew sparkles and golden hour lighting
- − Lower resolution and more painterly/AI-blurred backgrounds compared to Qwen Image 2512
- − The kitten's facial features and eyes are slightly less realistic than the other animals
Verdict: While Qwen Image 2512 has superior technical rendering and ultra-fine fur detail, it produces a static portrait that ignores the requested action. Z-Image Turbo captures the energy of the 'tumbling' and 'chasing' prompt much more effectively, even if the overall image clarity is slightly lower. Z-Image Turbo is the preferred choice for following the spirit and physical description of the prompt more holistically.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Qwen Image 2512
- + Excellent typography with correct accents and stylish cursive
- + Rich, detailed woodcut engraving style adds to the vintage feel
- + Highly effective use of the requested banner and steam elements
- − Less 'minimalist' than Z-Image Turbo despite the prompt keyword
- − Steam is slightly asymmetrical and bulky
Z-Image Turbo
- + Strong adherence to the 'minimalist' part of the prompt
- + Clean vector lines suitable for a modern-retro logo
- + Accurate text rendering
- − Steam effect is very underwhelming and small
- − Lacks the 'vintage texture' requested beyond a flat cream background
Verdict: Qwen Image 2512 produces a much more professional and aesthetically pleasing design that captures the 'vintage' and 'warm' atmosphere requested. While Z-Image Turbo captures the 'minimalist' keyword better, it lacks the sophistication and rich texture that makes Qwen Image 2512's interpretation look like a high-end restaurant logo.
Qwen Image 2512
Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.
Z-Image Turbo
Tongyi-MAI's 6-billion-parameter distilled text-to-image model optimized for speed, achieving high-quality generation in 8 steps or fewer with support for bilingual text rendering.