Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.
Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.
Qwen Image 2512
#26 of 44 in Text-to-Image
Recraft V4 Pro
#18 of 44 in Text-to-Image
Where the votes landed
Qwen Image 2512: 0.0% win rate
Ties: 0.0%
Recraft V4 Pro: 100.0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Qwen Image 2512
- + Excellent depiction of the plant seen through the glass surfaces
- + Natural lighting and realistic reflections on the wood
- + Accurate cube geometry
- − The blue sphere looks like a matte rubber ball rather than glass, which clashes slightly with the luxury aesthetic
Recraft V4 Pro
- + Beautiful material rendering of the glass sphere and solid crystal cube
- + High level of detail in the book texture and wood grain
- + Perfect adherence to all spatial instructions
- − The plant sits more to the side of the cube than 'behind' it, unlike in Qwen Image 2512's image
Verdict: Both models followed the prompt perfectly. Qwen Image 2512 did a better job of showing the plant 'through' the glass as requested, while Recraft V4 Pro produced a more aesthetically pleasing image with superior material textures, especially on the sphere and the thick glass cube.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Qwen Image 2512
- + Excellent skin texture and facial details
- + Accurate depiction of a Japanese man
- + Effective shallow depth of field with realistic bokeh
- − The man is posing/looking at the camera rather than 'repairing' the bike
- − Background cars lack the requested motion blur
- − The bike geometry has some AI merging artifacts at the seat and handlebars
Recraft V4 Pro
- + Successfully captures the requested 'candid' feel and the action of repairing
- + Superior atmosphere with visible rain and pavement reflections
- + Better composition, with the requested 'imperfect framing'
- − Anatomy of the bike is broken (front wheel and handlebars are disconnected/misaligned)
- − Shoes and hands have some structural issues
- − The yellow cab in the background feels less specifically Japanese than Qwen Image 2512's street scene
Verdict: Qwen Image 2512 produces a much more realistic portrait with incredible skin texture and clear cultural accuracy, but it fails the specific 'candid' and 'repairing' aspects of the prompt by having the subject pose for the camera. Recraft V4 Pro captures the cinematic rain atmosphere, the candid action, and the framing perfectly, but suffers from significant AI artifacts in the bicycle's geometry. Qwen Image 2512 is the preferred image due to its high photographic fidelity and lack of glaring structural errors.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Qwen Image 2512
- + Stronger visual grid of photos
- + Vibrant color accents throughout the layout
- + Excellent professional vertical presentation
- − Text is nonsensical and contains numerous spelling errors
- − Section titles like 'Appetiizizers' and '/Means' are broken
Recraft V4 Pro
- + Perfect legibility with accurate spelling
- + Clean minimalist design with high-quality photography
- + Clear logical grouping of requested sections
- − Layout is slightly more basic/generic than Qwen Image 2512's
- − The 'grid' format is separated by headers rather than a unified photo block
Verdict: Recraft V4 Pro is significantly more functional as it provides a real, usable menu with perfectly rendered text and high-quality food photography. Qwen Image 2512 has a more stylish and dense layout with vibrant accents, but it fails completely on text generation, producing a design that is visually appealing but unreadable.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Qwen Image 2512
- + Excellent adherence to the 'elegant cursive' requirement for the title.
- + Highly realistic chalk texture with dusty smudges and varying stroke pressure.
- + Included all requested menu items with very high legibility.
- − Spelling error in 'Risotto' (spelled as 'Risitto').
Recraft V4 Pro
- + Perfect spelling across all menu items.
- + Consistent visual style throughout the entire board.
- + The background environment (café entrance) is coherent and adds good context.
- − Failed to provide 'elegant cursive' for the title, using a rounded print style instead.
- − Text appears more like a digital font overlay than natural chalk handwriting.
- − Text alignment is a bit rigid and lacks the requested natural variations in slant.
Verdict: Qwen Image 2512 captured the requested aesthetic much better, providing beautiful cursive handwriting and authentic chalk textures, even though it suffered from a minor spelling mistake. Recraft V4 Pro produced cleaner text and perfect spelling but failed to follow the stylistic instructions for cursive and natural handwriting, resulting in a look that feels digital and flat.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
Qwen Image 2512
- + Excellent character expressions, both for the capybara and the businesswoman.
- + High-quality texture on the capybara's fur and the leather steering wheel.
- + Clear, cinematic front-on composition that feels like a film still.
- − The capybara's paws look more like primate hands than rodent paws.
- − The character in the back seat appears more sad/pouty than 'bored'.
Recraft V4 Pro
- + Stronger sense of atmosphere with the rain on the window and city lights.
- + The capybara's paws and silhouette are more anatomically grounded.
- + Good profile perspective showing the relationship between driver and passenger.
- − The passenger's phone is glowing but doesn't cast light on her face naturally.
- − The composition is a bit tight on the left side, cutting off the front of the car.
Verdict: Both models followed the prompt exceptionally well, capturing the surreal scenario with high fidelity. Qwen Image 2512 wins slightly on character expression and comedic timing, while Recraft V4 Pro excels at creating a moody, atmospheric New York environment with realistic weather effects.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Qwen Image 2512
- + Excellent typography with a cohesive 'cartoon' aesthetic that fits the prompt
- + Superior 3D diorama base that feels solid and well-modeled
- + Vibrant colors and high-quality textures on the food items
- − The flag icon is slightly merged with the text block
Recraft V4 Pro
- + Features realistic materials, especially in the rice and fish textures
- + Follows the layout and centering instructions perfectly
- + Clean, minimalist aesthetic that feels professional
- − The 'diorama base' looks like a fuzzy salt/ice block rather than a structured base
- − The text is a bit plain compared to the 'large bold' request
- − The circular plate is slightly skewed in perspective
Verdict: Qwen Image 2512 captures the 'miniature 3D cartoon' and 'diorama' aspects of the prompt much more effectively, creating a cohesive piece of digital art. Recraft V4 Pro succeeds on material realism (PBR), but its base is less defined and the typography is less impactful.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Qwen Image 2512
- + Excellent fur texture and lighting on the animals.
- + Strong center-focused composition with high visual appeal.
- + Very vibrant colors and clear 'god rays' effects as requested.
- − The animals are static/posed rather than 'playfully chasing and tumbling'.
- − The kitten has an anatomical error with a third paw appearing between it and the bunny.
Recraft V4 Pro
- + Successfully captures the action of 'tumbling' and 'chasing' described in the prompt.
- + More realistic interaction with the environment and better meadow depth.
- + Better adherence to the dynamic 'playful' aspect of the scene.
- − The fox looks slightly disjointed from the grass.
- − The lighting is a bit more muted than Qwen Image 2512's rendering of the requested 'warm golden sunrise'.
Verdict: Qwen Image 2512 produces a more 'perfect' and vibrant portrait with superior fur detail, but it fails to capture the requested action, showing the animals sitting still. Recraft V4 Pro better follows the prompt's instruction for movement and tumbling, creating a more dynamic and natural-looking scene despite slightly less saturated colors.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Qwen Image 2512
- + Excellent typography rendering with consistent weight and style
- + Rich, warm tonal palette that perfectly matches the 'vintage' requirement
- + Detailed and balanced illustration style with high visual appeal
- − Illustration is quite ornate, leaning more toward a classic emblem than the requested 'minimalist' style
Recraft V4 Pro
- + Successfully captures a more minimalist and clean design aesthetic
- + Accurate text rendering and placement of elements
- + Closer to a modern vector logo style
- − The steam is very minimal and less impactful than in Qwen Image 2512's logo
- − The compositional gap between the cloche and the text is slightly awkward
- − Less visual depth and texture than requested
Verdict: Qwen Image 2512 produces a much more visually rich and professional-looking vintage emblem that perfectly captures the color palette and texture requested, though it is less 'minimalist' than the prompt suggested. Recraft V4 Pro follows the minimalist constraint better but lacks the artistic polish and warmth that makes the Qwen output stand out as a restaurant logo. Qwen Image 2512 is the preferred choice for its superior aesthetic execution and cohesive design.
Explore each model
Recraft's latest image generation model at ~2048px resolution, with stronger composition, refined lighting, and realistic materials for print-ready and large-scale work.