Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.
Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.
Qwen Image 2512
#26 of 44 in Text-to-Image
Recraft V4
#8 of 44 in Text-to-Image
Where the votes landed
Qwen Image 2512: 16.7% win rate
Ties: 0.0%
Recraft V4: 83.3% win rate
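The three figures above sum to 100%, so they appear to be shares of all votes cast, with ties counted as their own bucket. The underlying vote counts are not shown on this page. The sketch below therefore uses hypothetical counts (1 vote for Qwen Image 2512, 0 ties, 5 for Recraft V4) chosen only because they reproduce the displayed percentages; the function name `vote_shares` is also illustrative.

```python
def vote_shares(wins_a: int, ties: int, wins_b: int) -> tuple:
    """Return (share_a, share_ties, share_b) as percentages, one decimal each."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no votes recorded")
    # Each share is that bucket's count divided by all votes, ties included.
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))


# Hypothetical counts, not taken from the page: 1 / 0 / 5 out of 6 votes.
print(vote_shares(1, 0, 5))  # (16.7, 0.0, 83.3)
```

Any other split with the same 1:0:5 ratio gives the same percentages, so the displayed rates alone do not reveal how many votes were cast.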
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Qwen Image 2512
- + Excellent adherence to lighting instructions with natural soft light from the left.
- + The glass cube has realistic reflections and refractions of the background plant.
- + The composition feels very natural and photorealistic.
- − The sphere inside appears to be floating or leaning against a side wall rather than resting on the bottom.
- − The book is slightly too large for the cube it sits on.
Recraft V4
- + The sphere is more central and features complex internal reflections.
- + High level of detail on the textures of the book pages and cover wear.
- − The sphere is levitating inside the cube, which defies physics without explanation.
- − The glass of the cube looks more like solid blocks of acrylic than a hollow glass container.
- − The lighting is somewhat flat compared to the specific 'window light' request.
Verdict: Qwen Image 2512 produces a much more convincing and photorealistic scene with superior lighting and material properties. While Recraft V4 has excellent fine detail on the book, the levitating sphere and the solid-looking glass cube make the image feel less grounded in reality.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Qwen Image 2512
- + Excellent skin texture and facial details
- + Accurate 50mm lens compression and bokeh
- + Subtle but effective motion blur on passing vehicles
- − The man is posing/looking at the camera rather than actively repairing the bike
- − The bike geometry is nonsensical with the handlebars and seat facing opposite directions
- − Fewer visible rain droplets than Recraft V4
Recraft V4
- + Stronger adherence to the 'repairing' action and 'candid' feel
- + Superior environmental effects with visible rain and vibrant reflections
- + Effective motion blur on the right side of the frame
- − The subject's face is obscured and less detailed
- − The bicycle setup is physically impossible (no front fork, wheel just floating)
- − The background pedestrians look a bit painterly or smudged
Verdict: Qwen Image 2512 produces a much more realistic portrait with incredible skin texture, but it fails to capture the 'candid' and 'repairing' aspect of the prompt, showing a posed subject instead. Recraft V4 does a better job with the atmospheric rain and the active storytelling of the scene, though both models struggle significantly with the anatomy of the red bicycle. Qwen Image 2512 wins slightly due to its overall photographic fidelity and cleaner composition.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Qwen Image 2512
- + Strong composition with a cohesive grid of food photos
- + Clean use of vibrant accent colors that match the modern aesthetic
- + Faithful adherence to the section requirements (appetizers, pizza, etc.)
- − Text consists of gibberish characters and misspellings
- − Some food items in the photos look distorted or over-rendered
Recraft V4
- + Perfect English text rendering with logical menu items and pricing
- + Very clean, minimalist layout that looks like a real professional asset
- + Individual food photography is high quality and isolated cleanly
- − Does not follow the 'photos in grid' instruction, instead using isolated PNG-style images
- − Lacks the vibrant accents requested, opting for a very sparse white palette
Verdict: Recraft V4 produces a much more functional design with perfectly legible text and realistic food photography, making it feel like a finished product despite missing the grid layout. Qwen Image 2512 follows the layout instructions and 'vibrant accents' better, but the text is unusable and the overall typography is messy.
Chalkboard Menu
Text-to-Image: “Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”
AI Judge Analysis
Qwen Image 2512
- + Excellent legibility with elegant cursive styling.
- + Highly realistic chalk texture with dusty smudges on the board.
- + Large, well-centered composition that fills the frame.
- − Includes a spelling error ('Risitto' instead of 'Risotto').
- − The lettering looks slightly too clean and uniform for natural handwriting.
Recraft V4
- + Highly realistic handwriting with natural variations in letter size and baseline.
- + Perfect spelling on all menu items.
- + Authentic café environment in the background enhances the 'cozy' atmosphere requested.
- − The chalk texture is a bit grainy, making the smaller text at the bottom slightly harder to read.
- − Composition is a bit cluttered with the background elements compared to the clean board focus.
Verdict: Recraft V4 is the winner as it perfectly adhered to the prompt's request for natural variations in handwriting and correctly spelled 'Risotto'. While Qwen Image 2512 produced a very beautiful and elegant board, its spelling error and overly perfect, font-like lettering made it feel less like authentic handwriting.
The Capybara Taxi Driver
Text-to-Image: “Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”
AI Judge Analysis
Qwen Image 2512
- + Excellent photorealism in various textures like fur, fabric, and skin.
- + Dynamic and engaging composition with a centered subject.
- + Clear, high-resolution rendering of the capybara's expression and accessories.
- − The passenger's expression looks slightly distressed rather than 'bored'.
- − The scale of the capybara relative to the car interior is slightly exaggerated.
Recraft V4
- + Perfectly captures the 'bored' and 'normal' atmosphere requested in the prompt.
- + Highly realistic taxi interior details including the upholstery and 'Metered Fare' sign.
- + Complex lighting effects with rain on the windows and city bokeh.
- − The capybara's head is clipping through the roof/visor area of the taxi.
- − The steering wheel appears to be on the right side, which is incorrect for a New York taxi.
Verdict: Qwen Image 2512 produces a much sharper and more cinematic image with impressive character detail, while Recraft V4 captures the specific 'mundane' narrative tone of the prompt more effectively. Recraft V4 includes better environmental storytelling (like the interior signage and rain), but suffers from technical errors like the capybara's head clipping and the right-hand drive configuration.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Qwen Image 2512
- + Perfectly captures the '3D cartoon miniature' aesthetic
- + Excellent typography that matches the art style
- + Clean, refined textures on the diorama base and sushi
- − The flag icon is integrated into the text line rather than being a separate element
- − Minor clipping on the wasabi shape
Recraft V4
- + Highly realistic PBR material textures on the fish
- + Accurate placement of the flag icon top-center
- + Crisp text and clean background
- − Missed the 'cartoon' style request, opting for a photorealistic look
- − The crystalline base is a bit busy compared to the requested 'small raised diorama'
- − Salmon texture on one nigiri looks slightly warped
Verdict: Qwen Image 2512 followed the stylistic prompt much better, delivering a cohesive 3D cartoon miniature with charming textures and appropriate toy-like proportions. Recraft V4 produced a photorealistic image instead of the requested cartoon style, though it excelled in realistic PBR material rendering and text placement.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Qwen Image 2512
- + Excellent fur texture and facial detail
- + Stronger adherence to the 'big expressive eyes' and '8K masterpiece' aesthetic
- + Clear and vibrant lighting with distinct god rays
- − Static composition that looks more like a portrait than 'tumbling and chasing'
- − Anatomy issues with the kitten's paw placement
Recraft V4
- + Successfully captures the action of 'playfully chasing' and 'tumbling'
- + Dynamic and immersive composition with butterflies in the foreground and background
- + Excellent handling of dew sparkles and environmental atmosphere
- − The fox's front right leg is unnaturally long and thin
- − Slightly less 'photorealistic' fur texture compared to Qwen Image 2512
Verdict: While Qwen Image 2512 produces a very high-quality portrait with superior fur detail, Recraft V4 interprets the narrative elements of the prompt, such as 'chasing' and 'tumbling', much better. Recraft V4 creates a more joyful, active scene that feels like a captured moment in time rather than a staged photo.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Qwen Image 2512
- + Excellent typography including the correct accent on 'Caffè'
- + High level of detail on the cloche and banner
- + Perfectly captures the requested 'warm brown and cream' aesthetic with a subtle texture
- − The illustration is quite busy and borders on a vintage label rather than a minimalist logo
Recraft V4
- + Successfully achieves a more 'minimalist' vector emblem style
- + Clean and professional layout
- + Interprets the cloche and steam in a simplified, modern way
- − Rendered the accent in 'Caffè' incorrectly in the brand name
- − Did not include the requested 'banner' for the Est. 1720 text
- − The stem/steam on the cloche is slightly off-center
Verdict: Qwen Image 2512 produces a much more visually rich and accurate representation of the prompt, capturing the exact requested text with a beautiful hand-drawn feel. While Recraft V4 adheres better to the 'minimalist' aspect of the prompt, it fails on key requirements like the banner and correct typography for the brand name.
Explore each model
Recraft's latest text-to-image generation model with high-quality output, supporting various aspect ratios and custom color palettes.