Imagen 4.0 Ultra Generate 001 vs Qwen Image 2512
Head-to-head across 6 challenges
Imagen 4.0 Ultra Generate 001: 50.0% win rate
Ties: 0.0%
Qwen Image 2512: 50.0% win rate
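The summary percentages are plain tallies over the 6 challenges. The sketch below shows the arithmetic, assuming each challenge yields exactly one outcome and that the split is 3 wins each with no ties. That split is inferred from the 50.0% / 50.0% / 0.0% figures; the page does not list a winner per challenge in the summary. The function name and outcome labels are hypothetical.

```python
from collections import Counter

def win_rates(outcomes, labels):
    """Return each label's share of the outcomes as a percentage (one decimal)."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {label: round(100 * counts[label] / total, 1) for label in labels}

# Assumed tally (inferred from the published percentages, not stated per challenge):
# 3 wins for each model and no ties across 6 challenges.
outcomes = ["imagen"] * 3 + ["qwen"] * 3
rates = win_rates(outcomes, labels=["imagen", "tie", "qwen"])
print(rates)
```

With only 6 challenges, each result moves a model's win rate by about 16.7 percentage points, so a 50/50 split is a coarse signal.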
Challenge Results
Geometric Composition
Text-to-Image: “A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent adherence to the 'soft window light from the left' instruction with realistic shadows.
- + Very high visual quality with detailed textures on the book and table.
- + Correct interpretation of the plant being behind the cube and visible through the glass.
- − The blue sphere is floating/hovering in the center rather than resting on the bottom of the cube.
- − The glass cube lacks a back wall, appearing more like an open frame or solid block.
Qwen Image 2512
- + The glass cube is well-defined with distinct edges and corners.
- + The blue sphere is naturally resting on the surface inside the cube.
- + Good placement of the green plant behind the cube.
- − The lighting is flat and doesn't clearly convey the requested 'from the left' directionality.
- − The reflection of the sphere inside the glass is physically inconsistent and messy.
Verdict: Both models successfully included all requested elements. Imagen 4.0 Ultra Generate 001 produced a more aesthetically pleasing image with superior lighting and crisp textures, though the sphere is floating. Qwen Image 2512 followed the physics of the scene better by placing the sphere on the floor of the cube, but the overall image quality and lighting are less impressive.
Candid Street Photography
Text-to-Image: “A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent skin texture and hyper-realistic facial details
- + Strong sense of action with the man actively working on the bike frame
- + High level of detail on the wet pavement and lighting reflections
- − The bike's mechanical structure is slightly nonsensical around the pedals and chain
- − The white car in the background lacks convincing motion blur, appearing more like a still car with soft focus
Qwen Image 2512
- + Superior handling of motion blur on the passing cars, creating a dynamic street feel
- + Better cinematic composition with the 50mm shallow depth of field effect
- + Subject is looking at the camera, leaning into the 'candid street photo' prompt
- − The man is posing with the bike rather than 'repairing' it as requested
- − Minor anatomical issues with the fingers on the man's right hand
Verdict: Both models captured the mood well, but Imagen 4.0 Ultra provided a much higher level of textural detail and followed the 'repairing' instruction more closely. Qwen Image 2512 excelled at the background environmental effects like motion blur and bokeh, but the subject felt more static and posed, missing the active narrative of the prompt.
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent crisp and legible typography for section headers.
- + Highly realistic and professional food photography with cohesive lighting.
- + Clean, professional layout that feels like a real commercial menu.
- − Small descriptive text contains gibberish characters.
- − The grid structure becomes slightly uneven at the bottom.
Qwen Image 2512
- + Good use of a vertical grid to showcase multiple items.
- + Clean white background with a clear 'Modern Restaurant' title.
- + Contains price-like digits next to menu items, improving the menu feel.
- − Heavy gibberish text throughout all item names and descriptions.
- − Spelling errors in the headers 'APPETIMERS' and 'MEANS'.
- − Font choice is too condensed and tall, making it harder to read than Imagen 4.0 Ultra's.
Verdict: Imagen 4.0 Ultra produces a much more professional and aesthetically pleasing result with high-quality food photography and superior header typography. While Qwen Image 2512 includes more items in its grid, it suffers from significant spelling errors and less legible font choices. Imagen 4.0 Ultra feels like a finished design product, whereas Qwen Image 2512 feels like a rougher mockup.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Clean, professional typography that perfectly follows positioning instructions.
- + Excellent adherence to the 'miniature 3D cartoon' style with soft, rounded textures.
- + Very clean diorama base and solid background as requested.
- − The 45-degree angle is slightly flatter than a true isometric perspective.
- − The ginger is a bit oversized compared to the sushi pieces.
Qwen Image 2512
- + Strong isometric perspective and composition.
- + Good rendering of the sushi materials and textures.
- + Includes more detailed garnish like the bamboo leaf and small plants.
- − The text has an unnecessary black outline and 'cartoonish' fill not explicitly requested.
- − The diorama base has slight warping on the edges and inconsistent corners.
- − The flag icon is less clean than the one in the Imagen 4.0 Ultra image.
Verdict: Imagen 4.0 Ultra is the winner because it delivered a significantly cleaner and more professional graphic. It perfectly followed the instructions for a solid light blue background and top-center text placement, whereas Qwen Image 2512 included a slightly messy diorama base and stylised text that didn't match the ultra-clean aesthetic requested.
Adorable Baby Animals in Sunny Meadow
Text-to-Image: “Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Excellent action and dynamic posing, showing animals reaching for butterflies.
- + Great adherence to the 'god rays' and 'dew sparkles' part of the prompt.
- + Clear separation of all four requested animals with distinct interactions.
- − The style leans more toward a polished 3D illustration than 'hyper-photorealistic'.
- − Butterflies appear a bit flat and illustrative compared to the animals.
Qwen Image 2512
- + Higher degree of photorealism in the fur and facial features.
- + Captures the 'tumbling together' aspect well with the animals physically touching and huddled.
- + Beautiful lighting and bokeh effect in the background.
- − The animals are static and posing for a camera rather than 'playfully chasing butterflies'.
- − The rabbit's placement is a bit cramped, making its body proportions hard to distinguish.
Verdict: Imagen 4.0 Ultra captured the energy and action of the prompt much better, showing the animals actively engaging with the butterflies in a dynamic scene. While Qwen Image 2512 achieved a more convincingly 'photorealistic' texture for the fur, it resulted in a static group portrait that ignored the requested chasing and tumbling behaviors. Imagen 4.0 Ultra is preferred for its superior composition and prompt adherence.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Imagen 4.0 Ultra Generate 001
- + Perfect adherence to the 'minimalist' and 'vector emblem' style requirements.
- + Excellent typography with correct spelling and accent marks.
- + Clean, balanced composition suitable for actual logo use.
- − The steam is very abstracted, bordering on simple lines.
- − The 'Est. 1720' text is slightly off-center within the banner.
Qwen Image 2512
- + Impressive illustrative detail and shading on the cloche dome.
- + Dynamic and artistic representation of steam.
- + Rich vintage texture on the background.
- − Ignored the 'minimalist' part of the prompt, creating a complex illustration instead.
- − The script font for 'Caffè' is slightly cluttered and harder to read than in the Imagen 4.0 Ultra image.
- − The 'Est. 1720' text is bumpy and lacks the clean vector finish requested.
Verdict: Imagen 4.0 Ultra follows the prompt's structural and stylistic requirements much more accurately, delivering a clean, minimalist vector-style logo with perfect spelling. Qwen Image 2512 produces a beautiful vintage illustration, but it fails the 'minimalist' and 'vector' constraints, making it less suitable for a functional logo.
Imagen 4.0 Ultra Generate 001
Google's Imagen 4.0 Ultra model, offering the highest fidelity and resolution for professional-grade image generation.
Qwen Image 2512
Improved version of Alibaba's Qwen image model with better text rendering, finer natural textures, and more realistic human generation.