Stable Diffusion 3.5 Large vs Wan 2.6
Head-to-head across 6 challenges
Stable Diffusion 3.5 Large: 61.1% win rate
Ties: 0.0%
Wan 2.6: 38.9% win rate
Challenge Results
Modern Clean Menu
Text-to-Image: “Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
Stable Diffusion 3.5 Large
- + Excellent photographic quality and appetizing food presentation
- + Strong implementation of the requested food grid layout
- + High-impact bold typography for the main title
- − The 'Mains' and 'Appetizers' sections are poorly structured and mostly unintelligible
- − Food grid occupies the borders rather than being integrated into the functional menu design
Wan 2.6
- + Excellent structure with clearly defined Appetizers/Pizza/Mains sections
- + Effective use of vibrant color accents as requested
- + More realistic menu pricing and itemized structure than Stable Diffusion 3.5 Large
- − Food photography is slightly lower in resolution and less vibrant than Stable Diffusion 3.5 Large's
- − Text contains several gibberish character artifacts in the body font
Verdict: Stable Diffusion 3.5 Large produced beautiful food photography, but the layout feels more like a poster than a functional menu, with text that is largely unreadable. Wan 2.6 followed the structural instructions much better, creating an organized, three-section menu with vibrant accents and a logical grid, though its food photos were slightly less polished. Wan 2.6 is the winner for better adhering to the design requirements of a restaurant menu.
Isometric Miniature Diorama Scenes
Text-to-Image: “Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”
AI Judge Analysis
Stable Diffusion 3.5 Large
- + Excellent 3D toy-like aesthetic with appealing textures
- + Great variety of sushi types on the diorama
- + Good rendering of shadows and highlights
- − Failed to place text at top-center as requested
- − Text is rendered on a sign within the scene rather than overhead
- − Includes extra cluttered decorative elements contrary to the 'minimal' request
Wan 2.6
- + Followed text placement and formatting instructions perfectly
- + Captures the 'minimal' and 'clean' aesthetic requested
- + Accurate 45-degree isometric perspective and centered diorama
- − The sushi models are relatively simple compared to Stable Diffusion 3.5 Large's
- − Texturing on the rice is a bit repetitive
Verdict: Wan 2.6 is the clear winner as it followed every layout instruction, including placing the specified text and flag icon at the top-center of a clean composition. Stable Diffusion 3.5 Large produced a more visually intricate 3D model, but failed the specific positioning instructions for the text, integrating it into the scene on a physical sign instead.
Victorian Greenhouse Oasis
Text-to-Image: “Hyper-photorealistic interior of a lush Victorian glass greenhouse filled with exotic tropical plants, vibrant blooming orchids, tall ferns, colorful butterflies in flight, sunlight filtering through ornate glass roof creating realistic caustics and dew on leaves, intricate iron framework visible, misty atmosphere, 8K masterpiece.”
AI Judge Analysis
Stable Diffusion 3.5 Large
- + Excellent architectural scale and depth
- + Atmospheric lighting with clear god rays
- + Lush, dense greenery that feels integrated into the environment
- − The orchids are a bit oversized and look slightly artificial
- − Butterflies are less varied and some are very small/blurry
Wan 2.6
- + Superior detail on plant surfaces with visible dew drops
- + Wide variety of vibrant orchids and detailed butterfly species
- + Very intricate Victorian ironwork filigree
- − The lighting is a bit flat across the midground
- − Some butterflies appear to be 'pasted' on rather than naturally in flight
Verdict: Wan 2.6 captures the specific details of the prompt better, particularly the requested dew on leaves and the intricate Victorian ironwork. While Stable Diffusion 3.5 Large has a more cinematic sense of scale and atmosphere, Wan 2.6 provides a more vibrant and detailed 'masterpiece' look with a wider variety of flora and fauna.
Intricate Floral Mandala
Text-to-Image: “Perfectly symmetrical mandala made entirely of real flowers, petals, leaves, fruits, and seeds in vibrant natural colors, intricate layered patterns with radial symmetry, top-down view on a soft neutral background, hyper-detailed organic textures and subtle shadows, photorealistic, 8K masterpiece.”
AI Judge Analysis
Stable Diffusion 3.5 Large
- + Excellent variety of elements including fruits (apples, berries, orange slices), seeds, and leaves.
- + Perfect radial symmetry and high visual appeal.
- + Strong 3D presence with consistent lighting and subtle shadows.
- − The transition between natural elements in the center feels more like a digital illustration than a physically placed arrangement.
- − The white background is a bit sterile and high-contrast.
Wan 2.6
- + Highly realistic textures that look like a physical photograph of objects on linen.
- + Includes clear seeds and dried fruits as requested.
- + Lighting is very natural with soft environmental shadows.
- − Composition is slightly less 'perfectly symmetrical' in the center arrangement.
- − Less vibrant colors compared to the other model.
Verdict: Stable Diffusion 3.5 Large creates a more vibrant and intricately designed mandala with a wide variety of fruit and floral elements, though it leans towards a digital art style. Wan 2.6 produces a far more photorealistic result that truly looks like real organic materials laid out on a soft fabric surface, capturing the 'real flowers and seeds' aspect with greater tactile authenticity. While Stable Diffusion 3.5 Large is more visually striking, Wan 2.6 is preferred for its superior adherence to the 'photorealistic' and 'organic textures' portion of the prompt.
Vintage Cafe Logo
Text-to-Image: “Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
Stable Diffusion 3.5 Large
- + Excellent command of vector emblem aesthetics with clean lines and balanced spacing.
- + Accurate rendering of nearly all text including 'Est. 1720' and 'Florian'.
- + Effective use of subtle parchment-like background texture.
- − Misspelled 'Caffè' in the main text as 'Cafféé', doubling the 'e'.
- − The floating cloche top and steam elements underneath create a slightly messy central silhouette.
Wan 2.6
- + Successfully rendered the complex 'Caffè' accent and overall spelling perfectly.
- + Stronger 3D shading on the cloche dome giving it more dimension.
- + Integrated the requested banner containing the establishment date more closely with the icon.
- − The 'Est. 1720' text is slightly warped following the curve of the banner.
- − The steam iconography is a bit thin and less stylistically cohesive than the rest of the logo.
- − Lack of frame/border elements makes the composition feel more standard than vintage.
Verdict: Wan 2.6 is the winner primarily due to its perfect spelling of 'Caffè Florian', including the specific accent required. While Stable Diffusion 3.5 Large has a slightly more sophisticated vintage border and layout, its double-vowel spelling error and awkward floating cloche design make it less viable for a professional logo task.
Apollo 11: Journey to Tranquility
Text-to-Image: “Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
Stable Diffusion 3.5 Large
- + Successfully integrated multiple infographic elements and technical diagrams.
- + Captured the requested NASA-inspired color palette and vector illustration style.
- + Includes a wide variety of space imagery including the Moon, Earth, and spacecraft.
- − The main spacecraft is a Space Shuttle, which is historically incorrect for the Apollo 11 mission.
- − The text is largely illegible gibberish.
- − The layout is cluttered and lacks the requested 6-step logical flow.
Wan 2.6
- + Excellent text rendering with clear, legible names and title.
- + Clean, minimalist composition that feels modern and professional.
- + Follows the requested color palette perfectly.
- − Completely failed to include the requested 6-step infographic content (Launch, Orbit, etc.).
- − Lacks the icons requested in the prompt, such as the Saturn V or Lunar Module.
- − Very simplistic interpretation that misses the core technical nature of the prompt.
Verdict: Stable Diffusion 3.5 Large attempted the complex technical nature of the prompt, creating a dense infographic feel, but it failed historically by showing a Space Shuttle instead of the Apollo hardware and lacked a coherent step-by-step flow. Wan 2.6 produced a very clean and aesthetically pleasing graphic with perfect text, but it ignored almost all the specific infographic steps and instructions in the prompt. Stable Diffusion 3.5 Large is the winner for better following the intent of an 'infographic poster' despite its factual errors and messy text.
Stable Diffusion 3.5 Large
Stability AI's 8.1-billion-parameter Multimodal Diffusion Transformer (MMDiT) text-to-image model, featuring improved image quality, typography, complex prompt understanding, and resource efficiency
Wan 2.6
Alibaba's text-to-image generation model from the Wan AI suite, supporting both Chinese and English prompts with optional reference image guidance for style