FLUX.2 [dev] Flash vs Stable Diffusion 3.5 Large

Head-to-head across 8 challenges

FLUX.2 [dev] Flash

68.2%

win rate

Ties

9.1%

Stable Diffusion 3.5 Large

22.7%

win rate
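The page does not say how many judgments sit behind these percentages. As a rough sketch of the arithmetic, the snippet below turns raw win/tie/loss counts into the three rates shown above. The function name `win_rates` and the tally of 15 wins, 2 ties, and 5 losses over 22 judgments are assumptions for illustration only; that tally happens to reproduce the published figures, but the actual counts are not given in the source.

```python
def win_rates(wins_a: int, ties: int, wins_b: int) -> tuple:
    """Return (A win %, tie %, B win %), each rounded to one decimal place."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no judgments recorded")
    return tuple(round(100 * n / total, 1) for n in (wins_a, ties, wins_b))


# Hypothetical tally (not stated on the page): 15 wins, 2 ties, 5 losses.
flux, tie, sd = win_rates(15, 2, 5)
print(f"FLUX.2 [dev] Flash {flux}% | ties {tie}% | Stable Diffusion 3.5 Large {sd}%")
```

The same function applies to a single challenge: a hypothetical 3 wins, 1 tie, and 2 losses gives roughly 50%, 17%, and 33% once rounded to whole percentages.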


Challenge Results

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

FLUX.2 [dev] Flash: 50% wins | Ties: 17% | Stable Diffusion 3.5 Large: 33% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Perfect adherence to the spatial requirements of the prompt.
  • + Highly realistic glass textures, including thickness and slight imperfections.
  • + Balanced composition with accurate window lighting from the left.
  • The plant is slightly more to the side than directly 'behind' the cube, though still visible through it.

Stable Diffusion 3.5 Large

  • + Clean, sharp aesthetic with vibrant colors.
  • + Good rendering of the blue sphere with realistic shadows.
  • Failed the spatial prompt: the book is inside/under the cube rather than on top.
  • The glass cube lacks a bottom face, appearing more like a cover or a five-sided box.
  • The background plant is barely visible through the cube compared to the reflections.

Verdict: FLUX.2 [dev] Flash followed the complex spatial instructions perfectly, placing the sphere inside and the book correctly on top of the cube. In contrast, Stable Diffusion 3.5 Large struggled with the arrangement, placing the book at the base and omitting the bottom of the glass cube. FLUX.2 also provided a much more photorealistic result with convincing glass caustics and refraction.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

FLUX.2 [dev] Flash: 100% wins | Ties: 0% | Stable Diffusion 3.5 Large: 0% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Excellent realism in skin texture and facial features of the elderly man.
  • + Highly detailed and accurate representation of the bicycle components.
  • + Perfectly executed motion blur on the background cars as requested.
  • The 50mm lens depth of field effect is slightly subtle in the immediate background.
  • The rain effect is very light, almost difficult to see.

Stable Diffusion 3.5 Large

  • + Strong atmospheric lighting and cinematic mood.
  • + Clearer visibility of falling rain and splashes on the ground.
  • + Good use of shallow depth of field to separate the subject from the background.
  • The bicycle's anatomy is nonsensical (no chain, no seat, disconnected frames).
  • The hands and skin texture look overly smooth and less realistic than requested.
  • The background car is sharp/static rather than having the requested motion blur.

Verdict: FLUX.2 [dev] Flash is the clear winner due to its superior anatomical accuracy and adherence to technical prompt details like motion blur and natural skin texture. While Stable Diffusion 3.5 Large creates a moodier atmosphere, its failure to generate a functional bicycle (missing seat and chain) and the lack of motion blur on the car make it less successful as a realistic street photo.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

FLUX.2 [dev] Flash: 100% wins | Ties: 0% | Stable Diffusion 3.5 Large: 0% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Excellent adherence to the 'beads' requirement in the braids.
  • + Very detailed texture on leather straps and cloth underlayer as requested.
  • + Superior lighting with clear torchlight reflections and orange glow on the metal.
  • The scars look a bit more like superficial fresh surface cuts rather than healed battle scars.

Stable Diffusion 3.5 Large

  • + Intricate engraving on the plate armor with high contrast.
  • + Good implementation of shallow depth of field with the background army.
  • + Dynamic facial expression and intense lifelike eyes.
  • Completely missed the 'small beads' in the hair braids.
  • Leather straps mentioned in the prompt are not featured in the composition.
  • Lighting feels a bit more like natural daylight than warm torchlight.

Verdict: FLUX.2 [dev] Flash adhered much better to the specific details of the prompt, successfully including the beads in the hair and the leather straps, both of which Stable Diffusion 3.5 Large omitted. While Stable Diffusion 3.5 Large produced beautiful armor engravings, FLUX.2 captured the atmospheric warm torchlight and material textures more accurately.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

FLUX.2 [dev] Flash: 100% wins | Ties: 0% | Stable Diffusion 3.5 Large: 0% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Perfect adherence to text placement and formatting instructions.
  • + Exquisite texture work, especially on the wood grain and rice grains.
  • + Completely clean, minimalist composition that looks professional and high-clarity.
  • The sushi roll cross-section looks slightly more like a 2D graphic than a 3D model compared to the nigiri.

Stable Diffusion 3.5 Large

  • + Highly creative and detailed 'miniature' world aesthetic.
  • + Beautiful stylized 3D shapes for the sushi and garnishes.
  • + Effective use of lighting to create volume and depth.
  • Failed to place text at the top-center of the image; instead, it integrated it into flags.
  • Included too much 'garnish' and extra elements, ignoring the 'minimal' instruction.
  • The flag icon is stylized but the text placement makes the overall design feel cluttered.

Verdict: FLUX.2 [dev] Flash perfectly followed the layout instructions, placing the text exactly where requested with a clean, professional finish and excellent textures. Stable Diffusion 3.5 Large created a more vibrant and detailed 3D scene, but it ignored several negative constraints (minimal garnish) and structural requirements (text at top-center).

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

FLUX.2 [dev] Flash: 33% wins | Ties: 0% | Stable Diffusion 3.5 Large: 67% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Excellent fur textures and sharp details on all animals.
  • + Includes all requested animals plus an extra bunny, fitting the 'tumbling together' prompt well.
  • + Beautifully rendered lighting with clear god rays and dew sparkles.
  • The fox kit has a slightly cartoonish expression compared to the other animals.
  • The kitten has an extra-long, slightly awkwardly placed paw.

Stable Diffusion 3.5 Large

  • + Dynamic 'chasing' composition that captures the movement requested in the prompt.
  • + Captures a very joyful, infectious expression on the golden retriever puppy.
  • + Soft, dreamy bokeh effect that enhances the 'wholesome' vibe.
  • Anatomy issues on the fox's face, which appears slightly distorted.
  • The kitten's ears look more like those of a fox or caracal than a standard tabby kitten.
  • Fur texture is slightly blurred compared to the sharpness of FLUX.2 [dev] Flash.

Verdict: FLUX.2 [dev] Flash delivered a more technically proficient image with superior fur detail, lighting effects, and better adherence to the specific species requested (the tabby kitten actually looks like a tabby). Stable Diffusion 3.5 Large captured the 'chasing' action and joyful mood better, but suffered from anatomical inconsistencies and less refined textures.

Heroic Super Hero Portrait

Text-to-Image

“Hyper-photorealistic full-body portrait of a female superhero standing triumphantly on a New York skyscraper rooftop at golden sunset, wearing a classic modest superhero costume with flowing cape, chest emblem, gloves, and boots in red and blue colors, practical design, short hair, strong determined heroic expression looking into the distance, powerful confident stance with hands on hips and cape billowing dramatically in the wind, detailed urban cityscape background, warm natural sunlight with sharp shadows and fabric highlights, ultra-sharp textures on suit, hair, and concrete, 8K masterpiece, empowering family-friendly style.”

FLUX.2 [dev] Flash
Stable Diffusion 3.5 Large

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Perfect adherence to the 'hands on hips' and 'triumphant' pose instructions.
  • + Highly realistic skin textures and natural lighting integration.
  • + Excellent detail in the urban background and fabric textures.
  • The emblem is very close to the recognizable Superman/Supergirl 'S', lacking a bit of original creativity.

Stable Diffusion 3.5 Large

  • + Accurately depicts the requested short hair style.
  • + Good color vibrance in the costume.
  • Failed the 'hands on hips' pose instruction entirely.
  • The character looks like a miniature figure superimposed on a city, with poor scale and perspective.
  • The facial features and hair look slightly more 'uncanny' and less photorealistic than its counterpart.

Verdict: FLUX.2 [dev] Flash followed the prompt instructions much better, specifically capturing the requested pose and powerful stance. Stable Diffusion 3.5 Large struggled with scale and perspective, making the character look out of place against the background, while also failing to follow the core pose requirement.

Intricate Floral Mandala

Text-to-Image

“Perfectly symmetrical mandala made entirely of real flowers, petals, leaves, fruits, and seeds in vibrant natural colors, intricate layered patterns with radial symmetry, top-down view on a soft neutral background, hyper-detailed organic textures and subtle shadows, photorealistic, 8K masterpiece.”

FLUX.2 [dev] Flash: 86% wins | Ties: 0% | Stable Diffusion 3.5 Large: 14% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Exceptional realism with authentic organic textures on leaves, berries, and petals.
  • + Highly intricate layering that follows the radial symmetry perfectly.
  • + Natural lighting and subtle shadows create a convincing three-dimensional depth.
  • The background is slightly more textured than 'soft neutral', though still effective.

Stable Diffusion 3.5 Large

  • + Very clean, high-contrast composition that is visually striking.
  • + Clearly includes a variety of requested items like whole fruits and seeds.
  • + Strong radial symmetry and centered layout.
  • The central flower looks more like a digital illustration or vector than a real photograph.
  • Shadows and lighting feel artificial, giving the image a CGI or 3D-rendered look rather than photorealistic.

Verdict: FLUX.2 [dev] Flash captures the requested 'photorealistic' and 'organic textures' much more effectively, creating a mandala that looks like it was painstakingly hand-arranged and photographed. Stable Diffusion 3.5 Large produces a beautiful design, but its execution feels like a polished digital illustration rather than a photograph of real materials.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

FLUX.2 [dev] Flash: 67% wins | Ties: 33% | Stable Diffusion 3.5 Large: 0% wins

AI Judge Analysis

FLUX.2 [dev] Flash

  • + Strictly followed the requested chronological steps from launch to landing.
  • + Excellent text rendering with clear, legible labels for each stage and astronaut names.
  • + Clean vector aesthetic that matches the 'modern infographic' request perfectly.
  • Some minor icon duplication/redundancy in the layout.
  • Minor spelling error in the Saturn V label ('Sataurr').

Stable Diffusion 3.5 Large

  • + Follows the requested NASA-inspired color palette effectively.
  • + Sophisticated graphic design layout with a vintage technical manual feel.
  • Failed to follow the requested 6-step chronological structure.
  • Text is completely illegible and gibberish.
  • Depicts an incorrect vehicle (Space Shuttle style) instead of the Saturn V requested.

Verdict: FLUX.2 [dev] Flash followed the complex infographic instructions perfectly, creating a logical 6-step process with readable text and accurate icons. Stable Diffusion 3.5 Large produced a visually interesting poster but failed on almost every specific prompt requirement, including the structure, the vehicle type, and text legibility.

FLUX.2 [dev] Flash

A fast, distilled version of Black Forest Labs' FLUX.2 [dev], optimized for speed and cost efficiency.

Stable Diffusion 3.5 Large

Stability AI's 8.1-billion-parameter Multimodal Diffusion Transformer (MMDiT) text-to-image model, featuring improved image quality, typography, complex prompt understanding, and resource efficiency.