Nano Banana 2 (Google) vs Stable Diffusion 3.5 Medium (Stability AI)

Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.

Nano Banana 2

29.0 arena score

#1 of 44 in Text-to-Image

Best Text-to-Image right now

Top 2 in Image Editing

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.
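The eligibility rule above can be read as a set intersection with a threshold. The sketch below is only an illustration: the function names, the constant, and the per-model category sets are assumptions (the sets are inferred from the rankings mentioned on this page, not taken from the arena's data).

```python
# Hypothetical sketch of the chart-eligibility rule; not the arena's actual code.
MIN_SHARED_CATEGORIES = 3  # "at least three shared arena categories"


def shared_categories(rated_a: set[str], rated_b: set[str]) -> set[str]:
    """Return the categories in which both models have a rating."""
    return rated_a & rated_b


def can_show_skill_chart(rated_a: set[str], rated_b: set[str]) -> bool:
    """True once both models are rated in enough shared categories."""
    return len(shared_categories(rated_a, rated_b)) >= MIN_SHARED_CATEGORIES


# Assumed category sets, based only on the rankings mentioned on this page.
nano_banana_2 = {"Text-to-Image", "Image Editing"}
sd_35_medium = {"Text-to-Image"}

# Only one shared category, so the chart would stay hidden.
print(can_show_skill_chart(nano_banana_2, sd_35_medium))
```

Under these assumed sets the two models share a single category, which matches the "Not enough comparable category data" message.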

Stable Diffusion 3.5 Medium

15.7 arena score

#41 of 44 in Text-to-Image

Vote tally

Where the votes landed

Nano Banana 2: 100.0% win rate

Ties: 0.0%

Stable Diffusion 3.5 Medium: 0.0% win rate

Shared challenges: 2
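How percentages like these follow from raw votes can be sketched as below. The page does not publish vote counts, so the counts in the usage line are placeholders, and the function name is hypothetical.

```python
# Hypothetical sketch: turn raw vote counts into the three tally percentages.
def tally_percentages(wins_a: int, ties: int, wins_b: int) -> tuple[float, float, float]:
    """Return (model A win %, tie %, model B win %), rounded to one decimal."""
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no votes recorded")

    def share(count: int) -> float:
        return round(100.0 * count / total, 1)

    return share(wins_a), share(ties), share(wins_b)


# Placeholder counts (assumed): every vote goes to the first model.
print(tally_percentages(5, 0, 0))
```

Any tally in which one model takes every vote gives the same 100.0 / 0.0 / 0.0 split, whatever the total number of votes.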

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

The Reversed Rodeo

Text-to-Image

“Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”

[Generated images: Nano Banana 2, Stable Diffusion 3.5 Medium]

AI Judge Analysis

Nano Banana 2

+ Excellent adherence to the 'horse on top' prompt instruction
+ Rich, vibrant colors and high level of detail in both the nebula and spacesuit
+ Dynamic cinematic composition with a clear sense of movement

− The horse's front hoof intersecting the helmet is a bit messy
− Highly stylized/AI-core aesthetic might feel over-saturated to some

Stable Diffusion 3.5 Medium

+ Clean, photographic lighting on the astronaut and horse
+ High contrast between the subjects and the dark space background

− Failed the primary prompt instruction by putting the astronaut on top
− Anatomical issues with the horse's legs and hooves
− Composition is static and less cinematic than requested

Verdict: Nano Banana 2 successfully interpreted the tricky surrealist prompt by correctly placing the horse on top of the astronaut. In contrast, Stable Diffusion 3.5 Medium ignored the specific positioning instruction and produced a standard 'astronaut riding horse' image with several anatomical glitches in the horse's legs.

The Halloween Invitation

Text-to-Image

“Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”

[Generated images: Nano Banana 2, Stable Diffusion 3.5 Medium]

Votes: Nano Banana 2 100% wins, 0% ties, Stable Diffusion 3.5 Medium 0% wins

AI Judge Analysis

Nano Banana 2

+ Flawless text rendering for every specific detail requested.
+ Excellent composition that integrates the artistic elements and text seamlessly.
+ Strict adherence to all prompt elements including the scroll banner, specific border, and central jack-o-lantern.

− None identified; the output met the complex text and layout requirements.

Stable Diffusion 3.5 Medium

+ Captures the 'twisted trees' and 'spooky border' requirements reasonably well.
+ Good use of color and high-contrast lighting on the jack-o-lanterns.

− Failed significantly on text rendering with multiple spelling errors like 'Halloweeen' and 'Inviloween'.
− Omitted the requested banner text and formatted the date and location incorrectly.
− Lacks the central jack-o-lantern required by the prompt, placing two off to the sides instead.

Verdict: Nano Banana 2 is the clear winner as it followed every instruction perfectly, specifically excelling at the difficult task of rendering long strings of custom text without a single error. Stable Diffusion 3.5 Medium struggled with the text and failed to include several key compositional elements like the central jack-o-lantern and the specific banner content.

Next steps

Explore each model

Nano Banana 2

Google

Gemini 3.1 Flash with image generation capabilities. High-efficiency image generation model with support for text rendering, reference images, search grounding, and thinking mode. The efficient counterpart to Gemini 3 Pro Image.

Stable Diffusion 3.5 Medium

Stability AI

Stability AI's 2.5-billion-parameter Multimodal Diffusion Transformer with improvements (MMDiT-X), a text-to-image model optimized for consumer hardware, featuring improved image quality, typography, and complex prompt understanding.
