Recraft's latest text-to-image generation model with high-quality output, supporting various aspect ratios and custom color palettes
Settled by community votes across 2 shared challenges, with an AI judge weighing in on each.
Recraft V4
#8 of 44 in Text-to-Image
Stable Diffusion 3.5 Medium
#41 of 44 in Text-to-Image
Where the votes landed
Recraft V4: 66.7% win rate
Ties: 33.3%
Stable Diffusion 3.5 Medium: 0.0% win rate
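For readers curious how shares like these add up, here is a minimal sketch of a vote tally. It is not the arena's actual scoring code: the `vote_shares` helper and the three example ballots are hypothetical, chosen only because they happen to reproduce the percentages shown above. The page does not publish raw vote counts.

```python
from collections import Counter


def vote_shares(votes, outcomes):
    """Return each outcome's share of all votes, as a percentage rounded to one decimal."""
    total = len(votes)
    if total == 0:
        raise ValueError("no votes to tally")
    counts = Counter(votes)
    # Counter returns 0 for outcomes that received no votes.
    return {outcome: round(100 * counts[outcome] / total, 1) for outcome in outcomes}


# Hypothetical ballots (an assumption, not data from the page).
example_votes = ["Recraft V4", "Recraft V4", "Tie"]
outcomes = ["Recraft V4", "Tie", "Stable Diffusion 3.5 Medium"]

shares = vote_shares(example_votes, outcomes)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Under this assumption ties are counted as their own outcome, so the three shares always sum to roughly 100% (up to rounding).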
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
The Reversed Rodeo
Text-to-Image: “Horse riding astronaut in space — horse on top, not vice versa. Surreal, highly detailed, cinematic.”
AI Judge Analysis
Recraft V4
- + Excellent cinematic lighting and atmosphere
- + High level of texture detail on the astronaut suit and horse mane
- + Dynamic composition with floating asteroids providing depth
- − Failed the specific spatial instruction; the astronaut is riding the horse, not the other way around
Stable Diffusion 3.5 Medium
- + Clear, vibrant colors with a distinct space background
- + Good rendering of the astronaut's patches and helmet visor
- − Failed the negative constraint; the astronaut is riding the horse instead of the horse being on top
- − Anatomical issues with the horse's legs appearing elongated and disjointed
- − The composition feels flat compared to Recraft V4
Verdict: Both models failed the specific logical request to have the horse on top of the astronaut, defaulting to the common trope of an astronaut riding a horse. Recraft V4 is the winner because it provides a significantly higher quality image with cinematic lighting and realistic textures, whereas Stable Diffusion 3.5 Medium produced anatomical distortions in the horse's legs and a less compelling composition.
The Halloween Invitation
Text-to-Image: “Vintage gothic Halloween party invitation. Dark parchment poster, spooky border with webs and thorns, central glowing jack-o-lantern, bats, twisted trees, moody night sky. Add elegant gothic title text saying "Halloween Party Invitation", a small scroll banner saying "You are invited to a night of frights", and event details at the bottom: Date: 30.10.2026 Time: 7pm Location: The Arches, NYC Spooky but polished, cinematic lighting, square format.”
AI Judge Analysis
Recraft V4
- + Excellent typography with perfect spelling of all requested text.
- + High-quality cinematic lighting with a detailed, atmospheric background.
- + Sophisticated cobweb border that aligns well with the gothic theme.
- − The 'banner' for the subtitle is a plain white rectangle rather than a stylized scroll.
- − The background texture feels more like an illustration than a 'dark parchment poster'.
Stable Diffusion 3.5 Medium
- + Successfully incorporates the 'dark parchment' texture as a central element.
- + Composition frames the text well with the twisted trees and pumpkins.
- − Significant spelling errors throughout all text fields.
- − Graphic design feels dated and less polished compared to the other model.
- − Lighting is flat and lacks the requested cinematic quality.
Verdict: Recraft V4 followed the complex text instructions perfectly, producing a professional-looking invitation with clear, legible details and beautiful atmospheric lighting. In contrast, Stable Diffusion 3.5 Medium struggled with the text rendering, resulting in numerous spelling mistakes and a less refined visual style.
Explore each model
Stable Diffusion 3.5 Medium: Stability AI's 2.5-billion-parameter Multimodal Diffusion Transformer with improvements (MMDiT-X) text-to-image model, optimized for consumer hardware and featuring improved image quality, typography, and complex prompt understanding