Head to head

FLUX.1 Kontext [dev] (Black Forest Labs) vs. Grok Imagine Image (xAI)

Settled by community votes across 1 shared challenge, with an AI judge weighing in on each.

FLUX.1 Kontext [dev]

13.5 arena score

#43 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Grok Imagine Image

24.1 arena score

#19 of 44 in Text-to-Image

Vote tally

Where the votes landed

FLUX.1 Kontext [dev]

0.0%

win rate

Ties

0.0%

Grok Imagine Image

100.0%

win rate

Shared challenges: 1

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Magic Burger Explosion: Fiery Photorealism Challenge

Text-to-Image

“Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”

FLUX.1 Kontext [dev]: 0% wins, 0% ties, Grok Imagine Image: 100% wins

AI Judge Analysis

FLUX.1 Kontext [dev]

  • + Excellent text legibility and clean graphic design
  • + High-quality rendering of the charcoal and flame base
  • - Completely failed the 'exploded burger' requirement; the burger is fully assembled
  • - Spelling error in 'ONLY' (rendered as 'LNHLY')

Grok Imagine Image

  • + Perfectly captured the 'exploded' effect with suspended components
  • + All text is spelled correctly and features the requested fiery glow
  • + Strong sense of motion with sauce splashes and flying ingredients
  • - Composition is slightly cluttered with sauce droplets
  • - The starburst design is a bit generic compared to the high-detail background

Verdict: Grok Imagine Image is the clear winner. It followed the complex 'exploded burger' instruction perfectly, whereas FLUX.1 Kontext [dev] generated a standard assembled burger. Grok Imagine Image also spelled every text element correctly, while FLUX.1 Kontext [dev] misspelled 'ONLY' in the secondary message.
