FLUX.2 [flex] (Black Forest Labs) vs. Grok Imagine Image (xAI)

Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.

FLUX.2 [flex]

25.2 arena score

#13 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.

Grok Imagine Image

24.1 arena score

#19 of 44 in Text-to-Image

Vote tally

Where the votes landed

FLUX.2 [flex]

80.0%

win rate

Ties

0.0%

Grok Imagine Image

20.0%

win rate


Shared challenges 8

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

FLUX.2 [flex]

Grok Imagine Image

AI Judge Analysis

FLUX.2 [flex]

+ Perfectly adheres to all spatial prompts including object placement.
+ Excellent rendering of light and soft shadows.
+ Clean, modern aesthetic with high clarity.

− The sphere is quite large, pushing the definition of 'small' in the prompt.

Grok Imagine Image

+ Captures the 'small' aspect of the blue sphere more accurately.
+ Highly realistic wood texture and light scattering in the glass.
+ Natural-looking plant and depth of field.

− The glass object is a rectangular prism rather than a cube.
− The sphere appears to be floating unnaturally in the center without support.

Verdict: Both models followed the complex spatial instructions well. FLUX.2 [flex] produced a better 'cube' and superior lighting, while Grok Imagine Image provided a more realistic texture on the wooden table and better followed the 'small' descriptor for the sphere. FLUX.2 [flex] is the winner for better geometric accuracy regarding the cube and more cohesive composition.

Man and Car in California

Editing

Edit instruction

“Make a photo of the man driving the car down the California coastline”

Source

FLUX.2 [flex]

Grok Imagine Image

Votes: FLUX.2 [flex] 100%, ties 0%, Grok Imagine Image 0%

AI Judge Analysis

FLUX.2 [flex]

+ Excellent preservation of the specific man's identity, including his hairstyle and clothing (visible plaid scarf).
+ High accuracy in preserving the specific car model (Rolls-Royce Phantom Drophead Coupé) from the source image.
+ Realistic motion blur on the road and wheels that enhances the sense of driving.

− The man's scale and positioning in the driver's seat feel slightly off, appearing a bit small for the car.
− The steering wheel placement is slightly detached from the driver's grip.

Grok Imagine Image

+ Beautifully rendered California coastline background with great atmospheric perspective.
+ The composition of the car on the road feels very dynamic and professional.

− Fails to use the man from the source image, replacing him with a generic older white man.
− Changes the car model slightly, evolving the classic Phantom front into a more modern Rolls-Royce Dawn style.
− Does not follow the multi-image prompt requirement to combine both source images.

Verdict: FLUX.2 [flex] is the clear winner because it successfully followed the core instruction: combining the specific man and the specific car from the source images into the requested new setting. While Grok Imagine Image produced a high-quality visual, it completely ignored the subject's identity, replacing him with a random character, which defeats the purpose of an image editing task.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

FLUX.2 [flex]

Grok Imagine Image

AI Judge Analysis

FLUX.2 [flex]

+ Excellent adherence to technical photography prompts like 50mm shallow depth of field.
+ Detailed skin textures and realistic rain/wet pavement interaction.
+ Dynamic composition with effective use of bokeh and light.

− The bicycle frame geometry is slightly warped and nonsensical near the crankset.
− The scale of the bicycle seems a bit small relative to the man.

Grok Imagine Image

+ Achieves a highly authentic 'street photography' look with realistic motion blur from cars.
+ The bicycle design is more structurally plausible for a real-world bike.
+ Captures the 'imperfect framing' prompt well with its candid feel.

− The subject's face is obscured and partially covered by a mask, losing the 'elderly Japanese man' facial detail requested.
− Overall lighting is a bit flatter and less 'cinematic' than technically requested.

Verdict: FLUX.2 [flex] produced a more visually striking and detailed image that followed the technical lighting and texture prompts more closely, though the bicycle's anatomy is slightly glitched. Grok Imagine Image captured the candid 'street photo' vibe and the motion blur of passing cars much more realistically, but it failed to showcase the facial details of the subject. FLUX.2 [flex] is the winner for its superior clarity and beautiful rendering of light and rain.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

FLUX.2 [flex]

Grok Imagine Image

Votes: FLUX.2 [flex] 0%, ties 0%, Grok Imagine Image 100%

AI Judge Analysis

FLUX.2 [flex]

+ Strict adherence to the 3x2 grid layout for food photos.
+ Clean, professional typography that is highly legible.
+ Excellent use of color-coded headers for organization.

− Text content is mostly gibberish.
− Limited variety in food images, with some repetition.

Grok Imagine Image

+ High accuracy in text rendering, including recognizable dish names like 'Bruschetta' and 'Margherita'.
+ Dynamic and visually appealing layout with organic placement of food photos.
+ Impressive variety in the types of food depicted.

− Failed to follow the 'grid' requirement for photo placement.
− Contains several duplicate item entries (e.g., multiple 'Steak Frites' and 'Grilled Salmon').

Verdict: FLUX.2 [flex] adhered much better to the specific layout requirements, providing a clean grid and clear sectioning, though the text is nonsensical. Grok Imagine Image produced much more legible and accurate text for a menu, but failed to follow the grid layout instruction and had several repetitive list entries.

Isometric Miniature Diorama Scenes

Text-to-Image

“Create a clear, 45° top-down isometric miniature 3D cartoon scene of Japan's signature dish: sushi, with soft refined textures, realistic PBR materials, gentle lighting, on a small raised diorama base with minimal garnish and plate. Solid light blue background. At top-center: 'JAPAN' in large bold text, 'SUSHI' below it, small flag icon. Perfectly centered, ultra-clean, high-clarity, square format.”

FLUX.2 [flex]

Grok Imagine Image

Votes: FLUX.2 [flex] 50%, ties 0%, Grok Imagine Image 50%

AI Judge Analysis

FLUX.2 [flex]

+ Clean, professional typography and layout.
+ Excellent miniature 3D aesthetic with soft, clay-like textures.
+ Accurate isometric perspective and centered composition.

− The flag is placed below the text, whereas the prompt suggested text at top-center and sushi below it (though the layout is still very pleasing).

Grok Imagine Image

+ Good adherence to the isometric diorama style.
+ Correct inclusion of all requested elements (flag, text, sushi).
+ High visual clarity with sharp shadows.

− The text rendering is slightly less refined than FLUX.2 [flex]'s.
− Lighting is a bit harsh compared to the 'gentle lighting' requested.

Verdict: Both models followed the prompt exceptionally well, capturing the isometric miniature style. FLUX.2 [flex] produced a more aesthetically pleasing image with superior 'soft refined textures' and better typography, while Grok Imagine Image provided a more complex sushi plate that also accurately followed the design requirements. FLUX.2 [flex] is the winner for its more professional, clean, and cohesive 3D render look.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

FLUX.2 [flex]

Grok Imagine Image

Votes: FLUX.2 [flex] 100%, ties 0%, Grok Imagine Image 0%

AI Judge Analysis

FLUX.2 [flex]

+ Excellent adherence to the 'chasing butterflies' and 'tumbling' motion aspects of the prompt.
+ Highly realistic fur textures and anatomical proportions for all four animals.
+ Beautifully rendered god rays and dew sparkles that feel integrated into the scene.

− The fox kit has slightly unusual dark legs that look more like black paws than a typical fox kit's markings.

Grok Imagine Image

+ Warm, vibrant color palette with strong backlighting.
+ Cute, stylized 'expressive eyes' as requested.

− The animals are static and posing rather than 'playfully chasing and tumbling' as requested.
− The butterfly rendering is poor, appearing as small white blobs rather than detailed butterflies.
− The fur texture looks overly smooth and 'AI-processed' compared to the 8K masterpiece request.

Verdict: FLUX.2 [flex] is the clear winner: it captures the dynamic action of the animals chasing butterflies in a realistic meadow, whereas Grok Imagine Image produced a static, posed shot. FLUX.2 [flex] also delivered much higher detail in the fur, background elements, and the butterflies themselves, while Grok Imagine Image struggled with butterfly detail and overall realism.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

FLUX.2 [flex]

Grok Imagine Image

AI Judge Analysis

FLUX.2 [flex]

+ Perfect adherence to the 'vintage minimalist' and 'vector emblem' descriptors.
+ Clean, professional typography that captures a classic Italian cafe aesthetic.
+ Excellent layout balance with the arched text and banner.

− The texture on the background is very subtle, almost unnoticeable.
− The steam icons are slightly thin compared to the rest of the stroke weights.

Grok Imagine Image

+ Good use of color depth and shading within the cloche icon.
+ Includes a nice paper-like texture on the light background.
+ Clear, legible text rendering.

− Redundant text repeating 'Est. 1720' twice in the layout.
− The cloche icon has nonsensical additions that look like a spoon and a handle merging into the dome.
− Less 'minimalist' than requested, feeling more like a modern mascot logo.

Verdict: FLUX.2 [flex] is the clear winner as it perfectly captures the 'minimalist' and 'vector emblem' style requested, producing a clean and professional logo. Grok Imagine Image fails on the minimalist aspect and introduces visual incoherence with strange artifacts protruding from the cloche, as well as repeating the establishment date unnecessarily.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

FLUX.2 [flex]

Grok Imagine Image

Votes: FLUX.2 [flex] 100%, ties 0%, Grok Imagine Image 0%

AI Judge Analysis

FLUX.2 [flex]

+ Excellent layout with a clean, professional vector aesthetic.
+ High text legibility and mostly correct spelling.
+ Sophisticated iconography that feels like a real educational poster.

− Missed the final 'Landing' step from the specific prompt list.
− The order of steps (reading vertically vs. horizontally) is slightly unconventional.

Grok Imagine Image

+ Successfully included all 6 numbered steps plus a crew section.
+ Very accurate adherence to the specific iconography requests for each step.
+ Creative composition that uses the bottom of the frame as the lunar surface.

− Multiple spelling errors in the text (e.g., '3rajoory', 'Transluiory', 'Moom').
− The Saturn V icon looks less like the actual rocket than the one from FLUX.2 [flex].

Verdict: FLUX.2 [flex] produced a much more professional and aesthetically pleasing infographic that looks like a finished product, though it missed the final step of the requested list. Grok Imagine Image followed the prompt's structural instructions more closely by including all six steps and specific icons, but it suffers from poor text rendering and slightly less refined vector art. FLUX.2 [flex] is the preferred choice for its superior visual quality and clean execution.

Next steps

Explore each model

FLUX.2 [flex]

Black Forest Labs

Black Forest Labs' precision image generation model, offering fine-grained creative control, reliable text rendering, and output up to 4MP.


Grok Imagine Image

xAI

An xAI image generation model designed to produce highly aesthetic images from text descriptions.
