GPT Image 1 Mini (OpenAI) vs. Grok Imagine Image Pro (xAI)

Settled by community votes across 8 shared challenges, with an AI judge weighing in on each.

GPT Image 1 Mini

25.3 arena score

#12 of 44 in Text-to-Image

Skill signature: not enough comparable category data. The chart appears once both models have ratings across at least three shared arena categories.

Grok Imagine Image Pro

24.8 arena score

#14 of 44 in Text-to-Image

Vote tally

Where the votes landed

GPT Image 1 Mini

100.0%

win rate

Ties

0.0%

Grok Imagine Image Pro

0.0%

win rate

Shared challenges 8
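
The page reports only percentages and does not say how they are derived. As a minimal sketch, assuming each share is simply that outcome's count divided by the total number of votes, the tally could be computed as below. The function name `vote_shares` and the raw counts are hypothetical; the page does not publish vote counts.

```python
def vote_shares(wins_a: int, ties: int, wins_b: int) -> tuple[float, ...]:
    """Return (share_a, share_ties, share_b) as percentages, one decimal place.

    Assumption: a share is count / total votes. The arena's real formula
    is not documented on this page.
    """
    total = wins_a + ties + wins_b
    if total == 0:
        raise ValueError("no votes recorded")
    return tuple(round(100.0 * n / total, 1) for n in (wins_a, ties, wins_b))


# Hypothetical counts chosen only to reproduce a 100% / 0% / 0% split.
print(vote_shares(wins_a=5, ties=0, wins_b=0))  # (100.0, 0.0, 0.0)
```

Under this assumption, a 100.0% win rate says nothing about how many votes were cast, so a small sample can produce the same headline figure as a large one.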

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

GPT Image 1 Mini

Grok Imagine Image Pro

Votes: GPT Image 1 Mini 100%, ties 0%, Grok Imagine Image Pro 0%

AI Judge Analysis

GPT Image 1 Mini

+ Excellent photographic realism and texture in the book and sphere.
+ Accurate lighting direction consistent with the prompt.
+ Perfectly clear glass rendering with subtle reflections.

− The blue sphere is floating or partially intersecting the back glass pane rather than sitting on the surface.
− The cube structure has an odd, thin metal-like frame rather than being solid glass.

Grok Imagine Image Pro

+ Physically grounded sphere sitting on the base of the cube.
+ Dynamic refraction of the sphere and plant within the thick glass.

− The glass geometry is warped and inconsistent at the corners.
− The blue sphere is inexplicably reflected/duplicated on the right side within the glass despite no physical wall being there.

Verdict: GPT Image 1 Mini produces a much more aesthetically pleasing and high-resolution image with superior textures, though the sphere appears to be levitating inside the cube. Grok Imagine Image Pro handles the physical placement of the sphere and the plant's visibility through the glass more realistically, but suffers from significant glass distortion and a strange ghosting artifact of the sphere.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

GPT Image 1 Mini

Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1 Mini

+ Excellent shallow depth of field and bokeh transition
+ Strong adherence to 'imperfect framing' with a close-up, candid feel
+ Atmospheric lighting and realistic ground reflections

− Anatomical issues with the man's hands merging into the bike spokes
− Missing visible motion blur from passing cars
− The bicycle has structural issues where the kickstand and frame meet

Grok Imagine Image Pro

+ Successfully includes motion blur from passing cars
+ Better anatomical correctness in the hands and clear use of a tool
+ Environment feels more like a specific Japanese street scene

− Depth of field is slightly too deep compared to the 50mm request
− Street reflections are less pronounced than in GPT Image 1 Mini's image
− The image looks a bit more 'clean' and less like a moody candid shot

Verdict: Grok Imagine Image Pro is the winner for its better handling of complex details like hands and tools, as well as capturing the motion blur requested in the prompt. While GPT Image 1 Mini has a more cinematic and moody aesthetic with superior bokeh, it suffers from significant AI artifacts where the person interacts with the bicycle.

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

GPT Image 1 Mini

Grok Imagine Image Pro

Votes: GPT Image 1 Mini 100%, ties 0%, Grok Imagine Image Pro 0%

AI Judge Analysis

GPT Image 1 Mini

+ Excellent photographic realism in skin texture and eyes
+ Naturalistic lighting with subtle warm hues
+ Very detailed engraving on the plate armor

− Failed to include the requested beads in the braided hair
− Scars are very faint, leaning more towards just dirt

Grok Imagine Image Pro

+ Followed all prompt instructions including hair beads and distinct scars
+ Impressive detail on leather straps and cloth underlayer
+ Dynamic composition with clear Latin text engraving

− The sparks look like flat digital streaks rather than bokeh light
− Skin texture looks slightly more airbrushed compared to GPT Image 1 Mini

Verdict: Grok Imagine Image Pro is the winner because it followed every specific detail of the prompt, including the beads in the hair and the leather straps/cloth textures, which GPT Image 1 Mini missed. While GPT Image 1 Mini has slightly more realistic skin rendering, Grok Imagine Image Pro provides a more complete and visually interesting interpretation of a battle-worn paladin.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

GPT Image 1 Mini

Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1 Mini

+ Excellent chalk texture with grainy, realistic strokes
+ Perfectly centered and clean composition
+ High-fidelity rendering of the wooden frame

− Failed to provide 'elegant cursive' for the title as requested
− Text looks a bit too uniform, bordering on a digital font appearance

Grok Imagine Image Pro

+ Successfully used cursive for the menu items to create a more authentic handwritten feel
+ Captures the 'cozy café' atmosphere with a visible background and environmental lighting
+ Variation in handwriting styles makes it look more realistic

− The title is in print instead of the requested 'elegant cursive'
− Text layout is slightly less balanced with some spacing issues near the prices

Verdict: Both models followed the complex text instructions very well, with almost no spelling errors. GPT Image 1 Mini produced a cleaner, more readable board with superior chalk texture, but Grok Imagine Image Pro felt more realistic as a photograph because of the café background and the varied handwriting styles. Grok is the narrow winner for capturing the 'handwritten' and 'cozy' spirit of the prompt more effectively despite the title not being cursive.

Pose & Character Mashup

Editing

Edit instruction

“Use Image 1 as the exact pose reference and Image 2 as the character reference. Recreate the person/character from Image 2 in the exact dynamic pose and body position from Image 1. Keep the exact face, hair, clothing style/details, and expression from Image 2. Match the lighting and environment of Image 1. The final image must show the character from Image 2 performing the precise action/pose from Image 1 with perfect anatomy and natural integration.”

Source

GPT Image 1 Mini

Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1 Mini

+ Successfully transferred the character's facial features, sunglasses, and clothing style
+ Accurately matched the lighting and background color of the first source image
+ Correctly interpreted the identity transformation requested

− Significantly simplified and altered the specific complex pose from Image 1
− Anatomical issues in the right hand and feet placement compared to the original pose

Grok Imagine Image Pro

+ Maintained the exact complex pose from Image 1 with high fidelity
+ High visual quality and resolution of the overall image

− Completely failed to use the character from Image 2 as requested
− Only slightly modified the face of the original woman from Image 1 instead of replacing her with the man from Image 2

Verdict: GPT Image 1 Mini followed the core instruction of swapping the characters, successfully placing the man from Image 2 into the scene with his original clothing, despite struggling to recreate the exact complexity of the pose. Grok Imagine Image Pro completely ignored the character reference, essentially reproducing the source image with a slightly different feminine face. Therefore, GPT Image 1 Mini is the winner for actually attempting and largely achieving the person-transfer edit.

Outfit Transfer Challenge

Editing

Edit instruction

“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”

Source

GPT Image 1 Mini

Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1 Mini

+ Successfully extracted and applied the exact outfit from Image 2, including the specific plaid pattern and coat style.
+ Maintains the skin vitiligo details on the hands and face.
+ The composition and perspective feel cohesive with the new body pose.

− Changed the person's face and head shape significantly from Image 1.
− The sand on the face and the specific hair patch from Image 1 were lost in the generation.
− The transition between the neck and the clothing is slightly blurry.

Grok Imagine Image Pro

+ Perfectly preserved the original face and hair from Image 1.
+ High-quality rendering of fabric textures and lighting.
+ Maintains the exact wooden structure and background from the source image.

− Completely ignored the instruction to use the outfit from Image 2, instead generating a generic royal costume.
− Created white-skinned hands that do not match the person in Image 1.

Verdict: This is a case of two different failures: GPT Image 1 Mini captured the correct outfit from Image 2 but failed to preserve the person's identity from Image 1, while Grok Imagine Image Pro perfectly preserved the face and background but completely ignored the target outfit. GPT Image 1 Mini is slightly preferred because it at least attempted to combine elements from both images, whereas Grok Imagine Image Pro hallucinated an entirely new set of clothing.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

GPT Image 1 Mini

Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1 Mini

+ Excellent close-up detail on the capybara's fur and expression.
+ Strong atmospheric lighting and authentic bokeh effect.

− The passenger is very blurry and lacks detail.
− Only one paw is clearly visible on the steering wheel.

Grok Imagine Image Pro

+ Perfect adherence to all prompt details including the jacket, two paws on the wheel, and the woman's expression.
+ Impressive text rendering on the hat with specific NYC medallion details.
+ Very clear composition that shows both subjects and the city environment equally well.

− The capybara's paws look more like crab claws/talons than actual capybara paws.
− The perspective from the dashboard looking back is slightly less intimate than GPT Image 1 Mini's.

Verdict: While GPT Image 1 Mini takes a portrait-style approach with great lighting, Grok Imagine Image Pro provides a much more complete realization of the prompt. Grok successfully captures the capybara's full outfit, the specific seating arrangement, and legible text on the driver's cap, making it the more accurate and detailed interpretation.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

GPT Image 1 Mini

Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1 Mini

+ Excellent fur texture and lighting integration.
+ Natural, dynamic poses that feel like a real action snapshot.
+ Superior rendering of 'god rays' and soft, atmospheric depth.

− The fox's anatomy is slightly wonky, particularly the mouth/teeth area.
− The butterfly above the kitten lacks a shadow or realistic lighting interaction.

Grok Imagine Image Pro

+ Accurately included more varied butterflies and wildflowers.
+ Great interaction between animals, with the fox tumbling as requested.
+ Very sharp focus across all subjects.

− The lighting feels more synthetic and 'over-processed' compared to GPT Image 1 Mini.
− Added an extra kitten not explicitly requested, crowding the composition.
− The fur texture looks slightly more like a digital painting than a photograph.

Verdict: GPT Image 1 Mini captures the requested 'hyper-photorealistic' look much better, with natural lighting and soft focus that creates a convincing atmospheric scene. Grok Imagine Image Pro follows the specific actions (tumbling) and flower types more closely, but the image feels less realistic and more like a high-quality digital illustration.

Next steps

Explore each model

GPT Image 1 Mini

OpenAI

OpenAI's cost-effective image generation model for when image quality isn't the top priority

Grok Imagine Image Pro

xAI

xAI's premium image generation model offering higher fidelity output and stronger performance on single-image editing benchmarks compared to the standard Grok Imagine model
