GPT Image 1.5 vs Grok Imagine Image Pro

Head-to-head across 16 challenges

GPT Image 1.5

50.0%

win rate

Ties

12.5%

Grok Imagine Image Pro

37.5%

win rate
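
The headline percentages can be checked with simple arithmetic. The sketch below assumes the win rate is computed per challenge and that the tallies are 8 wins, 2 ties, and 6 wins (inferred from 16 challenges; the page does not state the raw counts). The function name `summarize` and the dictionary keys are illustrative only.

```python
def summarize(wins_gpt: int, ties: int, wins_grok: int) -> dict[str, float]:
    """Convert raw tallies into percentages rounded to one decimal place."""
    total = wins_gpt + ties + wins_grok
    if total == 0:
        raise ValueError("no challenges recorded")
    return {
        "gpt_image_1_5": round(100 * wins_gpt / total, 1),
        "ties": round(100 * ties / total, 1),
        "grok_imagine_image_pro": round(100 * wins_grok / total, 1),
    }


# Assumed tallies: 8 + 2 + 6 = 16 challenges (inferred, not stated on the page).
rates = summarize(8, 2, 6)
print(rates)
# {'gpt_image_1_5': 50.0, 'ties': 12.5, 'grok_imagine_image_pro': 37.5}
```

The same function applies to a single challenge's vote split if the per-challenge vote counts are known.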

Challenge Results

Fantasy Warrior

Text-to-Image

“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”

GPT Image 1.5
Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1.5

  • + Excellent depiction of skin texture, sweat, and subtle scarring.
  • + Superior lighting effects with warm torchlight glints on the metal and hair.
  • + Very high level of detail on the ornate engraving and leather textures.
  • The hair braids and beads are a bit messy and less distinct than in the other model.
  • The composition is a bit more 'standard' fantasy portraiture.

Grok Imagine Image Pro

  • + Impressive rendering of legible Latin text ('Lux in tenebris') on the gorget.
  • + Unique hair styling with very clear beads and braided structure.
  • + Great balance between the rugged character face and the highly ornate armor.
  • The skin texture appears slightly more smoothed and less 'battle-worn' than in GPT Image 1.5.
  • The warmth of the torchlight is less integrated into the highlights of the face.

Verdict: GPT Image 1.5 wins on sheer photographic realism and the 'battle-worn' aesthetic, providing incredible detail in the skin, scars, and dynamic lighting. Grok Imagine Image Pro is also excellent, particularly for its ability to render legible text on the armor and distinct braided details, but it feels slightly more like a rendered game character compared to the lifelike quality of the GPT image.

Geometric Composition

Text-to-Image

“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”

GPT Image 1.5
Grok Imagine Image Pro
0% wins 0% ties 100% wins

AI Judge Analysis

GPT Image 1.5

  • + Perfectly depicts the small blue sphere inside the cube.
  • + Highly realistic glass reflections and refractions of the background plant.
  • + Excellent lighting consistency from the left window.
  • The sphere is quite large relative to the cube, rather than 'small' as requested.

Grok Imagine Image Pro

  • + Good composition with a more interesting plant choice (Monstera).
  • + Follows the instruction for a 'small' sphere better than GPT Image 1.5.
  • + Accurate placement of all requested elements.
  • The glass cube has a significant rendering error where a second sphere or reflection appears physically detached on the right.
  • The perspective of the cube's base and the table surface is slightly warped.

Verdict: GPT Image 1.5 produced a much more coherent and realistic image with physically accurate glass behavior and lighting. While Grok Imagine Image Pro attempted a better scale for the 'small' sphere, it suffered from a major glitch inside the cube that looks like a duplicated object and has less convincing glass textures.

Candid Street Photography

Text-to-Image

“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”

GPT Image 1.5
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent depiction of light rain with visible droplets on the jacket and bicycle
  • + Very convincing motion blur from the passing car as requested
  • + Effective 'imperfect framing' that enhances the candid, street-photography aesthetic
  • The rear bicycle wheel has some structural clipping and AI artifacts near the gear system

Grok Imagine Image Pro

  • + Natural and detailed skin texture on the man's face and hands
  • + Clear reflections in the puddles on the pavement
  • + Accurate representation of a red Japanese-style utility bicycle
  • The 'motion blur' on the cars in the background feels static or like light trails rather than a moving vehicle
  • The scene lacks the atmosphere of rain; the man and bike look almost entirely dry

Verdict: GPT Image 1.5 followed the atmospheric instructions much better, capturing the texture of light rain and the specific motion blur of a passing car, which gives it a more authentic 'candid' feel. While Grok Imagine Image Pro has nice skin details, it fails to make the subject actually appear as if he is standing in the rain, making the scene feel staged.

Modern Clean Menu

Text-to-Image

“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”

GPT Image 1.5
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent text rendering with clear item names, descriptions, and prices
  • + Functional layout that balances text and imagery like a real menu
  • + High-quality, appetizing food photography that looks professional
  • The 'Mains' text is slightly cut off at the bottom
  • Grid layout for photos is less uniform than requested

Grok Imagine Image Pro

  • + Very clean and precise 3x3 photo grid
  • + High-quality, vibrant food images with consistent lighting
  • + Minimalist aesthetic follows the visual prompt closely
  • Completely fails to include the actual menu items, descriptions, or pricing
  • Impractical as a restaurant menu without text content

Verdict: GPT Image 1.5 produced a fully functional and professional restaurant menu with impeccable text rendering and a logical layout. Grok Imagine Image Pro interpreted the 'grid' instruction well but failed to include any of the text content required for a menu, resulting in just a collection of photos. GPT Image 1.5 is the clear winner for its superior prompt adherence and utility.

Bald man challenge

Editing
Edit instruction

“Give the person a full, thick head of natural hair with realistic texture, density, and a natural hairline. Preserve facial features and lighting.”

GPT Image 1.5
Grok Imagine Image Pro
0% wins 0% ties 100% wins

AI Judge Analysis

GPT Image 1.5

  • + Natural, curly hair texture that matches the beard style
  • + Adds hair to the sideburn area seamlessly
  • Slightly alters the original facial features, making the face look a bit younger/smoother
  • Visible blending artifact near the top of the glasses frame

Grok Imagine Image Pro

  • + Excellent preservation of the original facial features and skin texture
  • + Perfectly maintains the background and clothing without any shifts
  • + Realistic hairline and lighting integration
  • Hair looks slightly 'pasted on' on the far left side where it meets the background

Verdict: Grok Imagine Image Pro is the winner because it successfully added hair while perfectly preserving the identity, facial wrinkles, and glasses of the subject from the source image. GPT Image 1.5 provided a good result but noticeably altered the subject's face, making him look like a different person with similar features.

Night Sky Transformation

Editing
Edit instruction

“Change the scene to night: a deep, dark sky with subtle, glistening stars visible behind the mountain.”

GPT Image 1.5
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Successfully transforms the entire color palette to a deep night theme.
  • + Maintains the silhouettes of the mountains and town accurately.
  • + Produces a complex, realistic starry sky with a hint of the Milky Way.
  • Significantly darkens the town and river to the point where visual detail is lost.
  • The transition between the mountain peak and the sky is slightly soft/blurry.

Grok Imagine Image Pro

  • + Excellent source preservation, capturing the town and foreground exactly as they were but with night lighting.
  • + The starry sky is clean and fits the requested 'subtle, glistening' description perfectly.
  • + Maintains much better contrast and visibility in the valley compared to the other model.
  • The sky has a slight grid-like or artificial pattern to the star distribution in some areas.
  • Less 'moody' than the other model, feeling a bit more like a composite.

Verdict: Grok Imagine Image Pro is the winner because it successfully applied the night theme while preserving the intricate details of the town and landscape from the source image. GPT Image 1.5 achieved a more atmospheric and realistic night sky, but it darkened the lower half of the image so much that the town's unique character was partially lost.

Adorable Baby Animals in Sunny Meadow

Text-to-Image

“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”

GPT Image 1.5
Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1.5

  • + Excellent depiction of fur texture and soft lighting
  • + Tight, engaging composition that feels more intimate
  • + Perfect adherence to the animal count and species requested
  • The butterflies are a bit large relative to the animals

Grok Imagine Image Pro

  • + Dynamic poses showcasing movement and play
  • + Beautiful wide landscape with good depth of field
  • + Clean rendering of the meadow and flowers
  • Failed the prompt by including two kittens instead of one
  • The fox's tail and hind leg anatomy looks slightly awkward

Verdict: GPT Image 1.5 is the winner because it followed all instructions, including the specific count of animals, and produced a more cohesive and heartwarming masterpiece with 'ultra-detailed' fur. Grok Imagine Image Pro had a lovely landscape but failed the prompt by adding an extra kitten and had slightly less realistic fur textures.

Over-the-top cartoon caricature

Editing
Edit instruction

“Create a caricature of me and my job. Make it exaggerated and humorous, incorporating my profession as a tv show anchor and my love for dogs and hockey.”

Source
GPT Image 1.5
Grok Imagine Image Pro
67% wins 33% ties 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent character likeness preserved while applying the caricature style.
  • + High visual quality with vibrant colors and professional studio lighting.
  • + Clever integration of hobbies, especially the dog wearing a hockey helmet.
  • The hand holding the microphone has a slight anatomical issue with the thumb placement.

Grok Imagine Image Pro

  • + Strong prompt adherence with many elements including a trophy and multiple dogs.
  • + Good text rendering for 'Pups & Pucks'.
  • + Effective caricature magnification of facial features.
  • Loses significant likeness of the woman in the source image.
  • Overall composition is a bit cluttered and chaotic compared to GPT Image 1.5.
  • The hockey stick held by the woman is thin and warped.

Verdict: GPT Image 1.5 is the clear winner because it manages to create a humorous caricature while still being instantly recognizable as the woman from the source image. Grok Imagine Image Pro includes more elements from the prompt but loses the specific facial identity of the subject, opting for a more generic caricature face.

Victorian Greenhouse Oasis

Text-to-Image

“Hyper-photorealistic interior of a lush Victorian glass greenhouse filled with exotic tropical plants, vibrant blooming orchids, tall ferns, colorful butterflies in flight, sunlight filtering through ornate glass roof creating realistic caustics and dew on leaves, intricate iron framework visible, misty atmosphere, 8K masterpiece.”

GPT Image 1.5
Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1.5

  • + Exceptional lighting effects with realistic god rays and caustics.
  • + High level of detail on dew drops and texture of the tropical leaves.
  • + Atmospheric mist provides a strong sense of depth and realism.
  • The butterflies appear somewhat flat and lack the texture of the surrounding environment.
  • Some of the ironwork details become slightly blurry in the background.

Grok Imagine Image Pro

  • + Very clean and symmetrical composition with a natural lead-in path.
  • + Excellent rendering of the Victorian ironwork structure and glass panes.
  • + Wide variety of vibrant orchids and lush ferns throughout the scene.
  • The lighting feels more flat and digital compared to the requested 'hyper-photorealistic' atmosphere.
  • The butterflies are distributed in a way that feels slightly artificial or 'copy-pasted'.

Verdict: GPT Image 1.5 captures the requested atmosphere more effectively, utilizing dramatic lighting, realistic dew, and misty textures to create a truly immersive scene. While Grok Imagine Image Pro offers a cleaner composition and better structure for the Victorian greenhouse itself, it lacks the 'hyper-photorealistic' lighting and caustic effects explicitly mentioned in the prompt, making GPT Image 1.5 the superior interpretation of the creative requirements.

Heroic Super Hero Portrait

Text-to-Image

“Hyper-photorealistic full-body portrait of a female superhero standing triumphantly on a New York skyscraper rooftop at golden sunset, wearing a classic modest superhero costume with flowing cape, chest emblem, gloves, and boots in red and blue colors, practical design, short hair, strong determined heroic expression looking into the distance, powerful confident stance with hands on hips and cape billowing dramatically in the wind, detailed urban cityscape background, warm natural sunlight with sharp shadows and fabric highlights, ultra-sharp textures on suit, hair, and concrete, 8K masterpiece, empowering family-friendly style.”

GPT Image 1.5
Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1.5

  • + Excellent cityscape detail including the Empire State and Chrysler buildings
  • + Great suit texture and fabric realism
  • + Strong adherence to the 'short hair' and 'looking into distance' prompt requirements
  • The 'Supergirl'-style emblem is derivative rather than original
  • Cape attachment logic is slightly messy at the shoulders

Grok Imagine Image Pro

  • + Highly symmetrical and powerful composition
  • + Better full-body framing, clearly showing the character's legs and footing on the roof
  • + Original emblem design adds to the creativity
  • The city skyline is generic and does not clearly represent New York as requested
  • Lighting is a bit flat across the face compared to the golden hour prompt

Verdict: GPT Image 1.5 wins due to its superior environmental detail; it captures an unmistakable New York City with iconic landmarks and beautiful golden hour lighting. Grok Imagine Image Pro has a strong character design, but the city background is generic and lacks the 'New York' specificity requested in the prompt.

Studio Ghibli Anime Style

Editing
Edit instruction

“Transform this photo into a Studio Ghibli–inspired illustration. Use soft pastel colors, hand-painted textures, gentle lighting, dreamy backgrounds, and a warm, nostalgic mood”

Source
GPT Image 1.5
Grok Imagine Image Pro
0% wins 0% ties 100% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent atmospheric lighting and glow consistent with Ghibli's 'dreamy' themes
  • + Successfully captures the soft pastel palette requested in the prompt
  • + Maintains the characteristic expressions and composition of the source image while stylizing them
  • The excessive glow makes some edges and details feel a bit too blurry
  • The textures feel more like digital 'noise' or sparkles than hand-painted watercolor

Grok Imagine Image Pro

  • + Perfectly captures the hand-painted watercolor texture typical of Ghibli background art
  • + Excellent preservation of the original image's character silhouettes and layout
  • + Clean line work that mimics traditional cel animation
  • Colors are slightly less 'dreamy' or vibrant than requested
  • The lighting is flat compared to the requested 'gentle lighting' and nostalgic mood

Verdict: Both models did an excellent job transforming the 'Distracted Boyfriend' meme into a Ghibli style. GPT Image 1.5 leaned into the emotional atmosphere with warm lighting and soft focus, while Grok Imagine Image Pro excelled at recreating the physical medium of watercolor and ink. Grok Imagine Image Pro is the likely winner for its superior 'hand-painted' texture and cleaner preservation of the source image's identity.

Intricate Floral Mandala

Text-to-Image

“Perfectly symmetrical mandala made entirely of real flowers, petals, leaves, fruits, and seeds in vibrant natural colors, intricate layered patterns with radial symmetry, top-down view on a soft neutral background, hyper-detailed organic textures and subtle shadows, photorealistic, 8K masterpiece.”

GPT Image 1.5
Grok Imagine Image Pro
0% wins 50% ties 50% wins

AI Judge Analysis

GPT Image 1.5

  • + Exceptional radial symmetry with mathematically precise placement
  • + Very clean organic textures without any AI-smearing
  • + Perfectly consistent lighting and subtle shadows across the entire arrangement
  • The high level of symmetry makes it look slightly more like a digital illustration than a physical arrangement of objects

Grok Imagine Image Pro

  • + Included a wider variety of items like sliced fruits (pomegranate, citrus) and walnuts
  • + Natural lighting depth gives a stronger sense of a physical 3D heap of items
  • Poor symmetry; items on the left do not match the right in shape or count
  • Some objects, like the pomegranate seeds, appear slightly distorted upon close inspection

Verdict: GPT Image 1.5 followed the prompt for a 'perfectly symmetrical mandala' much more effectively than Grok Imagine Image Pro, which struggled to maintain a consistent pattern across the axes. While Grok Imagine Image Pro utilized more interesting materials like sliced fruits and nuts, GPT Image 1.5's technical execution, clarity, and adherence to the symmetry requirement make it the superior 8K masterpiece.

Neutral Expression to Genuine Smile

Editing
Edit instruction
{
  "action": "image_edit",
  "reference": "uploaded neutral portrait",
  "change": "Warm genuine Duchenne smile: lips curved up, slight natural teeth, soft eye crinkles, subtle cheek raise",
  "details": "Realistic smiling skin (dimples if present, soft cheek shadows), slightly brighter eyes; keep exact eye shape/color/iris",
  "preserve_exact": "Face identity/structure, eyes/nose/lips/eyebrows, hair, skin texture/pores/freckles, makeup, clothing, head pose, background, lighting, shadows, framing",
  "no_changes": "No face shape change, no new features, no gaze shift, no hair/clothing/lighting/background edits",
  "style": "Ultra-photorealistic 8K portrait, sharp face focus, natural soft lighting, realistic skin glow"
}
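
A structured instruction like the one above is easy to break by dropping a field. The following is a minimal sketch of a pre-flight check, assuming the seven keys shown in this challenge are all required. The helper `check_instruction` is hypothetical and the values are abbreviated from the instruction above. How the string is then passed to either model is not covered here.

```python
import json

# Keys copied from the challenge's edit instruction.
EXPECTED_KEYS = {
    "action", "reference", "change", "details",
    "preserve_exact", "no_changes", "style",
}


def check_instruction(raw: str) -> dict:
    """Parse a JSON edit instruction and reject missing or empty fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("instruction must be a JSON object")
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    empty = sorted(k for k in EXPECTED_KEYS if not str(data[k]).strip())
    if empty:
        raise ValueError(f"empty values: {empty}")
    return data


# Values abbreviated from the instruction above for brevity.
raw_instruction = json.dumps({
    "action": "image_edit",
    "reference": "uploaded neutral portrait",
    "change": "Warm genuine Duchenne smile",
    "details": "Realistic smiling skin, slightly brighter eyes",
    "preserve_exact": "Face identity/structure, hair, clothing, background, lighting",
    "no_changes": "No face shape change, no new features, no gaze shift",
    "style": "Ultra-photorealistic 8K portrait",
}, indent=2)

instruction = check_instruction(raw_instruction)
print(instruction["change"])
```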
GPT Image 1.5
Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1.5

  • + Natural Duchenne smile with soft eye crinkles
  • + High accuracy in facial feature preservation
  • + Maintains skin texture and light freckles from the original
  • The mouth is slightly wider and the tooth shape changes more than a natural smile would produce

Grok Imagine Image Pro

  • + Excellent preservation of face shape and iris detail
  • + Natural skin sheen on the cheeks due to the smile
  • + Highly realistic dental structure
  • The eyes remain a bit too open for a deep Duchenne smile
  • Very slight change in the nose bridge width compared to source

Verdict: Both models perform exceptionally well, maintaining the identity and hair of the original subject perfectly. GPT Image 1.5 better captures the 'Duchenne' aspect with more realistic crinkling around the eyes, though Grok Imagine Image Pro produces slightly more realistic skin highlights and shadows around the nasolabial folds.

Golden Hour Stroll

Editing
Edit instruction

“Add dynamic motion to this photo: make hair blow in the wind, add leaves flying, energetic and lively feel.”

Source
GPT Image 1.5
Grok Imagine Image Pro
50% wins 0% ties 50% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent hair motion effect that looks integrated with the person
  • + Large number of leaves creates a strong sense of wind
  • + High preservation of the original subjects' faces and clothing
  • Some leaves appear flat or like stickers overlaying the image
  • A few leaves have slightly unnatural coloration

Grok Imagine Image Pro

  • + Leaves have more varied colors and natural shapes
  • + Good hair motion that follows the wind direction
  • + Maintains original image quality and details perfectly
  • Fewer leaves results in a slightly less 'energetic' feel than requested
  • A few leaves appear slightly blurry in the foreground

Verdict: Both models did an exceptional job of following the edit instructions while preserving the source image. GPT Image 1.5 feels more 'dynamic' and 'lively' because of the sheer volume of leaves, while Grok Imagine Image Pro feels a bit more natural but slightly more conservative with the effect. GPT Image 1.5 is the winner for more fully realizing the scale of the request.

Vintage Cafe Logo

Text-to-Image

“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”

GPT Image 1.5
Grok Imagine Image Pro

AI Judge Analysis

GPT Image 1.5

  • + Excellent typography with era-appropriate flourishes.
  • + Beautiful hand-drawn texture and lighting on the cloche.
  • + High-quality vector emblem aesthetic with a balanced vertical layout.
  • Ignored the 'light background' instruction, opting for solid black.
  • The cloche is brown rather than a classic metallic look.

Grok Imagine Image Pro

  • + Adhered perfectly to the light background and warm brown/cream color palette.
  • + Excellent minimalist vector style that feels modern yet vintage.
  • + Accurate rendering of the requested 'Est. 1720' banner and steam.
  • The steam element is a bit overly simplistic and disconnected.
  • The cloche is grey, which slightly clashes with the warm brown/cream theme.

Verdict: GPT Image 1.5 produces a more visually stunning and atmospheric logo with superior typography, but it ignores the instruction for a light background. Grok Imagine Image Pro successfully follows every part of the prompt, including color scheme and background, delivering a clean and professional minimalist vector logo.

Apollo 11: Journey to Tranquility

Text-to-Image

“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only.
Steps (stop at landing):
1. Launch (Saturn V icon)
2. Earth Orbit (Earth + orbit ring icon)
3. Translunar (trajectory arc icon)
4. Lunar Orbit (Moon + orbit ring icon)
5. Descent (lunar module descending icon)
6. Landing (lunar module on the surface icon)
Small supporting elements (minimal text):
  • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins.
  • Landing site marker: Moon pin labeled "Tranquility" only.
Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”

GPT Image 1.5
Grok Imagine Image Pro
100% wins 0% ties 0% wins

AI Judge Analysis

GPT Image 1.5

  • + Excellent illustration quality with detailed but clean vector icons.
  • + Highly legible, large typography for each step.
  • + Followed the color palette and flat-vector style requirements perfectly.
  • The layout is a bit cramped at the top with overlapping elements.
  • The trajectory in 'Translunar' is a bit abstracted compared to the other literal steps.

Grok Imagine Image Pro

  • + Perfectly balanced, professional infographic composition.
  • + Flawless text rendering and numbering for all six steps.
  • + Strict adherence to the 'consistent iconography' and 'NASA-inspired palette' instructions.
  • Icons are much smaller and less detailed than those of GPT Image 1.5.
  • The background is slightly plain, feeling more like a slide than a poster.

Verdict: Both models followed the prompt exceptionally well, capturing the NASA aesthetic and the six specific mission steps. GPT Image 1.5 features much stronger and more vibrant individual illustrations, while Grok Imagine Image Pro excels in professional layout, spacing, and precise text rendering. Grok Imagine Image Pro is chosen as the winner for its superior infographic structure, which feels more like a finished, functional design.

GPT Image 1.5

OpenAI's state-of-the-art image generation model with better instruction following and adherence to prompts

Grok Imagine Image Pro

xAI's premium image generation model offering higher fidelity output and stronger performance on single-image editing benchmarks compared to the standard Grok Imagine model