GPT Image 1: OpenAI's previous image generation model, which accepts both text and image inputs and produces image outputs.
Settled by community votes across one shared challenge, with an AI judge weighing in.
GPT Image 1
#19 of 23 in Image Editing
Grok Imagine Image Pro
#14 of 44 in Text-to-Image
Where the votes landed
GPT Image 1: 0% win rate
Ties: 0%
Grok Imagine Image Pro: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Outfit Transfer Challenge
Editing: “Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”
AI Judge Analysis
GPT Image 1
- + Excellent replication of the outfit from Image 2, including the specific coat, plaid scarf, watch, and jeans.
- + Near-perfect facial and background consistency compared to Image 1.
- + Realistic lighting and shadows that match the environment of the source image.
Grok Imagine Image Pro
- + Keeps the background and person's head very close to the original source image.
- − Completely ignored the clothing in Image 2, generating a generic royal outfit instead.
- − The hands are a different skin tone than the face, leading to a major anatomical inconsistency.
- − Failed the primary instruction to use the 'exact elaborate outfit' from the reference.
Verdict: GPT Image 1 followed the instructions almost perfectly, successfully transferring the specific pea coat, plaid scarf, and watch from Image 2 while maintaining the identity of the person in Image 1. Grok Imagine Image Pro completely failed the prompt by generating a random royal costume that was not present in any source image and suffered from inconsistent skin tones on the hands.
Explore each model
Grok Imagine Image Pro: xAI's premium image generation model, offering higher-fidelity output and stronger performance on single-image editing benchmarks than the standard Grok Imagine model.