GPT Image 1 is OpenAI's previous image generation model. It accepts both text and image inputs and produces image outputs.
Settled by community votes across one shared challenge, with an AI judge weighing in.
GPT Image 1
#19 of 23 in Image Editing
GPT Image 2
#3 of 44 in Text-to-Image
Where the votes landed
GPT Image 1: 0% win rate
Ties: 0%
GPT Image 2: 0% win rate
Challenge by challenge
The strongest take from each model on every shared challenge, with the AI judge's read.
Outfit Transfer Challenge
Editing: “Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”
AI Judge Analysis
GPT Image 1
- + Excellent fabric texture and realistic lighting on the pea coat.
- + Successfully captures the watch and plaid pattern from the reference image.
- + Maintains a high level of facial detail and similarity to the original person.
- − The position of the hands in the pockets resulted in some slight anatomical awkwardness in the right arm.
- − Cropped the original image significantly, losing some of the background context.
GPT Image 2
- + Near-perfect preservation of the original background and framing.
- + Very accurate replication of the specific plaid scarf and gold watch from Image 2.
- + Maintains the exact vitiligo patterns on the face and the specific hair styling of the original subject.
- − The transition between the neck and the collar of the coat is slightly rough with some minor artifacts.
- − The left hand in the pocket has a slightly blurred, less defined appearance compared to the rest of the image.
Verdict: Both models performed exceptionally well, successfully transferring the complex layers (coat, scarf, shirt, jeans, watch) while preserving the identity of the person in Image 1. Model B (GPT Image 2) is the winner because it maintained the original image's aspect ratio and framing while achieving a slightly better likeness of the original subject's unique features.
Explore each model
GPT Image 2: OpenAI's state-of-the-art image generation model, with arbitrary resolution up to 4K and strong instruction following.