GPT Image 1 Mini (OpenAI) vs. GPT Image 2 (OpenAI)

Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.

GPT Image 1 Mini

25.3 arena score

#12 of 44 in Text-to-Image

Skill signature

Not enough comparable category data

The chart appears once both models have ratings across at least three shared arena categories.
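The eligibility rule above can be sketched as a set intersection in Python. This is a minimal illustration, not the arena's actual implementation: the function names and the dictionary layout are hypothetical, and only the three-category threshold and the two Text-to-Image scores come from this page.

```python
MIN_SHARED_CATEGORIES = 3  # threshold stated in the text above


def shared_categories(ratings_a: dict, ratings_b: dict) -> set:
    """Return the categories in which both models have a rating."""
    return set(ratings_a) & set(ratings_b)


def chart_available(ratings_a: dict, ratings_b: dict) -> bool:
    """True once both models are rated in at least three shared categories."""
    return len(shared_categories(ratings_a, ratings_b)) >= MIN_SHARED_CATEGORIES


# Hypothetical layout: category name -> arena score.
# Only the Text-to-Image scores are taken from this page.
gpt_image_1_mini = {"Text-to-Image": 25.3}
gpt_image_2 = {"Text-to-Image": 28.2}

print(chart_available(gpt_image_1_mini, gpt_image_2))
```

With a single shared category in the example data, the check fails, which is consistent with the "Not enough comparable category data" notice.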

GPT Image 2

28.2 arena score

#3 of 44 in Text-to-Image

Vote tally

Shared challenges: 4

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

AI Judge Analysis

GPT Image 1 Mini

+ Excellent text legibility and alignment.
+ Accurate spelling of all requested menu items.

− The font appears too uniform and digital, missing the 'handwritten' request.
− Lacks the 'cozy café' environmental context requested.

GPT Image 2

+ Successfully captures the elegant cursive and handwritten chalk texture.
+ Provides a superior café atmosphere with lighting and background details.

− The date '2026' is slightly less crisp than other text.
− Minor smudging on the board reduces legibility of smaller letters.

Verdict: GPT Image 2 is the clear winner for its superior interpretation of the 'handwritten-style' and 'cozy café' requirements. GPT Image 1 Mini renders very clean text, but it lacks the artistic cursive slant and realistic chalk textures of GPT Image 2's output.

Pose & Character Mashup

Editing

Edit instruction

“Use Image 1 as the exact pose reference and Image 2 as the character reference. Recreate the person/character from Image 2 in the exact dynamic pose and body position from Image 1. Keep the exact face, hair, clothing style/details, and expression from Image 2. Match the lighting and environment of Image 1. The final image must show the character from Image 2 performing the precise action/pose from Image 1 with perfect anatomy and natural integration.”

AI Judge Analysis

GPT Image 1 Mini

+ Matches the background and lighting of the source image perfectly.
+ Maintains the likeness of the individual from Image 2.
+ Good clothing detail preservation including the scarf pattern.

− Fails to replicate the specific crossed-leg pose from Image 1.
− Only shows one foot on the stool, altering the dynamic balance of the original pose.
− The right arm/hand is positioned differently than the reference.

GPT Image 2

+ Successfully replicates the complex crossed-leg pose from Image 1.
+ Preserves the character's facial features and accessories with high accuracy.
+ Incorporates the specific scarf and clothing from Image 2 while fitting the pose.

− The character's head is tilted slightly differently than Image 1's extreme angle.
− Small anatomy artifact where the left hand retains red fingernails from the woman in Image 1.

Verdict: GPT Image 2 is the superior output because it successfully captured the 'exact dynamic pose' requested, including the difficult crossed-leg position on the red stool. GPT Image 1 Mini failed to replicate the core mechanics of the pose, providing a much simpler lunging stance instead of the specific position seen in the source image.

Outfit Transfer Challenge

Editing

Edit instruction

“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”

AI Judge Analysis

GPT Image 1 Mini

+ Excellent replication of the outfit details including the coat, scarf pattern, and watch.
+ Successfully renders vitiligo patterns on the hands and arms to match the subject's face.
+ Good integration of the full body in the environment.

− Fails to keep the face and hair completely unchanged as requested.
− The posture and facial orientation of the person were altered significantly.

GPT Image 2

+ Successfully keeps the person's exact face and hair completely unchanged.
+ Maintains the original head tilt and leaning pose of the subject while applying the clothing.
+ Accurately replicates the layers and accessories from Image 2.

− The transition between the neck and the clothing has a slightly unnatural sharpness.
− The skin visibility is limited to the face, missing the opportunity to show skin patterns on the hands.

Verdict: GPT Image 2 is the winner as it followed the negative constraints much better, keeping the subject's face and hair identical to the source image while correctly applying the new clothing. GPT Image 1 Mini generated a high-quality image that captured the essence of the person and the outfit, but it essentially created a new person and changed the primary orientation of the head, violating the prompt's source preservation requirements.

The Capybara Taxi Driver

Text-to-Image

“Photorealistic scene inside a yellow New York taxi at night. A capybara is driving, wearing a yellow taxi driver cap and a dark jacket. It has a calm, professional expression and both front paws on the steering wheel. In the back seat sits a human businesswoman in a coat, looking at her phone with a completely normal, bored expression (as if this is just another normal ride). Through the windows you can see the streets of Manhattan at night with blurred lights. Realistic taxi interior, photorealistic, detailed fur and fabric, 35mm lens, night lighting with reflections, shallow depth of field.”

AI Judge Analysis

GPT Image 1 Mini

+ Excellent photorealism in the capybara's fur and the lighting of the scene.
+ Strong cinematic composition with a deeper depth of field that emphasizes the character's expression.
+ Follows the prompt about the driver's jacket and cap style effectively.

− The capybara has only one paw clearly visible on the steering wheel, whereas the prompt asked for both.

GPT Image 2

+ Follows the prompt for 'both front paws on the steering wheel' more accurately.
+ The background cityscape is more recognizable as a city environment with visible shop signs and rain effects.
+ The capybara's hat includes a logical 'T' logo for taxi.

− The capybara's right paw is fused awkwardly with the steering wheel texture.
− The lighting is slightly flatter and feels less like a cinematic film still compared to the other image.

Verdict: Both models followed the prompt very well, including the specific character traits and the indifferent passenger. GPT Image 1 Mini has superior textures and lighting, creating a more convincing photorealistic feel, while GPT Image 2 adhered better to the specific instruction of having both paws on the wheel. GPT Image 1 Mini is the likely winner for its artistic execution and higher visual quality.
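The four AI judge verdicts above can be tallied with a short Python sketch. It counts only the winner named in each verdict on this page and says nothing about the community vote. The helper names are hypothetical, and the Capybara result is recorded as a win even though the judge calls it only the "likely winner".

```python
from collections import Counter

# Winner named in each AI judge verdict on this page (not the community vote).
JUDGE_VERDICTS = {
    "Chalkboard Menu": "GPT Image 2",
    "Pose & Character Mashup": "GPT Image 2",
    "Outfit Transfer Challenge": "GPT Image 2",
    "The Capybara Taxi Driver": "GPT Image 1 Mini",  # judge: "likely winner"
}


def judge_tally(verdicts: dict) -> Counter:
    """Count how many challenges the judge awarded to each model."""
    return Counter(verdicts.values())


def judge_share(verdicts: dict) -> dict:
    """Fraction of challenges the judge awarded to each model."""
    total = len(verdicts)
    return {model: wins / total for model, wins in judge_tally(verdicts).items()}


print(judge_tally(JUDGE_VERDICTS))
print(judge_share(JUDGE_VERDICTS))
```

By this count the judge favors GPT Image 2 in three of the four shared challenges and GPT Image 1 Mini in one.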