FLUX.2 [pro] vs GPT Image 1.5
Head-to-head across 15 challenges
FLUX.2 [pro]
26.9%
win rate
Ties
15.4%
GPT Image 1.5
57.7%
win rate
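The page does not say how these percentages are tallied. As a minimal sketch, assuming each judgment is recorded as a win for one model or a tie, the shares could be computed as below. The function name `win_rates` and the counts (7, 4, 15) are hypothetical; the counts were picked only because they reproduce the displayed figures and are not taken from the page.

```python
from collections import Counter


def win_rates(outcomes):
    """Return each outcome's share of all judgments as a percentage (one decimal)."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical tallies (an assumption, not data from the page):
# 7 FLUX.2 [pro] wins, 4 ties, 15 GPT Image 1.5 wins out of 26 judgments.
outcomes = ["FLUX.2 [pro]"] * 7 + ["Ties"] * 4 + ["GPT Image 1.5"] * 15
rates = win_rates(outcomes)
print(rates)
```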
Challenge Results
Candid Street Photography
Text-to-Image
“A candid street photo of an elderly Japanese man repairing a red bicycle in light rain, reflections on wet pavement, shallow depth of field, 50mm lens, natural skin texture, imperfect framing, motion blur from passing cars, cinematic but realistic, no stylization.”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent photographic quality with realistic skin texture and lighting.
- + The bicycle details (chain, spokes, derailleur) are highly coherent and physically plausible.
- + Subtle and realistic inclusion of rain droplets on the bike and clothing.
- − The motion blur on the background car is very subtle, making it look almost static.
- − The framing feels a bit too clean and professional for the 'imperfect framing' request.
GPT Image 1.5
- + Strong adherence to the 'imperfect framing' and 'candid' aspects of the prompt.
- + The motion blur on the passing vehicle is more pronounced and dynamic.
- + Great atmosphere with visible rain and a tool tray adding to the storytelling.
- − Significant anatomical issues with the man's hands, which appear mangled and merged with the tire.
- − The bicycle's rear structure is physically impossible, with spokes disappearing and a strange wheel hub configuration.
- − Overall image has a slightly grainier, more processed look compared to the requested 50mm lens clarity.
Verdict: FLUX.2 [pro] produces a much higher quality image with superior technical accuracy in terms of human anatomy and mechanical bicycle parts. While GPT Image 1.5 captures the 'candid' and 'motion blur' energy of the prompt more effectively, its failure to render hands and the bike wheel correctly makes it less realistic overall.
Man and Car in California
Editing
“Make a photo of the man driving the car down the California coastline”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent preservation of the car's exterior design and details.
- + Seamlessly integrates the man into the driver's seat from the source photo.
- + Captures a convincing motion blur on the wheels and road.
- − The man's hair is slightly simplified compared to the source image.
- − The lighting on the man is a bit flat compared to the bright coastal surroundings.
GPT Image 1.5
- + Strong adherence to the California coastline setting with palm trees and cliffs.
- + Maintains the man's scarf and clothing textures from the source image.
- + Good high-angle composition.
- − Significant distortion of the car's interior and dashboard architecture.
- − The man's hand on the steering wheel has anatomical issues (merged fingers and awkward grip).
- − Loss of the iconic Rolls-Royce Spirit of Ecstasy hood ornament and front-end presence.
Verdict: FLUX.2 [pro] is the clear winner as it successfully combines both source images into a single, coherent scene while maintaining the integrity of the car's design. GPT Image 1.5 struggles with anatomical details in the hands and heavily alters the car's interior, making it look like a generic convertible rather than the specific model provided.
Fantasy Warrior
Text-to-Image
“Close portrait of a battle-worn paladin in ornate engraved plate armor, hair braided with small beads, faint scars and dirt on the skin, warm torchlight reflecting off metal, shallow depth of field, bokeh sparks, lifelike eyes, highly detailed texture on leather straps and cloth underlayer.”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent depiction of colored beads in the braids
- + The engraving on the armor is deep and crisp
- + Strong lifelike quality in the skin texture and eyes
- − The scars look a bit like surface paint rather than physical tissue damage
- − The lighting on the leather straps is slightly uniform
GPT Image 1.5
- + Superior battle-worn aesthetic with realistic skin moisture and grit
- + Armor engraving and cross detail are highly intricate and feel aged
- + Excellent depth of field with better integration of bokeh sparks
- − The beads in the hair are less prominent and colorful compared to FLUX.2 [pro]
- − Slightly less 'paladin' feel compared to the heavy plate look of FLUX.2 [pro]
Verdict: Both models followed the prompt exceptionally well, but GPT Image 1.5 wins due to its more realistic 'battle-worn' texture, featuring subtle skin moisture and integrated grit that feels authentic. While FLUX.2 [pro] did a better job with the specific request for beads in the hair, GPT Image 1.5 offered a more cinematic composition with superior lighting and material interaction.
Modern Clean Menu
Text-to-Image
“Modern minimalist restaurant menu design, white background with colorful food photos in grid, sections for appetizers/pizza/mains, bold sans-serif fonts, vibrant accents, clean professional layout for casual dining.”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent photographic quality and lighting in the food images.
- + Sophisticated use of typography and icons.
- + Clean, professional presentation as a physical menu mockup.
- − Significant text errors including 'MINS' instead of 'MAINS' and repetitive menu items under different headings.
- − Food photos do not match the labels (e.g., steak labeled as pizza).
- − The pricing is confusingly formatted with commas and unrealistic values.
GPT Image 1.5
- + Excellent adherence to all prompt elements including categories and grid layout.
- + Perfect text rendering for both titles and item descriptions.
- + High logical consistency between food images and their corresponding labels.
- − Composition feels a bit more like a web UI than a physical printed menu.
- − Slightly less 'premium' feel in the food photography compared to the competitor.
Verdict: While FLUX.2 [pro] produces very high-quality aesthetic visuals and a professional mockup feel, it fails significantly on the actual content logic, mislabeling food items and misspelling the 'Mains' category. GPT Image 1.5 is the superior choice because it provides a perfectly functional, readable menu with accurate text, appropriate pricing, and high logical consistency between the grid photos and the menu sections.
Geometric Composition
Text-to-Image
“A glass cube on a wooden table. Inside the cube is a small blue sphere. On top of the cube sits a red book. A green plant is behind the cube, partially visible through the glass. Soft window light from the left.”
AI Judge Analysis
FLUX.2 [pro]
- + Perfect adherence to the positional and spatial requirements of the prompt.
- + Superior photorealistic quality with natural depth of field and soft lighting.
- + Accurate glass physics, including subtle refractions of the table and background.
- − The sphere is slightly matte rather than gloss (minor preference).
GPT Image 1.5
- + Excellent color vibrance and texture on the red book.
- + Good clarity and sharp focus on the central objects.
- − The bottom of the cube appears to be a mirror rather than clear glass as requested.
- − The sphere reflection inside the base is physically inconsistent with the glass material.
- − Objects appear slightly more 'rendered' and less photographic than FLUX.2 [pro].
Verdict: FLUX.2 [pro] followed the prompt more accurately, particularly in the rendering of the glass cube's transparency and the relationship between the objects. While GPT Image 1.5 produced a vibrant image, it turned the bottom of the cube into a mirror, whereas FLUX.2 [pro] maintained a realistic, consistent glass appearance throughout the scene.
Bald man challenge
Editing
“Give the person a full, thick head of natural hair with realistic texture, density, and a natural hairline. Preserve facial features and lighting.”
AI Judge Analysis
FLUX.2 [pro]
- + Successfully added thick, voluminous hair
- + Excellent blending of hair into the sideburns
- + Maintains high resolution and realistic hair texture
- − Significantly altered the subject's facial structure, especially around the eyes and brow
- − The hair color doesn't perfectly match the existing beard color
GPT Image 1.5
- + Perfect preservation of the original facial features and bone structure
- + Hair texture and color match the beard and overall aesthetic of the source image much better
- + Realistic hairline and blending with the forehead
- − The hair volume is slightly more modest compared to FLUX.2 [pro]'s interpretation of 'full and thick'
Verdict: While FLUX.2 [pro] provided a very thick head of hair, it failed the preservation requirement by significantly altering the person's face. GPT Image 1.5 successfully added a natural, realistic head of hair that perfectly matches the existing beard while keeping the identity of the person in the source image identical.
Night Sky Transformation
Editing
“Change the scene to night: a deep, dark sky with subtle, glistening stars visible behind the mountain.”
AI Judge Analysis
FLUX.2 [pro]
- + Perfectly preserves the layout and house structures of the source image.
- + Correctly removes the sunset glow from the mountain peak.
- + Subtle and realistic star field implementation.
GPT Image 1.5
- + Very impressive and atmospheric star field/milky way effect.
- + Successful transformation to a dark night mood.
- − Loses a significant amount of detail in the foreground village.
- − Slightly alters the mountain's shape and snow patterns compared to the source.
Verdict: FLUX.2 [pro] is the winner because it successfully transforms the sky to night while maintaining near-perfect pixel-level consistency with the source image's foreground and architecture. GPT Image 1.5 creates a beautiful starry sky, but it loses too much detail in the village and changes the structural details of the scene which should have been preserved.
Adorable Baby Animals in Sunny Meadow
Text-to-Image
“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent anatomical details on the kitten and fox paws.
- + Superior lighting with realistic dew sparkles and gentle god rays.
- + Clean, artistic composition that feels less cluttered.
- − Failed to include the baby bunny requested in the prompt.
- − The kitten's stripes look slightly artificial/painted on.
GPT Image 1.5
- + Included all four requested animals: puppy, kitten, bunny, and fox.
- + Captured the 'tumbling together' action more effectively than the other model.
- + Vibrant golden hour lighting with strong god rays.
- − Noticeable anatomical errors, such as the kitten having three hind legs/paws in the air.
- − The fox's face/mouth area is a bit messy and less defined.
- − Higher level of visual noise and over-sharpening compared to FLUX.2 [pro].
Verdict: FLUX.2 [pro] produced a much cleaner and more aesthetically pleasing image with better lighting and textures, but it completely missed the bunny. GPT Image 1.5 adhered better to the prompt by including all characters and the requested tumbling action, but the image contains significant anatomical glitches and lacks the refined polish of the first model. FLUX.2 [pro] is preferred for its technical quality despite the missing element, as GPT Image 1.5's errors are quite distracting.
Over-the-top cartoon caricature
Editing
“Create a caricature of me and my job. Make it exaggerated and humorous, incorporating my profession as a tv show anchor and my love for dogs and hockey.”
AI Judge Analysis
FLUX.2 [pro]
- + Perfectly captures the caricature style with simplified cartoon features.
- + Excellent integration of all themes into a cohesive TV studio set.
- + Cleverly includes ice rinks on the desk and background screens.
- − The facial likeness to the source image is somewhat reduced by the generic cartoon style.
- − The character's hands are a bit blocky and awkwardly proportioned.
GPT Image 1.5
- + Stronger facial likeness to the source person within the caricature style.
- + High level of detail in the hair and background elements.
- + Humorous and creative inclusion of a dog wearing a hockey helmet.
- − The 'NEWS' microphone head is oddly shaped and slightly detached from the handle.
- − The hand holding the papers has an extra finger/anatomical confusion.
Verdict: Both models followed the instructions well, but GPT Image 1.5 is the winner for its superior ability to maintain the facial likeness of the source subject while applying the caricature effect. While FLUX.2 [pro] created a very clean cartoon scene, it felt more like a generic character, whereas GPT Image 1.5 captured the specific smile and features of the woman in the original photo alongside a funnier 'hockey dog.'
Heroic Super Hero Portrait
Text-to-Image
“Hyper-photorealistic full-body portrait of a female superhero standing triumphantly on a New York skyscraper rooftop at golden sunset, wearing a classic modest superhero costume with flowing cape, chest emblem, gloves, and boots in red and blue colors, practical design, short hair, strong determined heroic expression looking into the distance, powerful confident stance with hands on hips and cape billowing dramatically in the wind, detailed urban cityscape background, warm natural sunlight with sharp shadows and fabric highlights, ultra-sharp textures on suit, hair, and concrete, 8K masterpiece, empowering family-friendly style.”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent full-body composition with strong perspective and scale.
- + Highly detailed, modern armored superhero suit design.
- + Natural integration of the character into the lighting and atmosphere of the scene.
- − The chest emblem is slightly muddy and less distinct.
- − The pose is a bit stiff in the lower body.
GPT Image 1.5
- + Very clear and bold 'classic' suit design with a distinct emblem.
- + Accurate short-haired character portrait with high facial clarity.
- + Very dramatic lighting and vibrant color palette.
- − The anatomy of the legs is slightly disproportionate and lacks the 'practical' feel requested.
- − The rendering looks a bit more like a composite photo than a seamless shot.
- − The cape attachment to the shoulders looks awkward and lacks physical weight.
Verdict: FLUX.2 [pro] produces a much more realistic and 'practical' superhero design that feels grounded in the environment, with superior textures and lighting. GPT Image 1.5 captures the classic comic book aesthetic and short hair prompt better but suffers from slightly uncanny anatomy in the legs and a less convincing integration into the background. FLUX.2 [pro] is the winner for its better adherence to the 'hyper-photorealistic' and 'practical' requirements.
Neutral Expression to Genuine Smile
Editing
{
"action": "image_edit",
"reference": "uploaded neutral portrait",
"change": "Warm genuine Duchenne smile: lips curved up, slight natural teeth, soft eye crinkles, subtle cheek raise",
"details": "Realistic smiling skin (dimples if present, soft cheek shadows), slightly brighter eyes; keep exact eye shape/color/iris",
"preserve_exact": "Face identity/structure, eyes/nose/lips/eyebrows, hair, skin texture/pores/freckles, makeup, clothing, head pose, background, lighting, shadows, framing",
"no_changes": "No face shape change, no new features, no gaze shift, no hair/clothing/lighting/background edits",
"style": "Ultra-photorealistic 8K portrait, sharp face focus, natural soft lighting, realistic skin glow"
}
AI Judge Analysis
FLUX.2 [pro]
- + Natural Duchenne smile with believable eye crinkling and cheek raising.
- + Excellent preservation of skin texture, freckles, and general lighting color.
- + Maintains the subject's identity and hair structure very closely.
- − The eyes look slightly more 'squinted' and less bright compared to the source.
- − Small loss of detail in the iris texture compared to the source.
GPT Image 1.5
- + Strong preservation of the original eye shape, iris detail, and brightness.
- + Successful implementation of the requested smile and natural teeth.
- + Maintains high-frequency skin details and freckle patterns perfectly.
- − The smile looks slightly less 'Duchenne' than FLUX.2 [pro]'s, with fewer micro-expressions around the eyes.
- − Minor softening of the skin on the cheeks.
Verdict: Both models performed excellently, maintaining the subject's identity and original image characteristics almost perfectly. FLUX.2 [pro] captured a slightly more authentic Duchenne smile with better cheek and eye-corner involvement, whereas GPT Image 1.5 did a superior job of preserving the exact brightness and intricate detail of the irises.
Studio Ghibli Anime Style
Editing
“Transform this photo into a Studio Ghibli–inspired illustration. Use soft pastel colors, hand-painted textures, gentle lighting, dreamy backgrounds, and a warm, nostalgic mood”
AI Judge Analysis
FLUX.2 [pro]
- + Perfectly captures the Studio Ghibli art style with clean linework and watercolor textures.
- + Maintains excellent structural preservation of the original meme characters' poses and clothing.
- + The background adds Ghibli-esque details like Japanese signage and a hand-painted town aesthetic.
- − The color palette is slightly desaturated compared to the 'lush' look sometimes associated with Ghibli.
GPT Image 1.5
- + Excellent use of warm, dreamy lighting and soft pastel colors as requested.
- + Captures the emotional essence of the prompt through a shimmering, nostalgic atmosphere.
- − The art style leans more toward generic modern 'shoujo' anime rather than the specific Ghibli aesthetic.
- − Significant loss of detail and clarity due to excessive glow and soft-focus effects.
Verdict: FLUX.2 [pro] is the clear winner as it masterfully replicates the specific Studio Ghibli art style while retaining the unmistakable composition and character details of the original meme. GPT Image 1.5 provides a beautiful, dreamy atmosphere, but it fails to capture the distinct Ghibli linework and character design, resulting in a more generic anime look.
Golden Hour Stroll
Editing
“Add dynamic motion to this photo: make hair blow in the wind, add leaves flying, energetic and lively feel.”
AI Judge Analysis
FLUX.2 [pro]
- + Successfully added wind effect to hair while keeping the face recognizable.
- + Leaves are integrated into the foreground, including some overlapping the dog.
- + Good preservation of the original bridge and background elements.
- − The hair edit creates some jagged artifacts against the sky.
- − A leaf is awkwardly sticking out of the dog's mouth.
GPT Image 1.5
- + Excellent addition of many dynamic, semi-blurred leaves for a sense of movement.
- + Very natural hair-in-the-wind effect with fine strands.
- + High level of consistency with the original background and lighting.
- − Tiny alteration to the smile/facial features compared to the original.
Verdict: Both models followed the instructions well, adding blowing hair and flying leaves. FLUX.2 [pro] placed many leaves in the immediate foreground, but some feel static or poorly placed (like the leaf in the dog's mouth). GPT Image 1.5 created a much more 'energetic and lively' feel by using different sizes and slight motion blurs on the leaves, along with a more realistic wind effect on the hair.
Vintage Cafe Logo
Text-to-Image
“Vintage minimalist restaurant logo for "Caffè Florian", retro cloche dome with steam and "Est. 1720" banner, classic typography, warm brown and cream tones, subtle texture on light background, vector emblem style.”
AI Judge Analysis
FLUX.2 [pro]
- + Excellent adherence to the 'light background' and 'vector emblem' prompt requirements.
- + Perfect text rendering for both 'Caffè Florian' and 'Est. 1720'.
- + Clean, balanced composition with professional minimalist aesthetic.
- − The steam effect is a bit abstract compared to a natural plume.
GPT Image 1.5
- + Strong vintage feel with high-quality shading on the cloche.
- + Accurately captured all text elements and the banner style.
- − Ignored the 'light background' requirement by using a black background.
- − The composition feels slightly bottom-heavy with the large banner.
Verdict: FLUX.2 [pro] followed the prompt more accurately, specifically adhering to the request for a light background and a minimalist vector style. While GPT Image 1.5 produced a detailed and visually appealing illustrative logo, it failed the background color requirement and feels less like a clean vector emblem than its competitor.
Apollo 11: Journey to Tranquility
Text-to-Image
“Create a clean, modern vector infographic poster about the Apollo 11 mission. NASA-inspired palette (navy, white, muted red, light gray). Flat-vector style, crisp lines, consistent iconography, subtle gradients only. Steps (stop at landing): 1. Launch (Saturn V icon) 2. Earth Orbit (Earth + orbit ring icon) 3. Translunar (trajectory arc icon) 4. Lunar Orbit (Moon + orbit ring icon) 5. Descent (lunar module descending icon) 6. Landing (lunar module on the surface icon) Small supporting elements (minimal text): • Crew strip: three silhouette icons with only last names: Armstrong, Aldrin, Collins. • Landing site marker: Moon pin labeled "Tranquility" only. Layout constraints: generous margins, large readable labels, clean background with subtle stars. Vector-only, print-poster look, high resolution.”
AI Judge Analysis
FLUX.2 [pro]
- + Clean, minimalist vector aesthetic that perfectly matches the 'modern infographic' request.
- + Higher text legibility for the main headings and names.
- + Sophisticated composition that uses vertical space well to tell a story.
- − The Saturn V icon for the 'Launch' step is poorly rendered and inaccurate.
- − Filler body text is gibberish.
GPT Image 1.5
- + Much better technical accuracy for icons, specifically the Saturn V and Lunar Module.
- + Stronger adherence to the NASA-inspired color palette with bold use of muted red.
- + Clearer distinction between all six requested phases of the mission.
- − The layout is quite cramped, with elements overlapping the borders and the top text cut off.
- − Inconsistent line weights and styles across different icons.
Verdict: FLUX.2 [pro] produces a more 'finished' looking piece of graphic design with elegant typography and a cohesive modern style, though its technical icons are weak. GPT Image 1.5 creates a much more accurate representation of the Apollo hardware and follows the step-by-step prompt more closely, but the final image suffers from poor framing and a cluttered composition. GPT Image 1.5 is the preferred choice for a functional infographic despite the layout issues, as its icons actually represent the subject matter correctly.
FLUX.2 [pro]
Black Forest Labs' state-of-the-art image generation model with maximum quality and speed, supporting text-to-image and multi-reference image editing with up to 4MP output
GPT Image 1.5
OpenAI's state-of-the-art image generation model with better instruction following and adherence to prompts