FLUX.2 [klein] 9B (Black Forest Labs) vs. GPT Image 2 (OpenAI)

Settled by community votes across 4 shared challenges, with an AI judge weighing in on each.

FLUX.2 [klein] 9B

20.7 arena score

#11 of 23 in Image Editing

Skill signature · Image Editing

GPT Image 2

28.2 arena score

#3 of 44 in Text-to-Image

Top 3 in Text-to-Image

Vote tally

Where the votes landed

FLUX.2 [klein] 9B: 40.0% win rate

Ties: 0.0%

GPT Image 2: 60.0% win rate


Shared challenges 4
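The page does not say how the overall win rate is derived from the per-challenge results. One plausible reading is that votes are pooled across challenges before the percentages are taken. The sketch below illustrates that reading only: the `Tally` type, the `pooled` and `shares` helpers, and the vote counts are all hypothetical, chosen so that the pooled result matches the 40.0% / 0.0% / 60.0% split shown above.

```python
from typing import Iterable, NamedTuple, Tuple


class Tally(NamedTuple):
    """Vote counts for one challenge (hypothetical structure)."""
    flux_wins: int
    ties: int
    gpt_wins: int


def pooled(tallies: Iterable[Tally]) -> Tally:
    """Sum the vote counts of several challenges into a single tally."""
    tallies = list(tallies)
    return Tally(
        sum(t.flux_wins for t in tallies),
        sum(t.ties for t in tallies),
        sum(t.gpt_wins for t in tallies),
    )


def shares(tally: Tally) -> Tuple[float, float, float]:
    """Return (flux %, ties %, gpt %), each rounded to one decimal place."""
    total = sum(tally)
    if total == 0:
        # No votes cast: report zeros instead of dividing by zero.
        return (0.0, 0.0, 0.0)
    return tuple(round(100 * n / total, 1) for n in tally)


# Hypothetical counts, NOT taken from the page: one tally per voted challenge.
per_challenge = [
    Tally(flux_wins=0, ties=0, gpt_wins=2),
    Tally(flux_wins=1, ties=0, gpt_wins=0),
    Tally(flux_wins=1, ties=0, gpt_wins=1),
]

overall = shares(pooled(per_challenge))
print(overall)
```

Under this assumption a challenge with more votes counts for more in the overall figure, which is why the headline rate need not equal the average of the per-challenge percentages.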

Challenge by challenge

The strongest take from each model on every shared challenge, with the AI judge's read.

Magic Burger Explosion: Fiery Photorealism Challenge

Text-to-Image

“Ad for 'Magic Burger'. Dynamic, exploded burger with all components (bun, patty, cheese, lettuce, tomato, sauce) suspended in mid-air. Emphasize photorealistic detail and a sense of motion. Dark, fiery background with glowing embers. Integrate text: 'MAGIC BURGER' as a prominent title, 'LIMITED TIME ONLY' as a secondary message, and '€6.99' in a starburst, all rendered with a fiery, glowing effect.”

FLUX.2 [klein] 9B

GPT Image 2

FLUX.2 [klein] 9B: 0% wins · Ties: 0% · GPT Image 2: 100% wins

AI Judge Analysis

FLUX.2 [klein] 9B

+ Excellent text legibility and clean graphic design layout.
+ Highly photorealistic textures on the bun and patty.
+ Well-integrated starburst graphic that matches the professional ad aesthetic.

− The 'exploded' effect is safe and less dynamic than the competitor.
− The secondary text lacks the requested fiery/glowing effect, appearing as flat white.

GPT Image 2

+ Highly dynamic 'exploded' composition with sauce droplets and flying ingredients.
+ Strong adherence to the fiery, glowing text effect throughout the image.
+ Complex layering of ingredients including onions and sauce that adds to the visual interest.

− The texture of the meat patty looks slightly over-processed or crunchy rather than juicy.
− The composition feels a bit crowded and chaotic compared to a professional advertisement.

Verdict: While FLUX.2 [klein] 9B produces a very clean and professional-looking advertisement with superior food textures, GPT Image 2 better followed the prompt's request for a dynamic 'exploded' look and applied the fiery effect to all text elements. GPT Image 2 is the winner because it captured the sense of motion and the specific stylistic requirements for the text more accurately.

Chalkboard Menu

Text-to-Image

“Handwritten-style chalkboard menu in a cozy café, all text rendered in the exact same realistic chalk handwriting style with natural variations in letter size, slight slant, and chalk texture — no printed or digital fonts anywhere on the board. Title at the top in elegant cursive chalk handwriting: ‘TODAY’S SPECIALS – APRIL 30, 2026’. Below it, three menu items also in the same handwritten chalk style: ‘Truffle Mushroom Risotto – $24’, ‘Grilled Octopus with Lemon & Herbs – $28’, ‘Brown Butter Chocolate Chip Cookies – $9’. At the very bottom, smaller text in the identical handwritten chalk style (slightly smaller but still clearly legible with the same handwriting characteristics): ‘All items made fresh daily • Ask about our gluten-free options’. Warm ambient café lighting, visible chalk dust and smudges, realistic handwriting imperfections, no clean printed text anywhere.”

FLUX.2 [klein] 9B

GPT Image 2

AI Judge Analysis

FLUX.2 [klein] 9B

+ Excellent chalk texture and realistic smudge marks on the board.
+ Strong adherence to the 'cozy café' background setting.
+ Accurately rendered the specific date requested.

− Numerous spelling errors including 'Riott', 'Risoto', 'Octoopus', and 'fress'.
− Text layout is crowded with redundant price labels for the second item.

GPT Image 2

+ Perfect spelling for every item on the menu.
+ The handwriting style is much more consistent and realistic for a chalkboard.
+ Superior composition with balanced spacing and clean letterforms.

− The lighting on the board is slightly uneven, though realistic for the setting.
− The cursive title is a bit simpler than the 'elegant' descriptor might suggest.

Verdict: GPT Image 2 is the clear winner as it successfully rendered all text with perfect spelling and a very realistic chalk handwriting style. FLUX.2 [klein] 9B struggled significantly with spelling, introducing multiple typos and a cluttered layout despite having a good chalk texture.

Pose & Character Mashup

Editing

Edit instruction

“Use Image 1 as the exact pose reference and Image 2 as the character reference. Recreate the person/character from Image 2 in the exact dynamic pose and body position from Image 1. Keep the exact face, hair, clothing style/details, and expression from Image 2. Match the lighting and environment of Image 1. The final image must show the character from Image 2 performing the precise action/pose from Image 1 with perfect anatomy and natural integration.”

Source

FLUX.2 [klein] 9B

GPT Image 2

FLUX.2 [klein] 9B: 100% wins · Ties: 0% · GPT Image 2: 0% wins

AI Judge Analysis

FLUX.2 [klein] 9B

+ Successfully replicates the exact pose and environment from Image 1.
+ Captures the character's facial features and clothing accurately.
+ Maintains the distinct scarf pattern and accessory details from Image 2.

− The character's skin tone is significantly darker than in the source image.
− The left hand is clenched in a fist rather than the open-palm pose from Image 1.

GPT Image 2

+ Excellent adherence to the pose and composition of Image 1.
+ High accuracy in recreating the facial features and skin tone of the character in Image 2.
+ Perfectly captures the clothing, sunglasses, and the specific hanging style of the scarf.

− The fingers on the raised right hand are rendered slightly poorly.

Verdict: Both models followed the complex instructions exceptionally well by mapping the character from Image 2 onto the unique pose and environment of Image 1. FLUX.2 [klein] 9B performed well but altered the character's skin tone and missed the open-palm hand position, whereas GPT Image 2 stayed more consistent with the character's appearance and with the hand position of the original pose.

Outfit Transfer Challenge

Editing

Edit instruction

“Use Image 1 as the base person. Dress them in the exact elaborate outfit from Image 2 (including all layers, accessories, jewelry, and shoes). Carefully adapt the clothing to the body shape and pose in Image 1 while maintaining realistic fabric behavior, correct proportions, and perfect lighting/shadow matching. Keep the person’s exact face, hair, and background completely unchanged.”

Source

FLUX.2 [klein] 9B

GPT Image 2

FLUX.2 [klein] 9B: 50% wins · Ties: 0% · GPT Image 2: 50% wins

AI Judge Analysis

FLUX.2 [klein] 9B

+ Excellent adherence to the requested accessories, including the gold necklaces and bracelets from Image 2.
+ Very high level of source preservation for the person's face, hair, and specific skin details.
+ Strong lighting matching with realistic highlights on the clothing that fit the beach environment.

− Slightly altered the person's eye shape/expression compared to the original Image 1.
− The added necklaces were not prominent in Image 2, although they are a reasonable reading of the prompt's request for accessories and jewelry.

GPT Image 2

+ Perfect preservation of the person's face, eyes, and skin textures from Image 1.
+ Highly accurate recreation of the specific plaid pattern and texture of the scarf from Image 2.
+ Excellent integration of the coat's fit onto the subject's pose.

− Missed the jewelry/accessories from Image 2 (watch, rings) mentioned in the prompt instructions.
− The transition between the neck and the shirt collar is slightly less defined than in the FLUX.2 [klein] 9B image.

Verdict: Both models performed exceptionally well at this complex image editing task. FLUX.2 [klein] 9B followed the instruction for 'all accessories and jewelry' more thoroughly by adding visible necklaces and bracelets, whereas GPT Image 2 captured the precise identity and facial expression of the subject in Image 1 with slightly more fidelity. FLUX.2 [klein] 9B is the winner for its more comprehensive adherence to the specific request for layers and jewelry while still maintaining high source preservation.