The Halloween Invitation
14 models were given the same prompt, and the community voted blind on which outputs looked best.
While rendering simple text is no longer a real challenge for today’s SOTA models, this test shows something more interesting: how much visual taste a model has, and whether it can create a layout that feels like it came from a professional designer instead of a basic Canva template (no offense to Canva).
#1 — Seedream 4.0
Challenge Rankings
| # | Model | Elo |
|---|---|---|
| 1 | Seedream 4.0 | 1164 |
| 2 | Nano Banana 2 | 1164 |
| 3 | FLUX.2 [klein] | 1152 |
| 4 | | 1143 |
| 5 | | 1126 |
| 6 | | 1099 |
| 7 | | 1095 |
| 8 | | 1081 |
| 9 | | 1081 |
| 10 | | 1079 |
| 11 | | 1074 |
| 12 | | 1063 |
| 13 | | 1056 |
| 14 | | 984 |
Seedream 4.0 and Nano Banana 2 share the top spot at 1164 Elo, suggesting that mid-priced models currently offer the best balance of complex layout design and text precision. The 4B-parameter FLUX.2 [klein] is a notable outlier: it takes third place (1152 Elo) while costing 95% less and generating images three times faster than the two leaders.
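To put the Elo gaps in the rankings into perspective, here is a minimal sketch of the textbook Elo model applied to blind pairwise votes. The 400-point scale and the K-factor of 32 are assumptions (the conventional chess defaults); this page does not state which parameters or rating variant the leaderboard actually uses, and the function names are illustrative only.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of votes for A against B under the standard Elo model.

    The 400-point scale is an assumption, not something stated by the leaderboard.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one head-to-head vote.

    k is a hypothetical K-factor; the winner gains exactly what the loser gives up.
    """
    actual = 1.0 if a_won else 0.0
    delta = k * (actual - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Ratings taken from the table: first place vs. third place, and first vs. last.
top_vs_third = expected_score(1164, 1152)
top_vs_last = expected_score(1164, 984)
print(f"1164 vs 1152: {top_vs_third:.3f}")
print(f"1164 vs 984:  {top_vs_last:.3f}")
```

Under these assumptions, the 12-point gap between first and third place corresponds to a near coin flip in a head-to-head vote, while the 180-point gap between first and last place implies the leader would be preferred in roughly three out of four matchups.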
Elo vs Cost
Elo vs Speed
Competitors
14 models, ranked by Elo
GLM-Image
Highlighted Battles
The most competitive head-to-head matchups, selected by closeness and vote count.