“Hyper-photorealistic scene of fluffy baby animals—a golden retriever puppy, tabby kitten, baby bunny, and red fox kit—with big expressive eyes and ultra-detailed soft fur, playfully chasing butterflies and tumbling together in a lush wildflower meadow, warm golden sunrise light with god rays and dew sparkles, joyful wholesome vibe, 8K masterpiece.”
Google's latest Imagen 4.0 text-to-image generation model with significantly better text rendering and overall image quality
Details
imagen-4.0-generate-001
Ready to integrate?
Access imagen-4.0-generate-001 via our unified API.
Prices shown are in USD
Providers & Pricing (4)
Imagen 4.0 Generate 001 is available from 4 providers, with per-image pricing starting at $0.04 through fal.ai.
fal/imagen-4.0-generate-001
gemini/imagen-4.0-generate-001
replicate/imagen-4.0-generate-001
vertex/imagen-4.0-generate-001
Imagen 4.0 Generate 001 API (OpenAI-compatible)
Integrate Imagen 4.0 Generate 001 into your workflow via Lumenfall's OpenAI-compatible API to generate high-quality images and precise text-in-image renders.
Base URL: https://api.lumenfall.ai/openai/v1
Model: imagen-4.0-generate-001
Code Examples
Text to Image
/v1/images/generations
curl -X POST \
https://api.lumenfall.ai/openai/v1/images/generations \
-H "Authorization: Bearer $LUMENFALL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "imagen-4.0-generate-001",
"prompt": "",
"size": "1024x1024"
}'
# Response:
# { "created": 1234567890, "data": [{ "url": "https://...", "revised_prompt": "..." }] }
// JavaScript
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'YOUR_API_KEY',
baseURL: 'https://api.lumenfall.ai/openai/v1'
});
const response = await client.images.generate({
model: 'imagen-4.0-generate-001',
prompt: '',
size: '1024x1024'
});
// { created: 1234567890, data: [{ url: "https://...", revised_prompt: "..." }] }
console.log(response.data[0].url);
# Python
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.lumenfall.ai/openai/v1"
)
response = client.images.generate(
model="imagen-4.0-generate-001",
prompt="",
size="1024x1024"
)
# { created: 1234567890, data: [{ url: "https://...", revised_prompt: "..." }] }
print(response.data[0].url)
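The same request can also be built with only the Python standard library, which is handy when the `openai` package is not installed. The following is a minimal sketch, not an official client: the helper names `build_generation_request` and `send` are hypothetical, the local empty-prompt check is an assumption based on `prompt` being required, and the API key is read from the `LUMENFALL_API_KEY` environment variable used in the cURL example.

```python
import json
import os
import urllib.request

# Base URL taken from the examples above.
BASE_URL = "https://api.lumenfall.ai/openai/v1"


def build_generation_request(prompt, api_key, size="1024x1024"):
    """Build (but do not send) a POST request to /images/generations."""
    # Local sanity check (an assumption): prompt is required, so reject empty text.
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    payload = {
        "model": "imagen-4.0-generate-001",
        "prompt": prompt,
        "size": size,
    }
    return urllib.request.Request(
        f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Usage (performs a real network call, so it is left commented out):
# req = build_generation_request("A red fox kit in a meadow",
#                                os.environ["LUMENFALL_API_KEY"])
# print(send(req)["data"][0]["url"])
```

Separating request construction from sending keeps the payload easy to inspect or log before any billable call is made.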
Parameter Reference
Core Parameters
| Parameter | Type | Description | Modes |
|---|---|---|---|
| prompt | string | Required. Text prompt for image generation. | T2I |
Size & Layout
| Parameter | Type | Description | Modes |
|---|---|---|---|
| size | string | Image dimensions as WxH pixels (e.g. "1024x1024") or aspect ratio (e.g. "16:9"). WxH determines both shape and scale (aspect_ratio and resolution are ignored when size is provided). W:H format is equivalent to aspect_ratio. | T2I |
| aspect_ratio | string | Aspect ratio of the output image (e.g. "16:9", "1:1"). Controls shape independently of scale. Use with resolution to control both. If size is also provided, size takes precedence. Any ratio is accepted and mapped to the nearest supported value. | T2I |
| resolution | string | Output resolution tier (e.g. "1K", "4K"). Controls scale independently of shape. Higher tiers produce larger images and cost more. If size is also provided, size takes precedence for scale. Any tier is accepted and mapped to the nearest supported value. | T2I |
| Parameter | Controls | Example |
|---|---|---|
| size | Exact pixel dimensions | "1920x1080" |
| aspect_ratio | Shape only, default scale | "16:9" |
| resolution | Scale tier, preserves shape | "1K" |

Priority when combined
size is most specific and always wins. aspect_ratio and resolution control shape and scale independently.
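The priority rule can be sketched as a small function that reports which parameter ends up deciding shape and which decides scale. This is an illustrative reading of the rules stated here, not the gateway's actual code: the function name `effective_layout` is hypothetical, and treating a `W:H` value in `size` as shape-only (so that `resolution` still sets scale) is an assumption drawn from the note that `W:H` is equivalent to `aspect_ratio`.

```python
def effective_layout(size=None, aspect_ratio=None, resolution=None):
    """Return which parameter determines shape and scale, per the stated priority.

    Each entry is a (source_parameter, value) tuple; ("default", None) means
    the model's default applies.
    """
    # WxH pixel size is the most specific form: it sets both shape and scale,
    # and aspect_ratio/resolution are ignored.
    if size is not None and "x" in size.lower():
        return {"shape": ("size", size), "scale": ("size", size)}

    # Assumption: a W:H value passed as size behaves like aspect_ratio,
    # so it decides shape only and still wins over aspect_ratio.
    if size is not None:
        shape = ("size", size)
    elif aspect_ratio is not None:
        shape = ("aspect_ratio", aspect_ratio)
    else:
        shape = ("default", None)

    scale = ("resolution", resolution) if resolution is not None else ("default", None)
    return {"shape": shape, "scale": scale}


# Pixel size overrides everything else.
print(effective_layout(size="1920x1080", aspect_ratio="1:1", resolution="4K"))
# Shape and scale chosen independently.
print(effective_layout(aspect_ratio="16:9", resolution="1K"))
```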
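How matching works
Aspect ratio: a requested ratio is mapped to the nearest supported one. For example, if you request 7:1 on a model with 4:1 and 8:1, you get 8:1.
Resolution: models may use K tiers (0.5K, 1K, 2K, 4K) or megapixel tiers (0.25, 1). If the exact tier isn't available, you get the nearest one.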
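Nearest-value matching can be illustrated with a short sketch. The distance metric the gateway actually uses is not documented here, so the choices below are assumptions: ratios are compared in log space (so 2:1 and 1:2 are equally far from 1:1) and K tiers are compared by absolute difference. The function names are hypothetical.

```python
import math


def parse_ratio(text):
    """Convert a "W:H" string into a float, e.g. "16:9" -> 1.777..."""
    width, height = text.split(":")
    return float(width) / float(height)


def nearest_ratio(requested, supported):
    """Pick the supported aspect ratio closest to the requested one (log distance)."""
    target = math.log(parse_ratio(requested))
    return min(supported, key=lambda r: abs(math.log(parse_ratio(r)) - target))


def tier_value(tier):
    """Convert a K tier string into a number, e.g. "0.5K" -> 0.5."""
    return float(tier.upper().rstrip("K"))


def nearest_tier(requested, supported):
    """Pick the supported resolution tier closest to the requested one."""
    target = tier_value(requested)
    return min(supported, key=lambda t: abs(tier_value(t) - target))


print(nearest_ratio("7:1", ["4:1", "8:1"]))       # 8:1, matching the example above
print(nearest_tier("0.5K", ["1K", "2K", "4K"]))   # 1K
```

With a tie (for example "3K" between "2K" and "4K"), `min` returns whichever candidate appears first in the list; a real implementation would need an explicit tie-breaking rule.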
Output & Format
| Parameter | Type | Description | Modes |
|---|---|---|---|
| response_format | string | How to return the image: url or b64_json. Default: "url". | T2I |
| output_format | string | Output image format: png, jpeg, gif, webp, or avif. The gateway converts to the requested format if the provider doesn't support it natively. | T2I |
| output_compression | integer | Compression level for lossy formats (JPEG, WebP, AVIF). | T2I |
| n | integer | Number of images to generate. Default: 1. The gateway generates multiple images in parallel even if the provider only supports 1. | T2I |
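When `response_format` is `b64_json`, the images arrive inline rather than as URLs, so they need to be decoded before use. The sketch below assumes the response follows the OpenAI convention of a `b64_json` field on each entry of `data` (the examples on this page only show the `url` shape), and the helper name `save_images` is hypothetical.

```python
import base64
from pathlib import Path


def save_images(response, out_dir, output_format="png"):
    """Decode b64_json entries from an images response and write them to disk.

    `response` is the parsed JSON body, e.g. {"created": ..., "data": [...]}.
    Returns the list of written file paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, item in enumerate(response["data"]):
        # Assumed field name, following the OpenAI images response convention.
        if "b64_json" not in item:
            raise ValueError("no b64_json field; was response_format='b64_json' set?")
        # Use the requested output_format as the file extension.
        path = out / f"image_{index}.{output_format}"
        path.write_bytes(base64.b64decode(item["b64_json"]))
        paths.append(path)
    return paths
```

Requesting `n` greater than 1 yields several entries in `data`, which is why the helper numbers the files by index.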
Parameter Normalization
How we handle parameters across different providers
Not every provider speaks the same language. When you send a parameter, we handle it in one of four ways depending on what the model supports:
| Behavior | What happens | Example |
|---|---|---|
| passthrough | Sent as-is to the provider | style, quality |
| renamed | Same value, mapped to the field name the provider expects | prompt |
| converted | Transformed to the provider's native format | size |
| emulated | Works even if the provider has no concept of it | n, response_format |
Parameters we don't recognize pass straight through to the upstream API, so provider-specific options still work.
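To make the four behaviors concrete, here is a toy normalizer for an invented upstream provider. Everything about that provider is hypothetical and chosen only for illustration: it is assumed to call the prompt `text`, to take `width` and `height` instead of `size`, and to produce one image per call. This is not how the gateway is implemented, only a sketch of the idea.

```python
def normalize(params):
    """Translate unified parameters into payloads for a hypothetical provider.

    Returns a list of upstream payloads, one per image requested.
    """
    params = dict(params)  # copy so the caller's dict is not modified

    # emulated: the hypothetical provider has no `n`, so fan out into n calls.
    n = params.pop("n", 1)

    upstream = {}
    for key, value in params.items():
        if key == "prompt":
            # renamed: same value, different field name.
            upstream["text"] = value
        elif key == "size":
            # converted: "WxH" string becomes separate integer fields.
            width, height = value.lower().split("x")
            upstream["width"] = int(width)
            upstream["height"] = int(height)
        else:
            # passthrough: known-compatible and unrecognized keys are sent as-is.
            upstream[key] = value

    return [dict(upstream) for _ in range(n)]


payloads = normalize(
    {"prompt": "a cat", "size": "1024x768", "n": 2, "style": "vivid", "seed": 7}
)
print(len(payloads))   # 2
print(payloads[0])
```

Note that `seed` is not one of the parameters documented here; it stands in for any provider-specific option that would pass straight through.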
Imagen 4.0 Generate 001 Benchmarks
Google's Imagen 4.0 Generate 001 holds rank #27 in the Text-to-Image arena with a competitive Elo score of 1146. This model demonstrates significant improvements in text rendering accuracy and compositional fidelity over previous Google iterations.
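An Elo score is easiest to interpret as a predicted win rate against another model. The sketch below uses the standard logistic Elo formula with a scale of 400; whether the arena behind this ranking uses exactly this formula is an assumption, and the opponent rating of 1246 is a made-up value for illustration.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) of A against B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


IMAGEN_ELO = 1146            # from the ranking above
HYPOTHETICAL_OPPONENT = 1246  # invented rating, 100 points higher

# Equal ratings give an even matchup.
print(elo_expected_score(IMAGEN_ELO, IMAGEN_ELO))  # 0.5
# Against a model rated 100 points higher, the expected score is about 0.36.
print(round(elo_expected_score(IMAGEN_ELO, HYPOTHETICAL_OPPONENT), 2))
```

Under this formula, a 100-point gap corresponds to roughly a 64/36 split in blind comparisons, so small rank differences near the same score imply near-even matchups.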
Text-to-Image Landscape
[Charts: Elo vs Cost and Elo vs Speed. 8 entries without speed data are omitted.]
Top Matchups
See how Imagen 4.0 Generate 001 performs head-to-head against other AI models, ranked by community votes in blind comparisons.
Imagen 4.0 Generate 001 FAQ
How much does Imagen 4.0 Generate 001 cost?
Imagen 4.0 Generate 001 starts at $0.04 per image through Lumenfall. Pricing varies by provider. Lumenfall does not add any markup to provider pricing.
How do I use Imagen 4.0 Generate 001 via API?
You can use Imagen 4.0 Generate 001 through Lumenfall's OpenAI-compatible API. Send requests to the unified endpoint with model ID "imagen-4.0-generate-001". Code examples are available in Python, JavaScript, and cURL.
Which providers offer Imagen 4.0 Generate 001?
Imagen 4.0 Generate 001 is available through fal.ai, Vertex AI, Replicate, and Gemini API on Lumenfall. Lumenfall automatically routes requests to the best available provider.
Overview
Imagen 4.0 Generate 001 is Google’s fourth-generation text-to-image model, designed to synthesize high-fidelity visuals from natural language descriptions. Developed by Google DeepMind, this iteration targets two long-standing weaknesses of diffusion models: accurate rendering of complex typography and adherence to detailed, multi-part prompts. It is positioned as a significant step up from the 3.0 series in spatial reasoning and fine-grained detail.
Strengths
- Precise Text Rendering: The model demonstrates a high success rate when embedding specific strings, legible words, and long phrases into images, minimizing the common “gibberish” artifacts found in earlier generation models.
- Nuanced Prompt Adherence: It excels at interpreting complex instructions that involve multiple subjects, specific lighting conditions (e.g., “volumetric God rays”), and precise camera angles without merging distinct elements.
- Compositional Realism: The model exhibits improved spatial awareness, accurately placing objects in relation to one another according to prepositional commands (e.g., “behind,” “to the left of,” or “resting on”).
- High-Fidelity Textures: It produces sharp, realistic textures for challenging subjects such as human skin, woven fabrics, and reflective surfaces, reducing the “plastic” look often associated with AI-generated imagery.
Limitations
- Photorealistic Bias: While capable of various styles, the model can lean toward a “stock photo” aesthetic unless specific artistic styles or medium-specific keywords (e.g., “charcoal sketch” or “35mm film grain”) are heavily emphasized.
- Anatomical Edge Cases: Like most diffusion models, it may still struggle with extreme anatomical poses or complex overlapping of limbs in crowded scenes.
- Generation Latency: Due to the model’s increased parameter count and complexity, inference times may be slightly higher compared to “Turbo” or “Lightning” variants of competing models.
Technical Background
Imagen 4.0 is built upon an evolved transformer-based diffusion architecture, likely utilizing a large text encoder such as T5-XXL to capture linguistic semantics before the image synthesis phase begins. This version incorporates a more robust training dataset focused on highly descriptive captions and high-resolution aesthetics. Key technical refinements were made to the sampling process to ensure that textural details remain coherent even at the edges of the frame.
Best For
- Marketing and Ad Copy: Creating hero images that require integrated legible text, such as signs, storefronts, or branded packaging.
- Concept Art: Generating detailed character designs and environments that require strict adherence to specific stylistic and spatial prompts.
- UI/UX Prototyping: Visualizing app interfaces and website layouts where text placement and icon clarity are essential.
Imagen 4.0 Generate 001 is available for testing and integration through Lumenfall’s unified API and interactive playground, allowing developers to compare its output alongside other industry-leading image models.
Try Imagen 4.0 Generate 001 in Playground
Generate images with custom prompts — no API key needed.