# FLUX.1 [schnell] FP8

> FP8-quantized variant of Black Forest Labs' FLUX.1 [schnell] model, offering ~2x faster inference with reduced precision while maintaining high-quality image generation in 4 steps.

## Quick Reference

- Model ID: flux.1-schnell-fp8
- Creator: Black Forest Labs
- Status: active
- Family: flux.1
- Base URL: https://api.lumenfall.ai/openai/v1

## Specifications

- Max Resolution: 1024x1024
- Input Modalities: text
- Output Modalities: image

## Model Identifiers

- Primary Slug: flux.1-schnell-fp8

## Dates

- Released: October 2024

## Tags

image-generation, text-to-image, fast, open-weights, quantized

## Available Providers

### Fireworks AI

- Config Key: fireworks/flux.1-schnell-fp8
- Provider Model ID: accounts/fireworks/models/flux-1-schnell-fp8/text_to_image
- Regions: global
- Pricing:
  - Notes:
    - Free to try
    - Normally priced at $0.00035 per inference step
    - FLUX.1 [schnell] uses 4 steps by default, making the effective per-image cost $0.0014
    - FP8 variant uses reduced precision for ~2x faster inference
  - Source: official
  - Currency: USD
  - Components: output, metered per image, unit price $0
  - Source URL: https://fireworks.ai/pricing

## Performance Metrics

Provider performance over the last 30 days.

### fireworks

- Median Generation Time (p50): 1769ms
- 95th Percentile Generation Time (p95): 11146ms
- Average Generation Time: 3255ms
- Success Rate: 96.0%
- Total Requests: 3236
- Time to First Byte (p50): 962ms
- Time to First Byte (p95): 4736ms

## Image Gallery

1 image available for this model.

- Curated examples: 1
  - "Cinematic wide shot of a high-end, minimalist boutique storefront at dusk. The shop's large glass window reveals a wa..."

## Example Prompt

The following prompt was used to generate an example image in our playground:

A cozy street-side flower shop with a large chalkboard sign that reads "FRESH BLOOMS & WILD SUNFLOWERS" in elegant cursive.
A golden retriever sits by the door, while a small capybara rests quietly behind a bucket of tulips in the background.

## Code Examples

### Text to Image (Generation)

#### cURL

```bash
curl -X POST \
  https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux.1-schnell-fp8",
    "prompt": "A serene mountain landscape at sunset",
    "size": "1024x1024"
  }'

# Response:
# { "created": 1234567890, "data": [{ "url": "https://...", "revised_prompt": "..." }] }
```

#### JavaScript

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.lumenfall.ai/openai/v1'
});

const response = await client.images.generate({
  model: 'flux.1-schnell-fp8',
  prompt: 'A serene mountain landscape at sunset',
  size: '1024x1024'
});

// { created: 1234567890, data: [{ url: "https://...", revised_prompt: "..." }] }
console.log(response.data[0].url);
```

#### Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.lumenfall.ai/openai/v1"
)

response = client.images.generate(
    model="flux.1-schnell-fp8",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024"
)

# { created: 1234567890, data: [{ url: "https://...", revised_prompt: "..." }] }
print(response.data[0].url)
```

## About

## Overview

FLUX.1 [schnell] FP8 is a quantized version of Black Forest Labs' distilled text-to-image model, optimized for maximum inference speed. By using 8-bit floating-point precision, this variant achieves significantly lower latency and reduced memory overhead compared to the standard model. It is designed for high-throughput applications where generating competitive imagery in a handful of steps is the primary requirement.

## Strengths

* **Generation Speed:** Produces usable 1024x1024 images in just 1 to 4 sampling steps, making it one of the fastest high-resolution open-weight models available.
* **Resource Efficiency:** FP8 quantization reduces the VRAM footprint and computational load, allowing roughly 2x faster inference than the full-precision version without a proportional loss in visual quality.
* **Prompt Adherence:** Despite the lowered precision and distillation, the model retains the architectural ability to follow complex descriptive prompts and render legible, coherent text within images.
* **Output Consistency:** It maintains the structural integrity and composition characteristic of the FLUX.1 family, even at extremely low step counts.

## Limitations

* **Artistic Nuance:** Due to the distillation and quantization, it offers less stylistic flexibility and fine-grained detail than the [dev] or [pro] iterations of FLUX.1.
* **Precision Loss:** FP8 quantization can occasionally produce minor artifacts or less smooth gradients in complex lighting scenarios that 16-bit or 32-bit models handle better.
* **Step Sensitivity:** The model is tuned strictly for low step counts; increasing the sampling steps beyond the recommended range usually yields diminishing returns or visual regressions.

## Technical Background

FLUX.1 [schnell] is a latent diffusion model based on a flow-based transformer architecture. This FP8 variant applies post-training quantization to the model weights, mapping them to 8-bit precision to optimize throughput on modern hardware. The "schnell" version itself is the result of a performance-oriented distillation process, allowing the model to reach a converged image state in a fraction of the time required by standard diffusion processes.

## Best For

This model is ideal for real-time applications, rapid prototyping, and high-volume image generation workflows where operational cost and latency are critical. It is a strong choice for "generate-as-you-type" interfaces or large-scale content pipelines that require decent photorealism at minimal compute expense.
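The post-training quantization described above can be illustrated with a toy round-trip. This is a hedged sketch only, not the kernel any provider actually runs: the choice of the FP8 E4M3 format, per-tensor absmax scaling, and the simplified rounding below are illustrative assumptions, and real deployments differ in scaling granularity and subnormal handling.

```python
import numpy as np

def quantize_e4m3(x):
    """Approximate round-to-nearest FP8 E4M3 (1 sign, 4 exponent, 3 mantissa bits).

    Simplified sketch: subnormal handling is approximate and special values
    (NaN/inf) are not modeled.
    """
    sign = np.sign(x)
    mag = np.clip(np.abs(x), 0.0, 448.0)        # 448 is the E4M3 max finite value
    exp = np.floor(np.log2(np.maximum(mag, 2.0**-9)))
    exp = np.clip(exp, -6, 8)                   # normal exponent range (bias 7)
    step = 2.0 ** (exp - 3)                     # 3 mantissa bits -> 8 steps per binade
    return sign * np.round(mag / step) * step

# Toy "weight tensor" standing in for a model layer (shapes/values are made up)
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(256, 256)).astype(np.float32)

scale = np.abs(w).max() / 448.0                 # per-tensor absmax scaling
w_fp8 = quantize_e4m3(w / scale)                # values as stored in FP8
w_deq = w_fp8 * scale                           # dequantized for comparison

rel_err = np.abs(w_deq - w) / (np.abs(w) + 1e-12)
print(f"median relative error: {np.median(rel_err):.4f}")
```

The point of the sketch is the trade-off the card describes: with only 3 mantissa bits, each stored weight carries a few percent of rounding error, which halves memory traffic per weight while leaving most values close enough to preserve output quality at low step counts.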
FLUX.1 [schnell] FP8 is available for testing and integration through Lumenfall's unified API and interactive playground.

## Frequently Asked Questions

### How much does FLUX.1 [schnell] FP8 cost?

FLUX.1 [schnell] FP8 is currently free to try through Lumenfall's unified API. Standard Fireworks AI pricing is $0.00035 per inference step; at the default 4 steps, that works out to $0.0014 per image.

### How do I use FLUX.1 [schnell] FP8 via API?

You can use FLUX.1 [schnell] FP8 through Lumenfall's OpenAI-compatible API. Send requests to the unified endpoint with model ID "flux.1-schnell-fp8". Code examples are available in Python, JavaScript, and cURL.

### Which providers offer FLUX.1 [schnell] FP8?

FLUX.1 [schnell] FP8 is available through Fireworks AI on Lumenfall. Lumenfall automatically routes requests to the best available provider.

### What is the maximum resolution for FLUX.1 [schnell] FP8?

FLUX.1 [schnell] FP8 supports images up to 1024x1024 resolution.

## Links

- Model Page: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8
- About: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/about
- Providers, Pricing & Performance: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/providers
- API Reference: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/api
- Benchmarks: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/benchmarks
- Use Cases: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/use-cases
- Gallery: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/gallery
- Playground: https://lumenfall.ai/playground?model=flux.1-schnell-fp8
- API Documentation: https://docs.lumenfall.ai