# FLUX.1 [schnell] FP8

> FP8-quantized variant of Black Forest Labs' FLUX.1 [schnell] model, offering ~2x faster inference with reduced precision while maintaining high-quality image generation in 4 steps.

## Quick Reference

- Model ID: flux.1-schnell-fp8
- Creator: Black Forest Labs
- Status: active
- Family: flux.1
- Base URL: https://api.lumenfall.ai/openai/v1

## Specifications

- Max Resolution: 1024x1024
- Input Modalities: text
- Output Modalities: image

## Model Identifiers

- Primary Slug: flux.1-schnell-fp8

## Dates

- Released: October 2024

## Tags

image-generation, text-to-image, fast, open-weights, quantized

## Available Providers

### Fireworks AI

- Config Key: fireworks/flux.1-schnell-fp8
- Provider Model ID: accounts/fireworks/models/flux-1-schnell-fp8/text_to_image
- Regions: global
- Pricing:
  - Notes:
    - Free to try
    - Normally priced at $0.00035 per inference step
    - FLUX.1 [schnell] uses 4 steps by default, making the effective per-image cost $0.0014
    - FP8 variant uses reduced precision for ~2x faster inference
  - Source: official
  - Currency: USD
  - Components: output, metered per image, unit price $0
  - Source URL: https://fireworks.ai/pricing

## Performance Metrics

Provider performance over the last 30 days.

### fireworks

- Median Generation Time (p50): 1769ms
- 95th Percentile Generation Time (p95): 11146ms
- Average Generation Time: 3255ms
- Success Rate: 96.0%
- Total Requests: 3236
- Time to First Byte (p50): 962ms
- Time to First Byte (p95): 4736ms

## Image Gallery

1 image available for this model.

- Curated examples: 1
  - "Cinematic wide shot of a high-end, minimalist boutique storefront at dusk. The shop's large glass window reveals a wa..."

## Example Prompt

The following prompt was used to generate an example image in our playground:

A cozy street-side flower shop with a large chalkboard sign that reads "FRESH BLOOMS & WILD SUNFLOWERS" in elegant cursive.
A golden retriever sits by the door, while a small capybara rests quietly behind a bucket of tulips in the background.

## Code Examples

### Text to Image (Generation)

#### cURL

```bash
curl -X POST \
  https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux.1-schnell-fp8",
    "prompt": "A serene mountain landscape at sunset",
    "size": "1024x1024"
  }'

# Response:
# { "created": 1234567890, "data": [{ "url": "https://...", "revised_prompt": "..." }] }
```

#### JavaScript

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.lumenfall.ai/openai/v1'
});

const response = await client.images.generate({
  model: 'flux.1-schnell-fp8',
  prompt: 'A serene mountain landscape at sunset',
  size: '1024x1024'
});

// { created: 1234567890, data: [{ url: "https://...", revised_prompt: "..." }] }
console.log(response.data[0].url);
```

#### Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.lumenfall.ai/openai/v1"
)

response = client.images.generate(
    model="flux.1-schnell-fp8",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024"
)

# { created: 1234567890, data: [{ url: "https://...", revised_prompt: "..." }] }
print(response.data[0].url)
```

## About

## Overview

FLUX.1 [schnell] FP8 is a quantized version of Black Forest Labs' distilled text-to-image model, optimized for maximum inference speed. By using 8-bit floating-point precision, this variant achieves significantly lower latency and reduced memory overhead compared to the standard model. It is designed for high-throughput applications where generating competitive imagery in a handful of steps is the primary requirement.

## Strengths

* **Generation Speed:** Produces usable 1024x1024 images in just 1 to 4 sampling steps, making it one of the fastest high-resolution open-weight models available.
* **Resource Efficiency:** FP8 quantization reduces the VRAM footprint and computational load, allowing roughly 2x faster inference than the full-precision version without a proportional loss in visual quality.
* **Prompt Adherence:** Despite the lowered precision and distillation, the model retains the architectural ability to follow complex descriptive prompts and render legible, coherent text within images.
* **Output Consistency:** It maintains the structural integrity and composition characteristic of the FLUX.1 family, even at extremely low step counts.

## Limitations

* **Artistic Nuance:** Due to the distillation and quantization, it offers less stylistic flexibility and fine-grained detail than the [dev] or [pro] iterations of FLUX.1.
* **Precision Loss:** FP8 quantization can occasionally produce minor artifacts or less smooth gradients in complex lighting scenarios that 16-bit or 32-bit models handle better.
* **Step Sensitivity:** The model is tuned strictly for low step counts; increasing the sampling steps beyond the recommended range usually yields diminishing returns or visual regressions.

## Technical Background

FLUX.1 [schnell] is a latent diffusion model based on a flow-based transformer architecture. This FP8 variant applies post-training quantization to the model weights, mapping them to 8-bit precision to optimize throughput on modern hardware. The "schnell" version itself is the result of a performance-oriented distillation process, allowing the model to reach a converged image state in a fraction of the time required by standard diffusion processes.

## Best For

This model is ideal for real-time applications, rapid prototyping, and high-volume image generation workflows where operational cost and latency are critical. It is a strong choice for "generate-as-you-type" interfaces or large-scale content pipelines that require decent photorealism at minimal compute expense.
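The post-training quantization described above can be illustrated with a toy round-trip. This is a hedged sketch only, not the kernel any provider actually runs: the choice of the FP8 E4M3 format, per-tensor absmax scaling, and the simplified rounding below are illustrative assumptions, and real deployments differ in scaling granularity and subnormal handling.

```python
import numpy as np

def quantize_e4m3(x):
    """Approximate round-to-nearest FP8 E4M3 (1 sign, 4 exponent, 3 mantissa bits).

    Simplified sketch: subnormal handling is approximate and special values
    (NaN/inf) are not modeled.
    """
    sign = np.sign(x)
    mag = np.clip(np.abs(x), 0.0, 448.0)        # 448 is the E4M3 max finite value
    exp = np.floor(np.log2(np.maximum(mag, 2.0**-9)))
    exp = np.clip(exp, -6, 8)                   # normal exponent range (bias 7)
    step = 2.0 ** (exp - 3)                     # 3 mantissa bits -> 8 steps per binade
    return sign * np.round(mag / step) * step

# Toy "weight tensor" standing in for a model layer (shapes/values are made up)
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(256, 256)).astype(np.float32)

scale = np.abs(w).max() / 448.0                 # per-tensor absmax scaling
w_fp8 = quantize_e4m3(w / scale)                # values as stored in FP8
w_deq = w_fp8 * scale                           # dequantized for comparison

rel_err = np.abs(w_deq - w) / (np.abs(w) + 1e-12)
print(f"median relative error: {np.median(rel_err):.4f}")
```

The point of the sketch is the trade-off the card describes: with only 3 mantissa bits, each stored weight carries a few percent of rounding error, which halves memory traffic per weight while leaving most values close enough to preserve output quality at low step counts.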
FLUX.1 [schnell] FP8 is available for testing and integration through Lumenfall's unified API and interactive playground.

## Frequently Asked Questions

### How much does FLUX.1 [schnell] FP8 cost?

FLUX.1 [schnell] FP8 is currently free to try through Lumenfall's unified API. Standard Fireworks AI pricing is $0.00035 per inference step; at the default 4 steps, that works out to $0.0014 per image.

### How do I use FLUX.1 [schnell] FP8 via API?

You can use FLUX.1 [schnell] FP8 through Lumenfall's OpenAI-compatible API. Send requests to the unified endpoint with model ID "flux.1-schnell-fp8". Code examples are available in Python, JavaScript, and cURL.

### Which providers offer FLUX.1 [schnell] FP8?

FLUX.1 [schnell] FP8 is available through Fireworks AI on Lumenfall. Lumenfall automatically routes requests to the best available provider.

### What is the maximum resolution for FLUX.1 [schnell] FP8?

FLUX.1 [schnell] FP8 supports images up to 1024x1024 resolution.

## Links

- Model Page: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8
- About: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/about
- Providers, Pricing & Performance: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/providers
- API Reference: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/api
- Benchmarks: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/benchmarks
- Use Cases: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/use-cases
- Gallery: https://lumenfall.ai/models/black-forest-labs/flux.1-schnell-fp8/gallery
- Playground: https://lumenfall.ai/playground?model=flux.1-schnell-fp8
- API Documentation: https://docs.lumenfall.ai