Grok Imagine Video

AI Video Generation Model

Video #2 $$$ · 5¢

xAI's video generation model based on the Aurora architecture, supporting text-to-video, image-to-video, and video editing with native audio-visual synthesis at up to 720p

grok-video API Async video generation

Lumenfall provides an OpenAI-compatible API for generating 720p videos, performing image-to-video transformations, and executing video edits using Grok Imagine Video.

Base URL
https://api.lumenfall.ai/v1
Model
grok-imagine-video

Code Examples

Text to Video

/v1/videos/generations
# Step 1: Submit the video generation request and capture the job ID
# (the prompt below is only an example; replace it with your own)
VIDEO_ID=$(curl -s -X POST \
  https://api.lumenfall.ai/v1/videos \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-imagine-video",
    "prompt": "A red fox running through a snowy forest at dusk",
    "size": "1024x1024"
  }' | jq -r '.id')
echo "Video ID: $VIDEO_ID"
# Step 2: Poll every 5 seconds until the job completes or fails
while true; do
  RESULT=$(curl -s \
    "https://api.lumenfall.ai/v1/videos/$VIDEO_ID" \
    -H "Authorization: Bearer $LUMENFALL_API_KEY")
  STATUS=$(echo "$RESULT" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ]; then
    # Print the URL of the finished video
    echo "$RESULT" | jq -r '.output.url'
    break
  elif [ "$STATUS" = "failed" ]; then
    # Print the error reported by the API
    echo "$RESULT" | jq -r '.error.message'
    break
  fi
  sleep 5
done

Image to Video

/v1/videos/generations
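
A minimal sketch of an image-to-video request, assuming the same submit-then-poll flow and endpoint as the text-to-video example. It also assumes that `input_reference` accepts an array of image URLs; the parameter reference describes it as an array of input images but does not show the element format. The helper names `i2v_payload` and `submit_i2v` are hypothetical.

```shell
# Build the JSON body for an image-to-video request.
# Assumption: input_reference takes an array of image URLs.
# $1 = prompt, $2 = image URL (neither may contain double quotes or backslashes).
i2v_payload() {
  printf '{"model":"grok-imagine-video","prompt":"%s","input_reference":["%s"]}\n' "$1" "$2"
}

# Submit the request and print the returned video ID.
# Poll the ID exactly as in Step 2 of the text-to-video example.
submit_i2v() {
  i2v_payload "$1" "$2" | curl -s -X POST \
    https://api.lumenfall.ai/v1/videos \
    -H "Authorization: Bearer $LUMENFALL_API_KEY" \
    -H "Content-Type: application/json" \
    -d @- | jq -r '.id'
}
```

Usage might look like `VIDEO_ID=$(submit_i2v "The camera slowly zooms in" "https://example.com/input.png")`, where the URL is a placeholder.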

Video to Video

/v1/videos/generations
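
A minimal sketch of a video-to-video (editing) request, assuming the same endpoint and polling flow as the text-to-video example. The parameter reference describes `input_video` as a string holding the input video URL; the helper names `v2v_payload` and `submit_v2v` and the example URL are hypothetical.

```shell
# Build the JSON body for a video edit.
# $1 = edit prompt, $2 = input video URL (no double quotes or backslashes).
v2v_payload() {
  printf '{"model":"grok-imagine-video","prompt":"%s","input_video":"%s"}\n' "$1" "$2"
}

# Submit the edit and print the returned video ID for polling.
submit_v2v() {
  v2v_payload "$1" "$2" | curl -s -X POST \
    https://api.lumenfall.ai/v1/videos \
    -H "Authorization: Bearer $LUMENFALL_API_KEY" \
    -H "Content-Type: application/json" \
    -d @- | jq -r '.id'
}
```

Usage might look like `VIDEO_ID=$(submit_v2v "Make it look like winter" "https://example.com/clip.mp4")`.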

Parameter Reference

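Legend: each parameter is marked Required, Supported, or Not available for each mode (T2V = text-to-video, I2V = image-to-video, V2V = video-to-video).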

Core Parameters

| Parameter | Type | Description | Modes |
|---|---|---|---|
| prompt | string | Required. Text prompt for video generation | T2V, I2V, V2V |
| duration | number | Video duration in seconds | T2V, I2V, V2V |

Size & Layout

| Parameter | Type | Description | Modes |
|---|---|---|---|
| size | string | Video dimensions as WxH pixels (e.g. "1920x1080") or aspect ratio (e.g. "16:9"). Values: auto, 1365x768, 768x1365, 1254x836, 836x1254, 887x1182, 1024x1024, 1183x887. WxH determines both shape and scale (aspect_ratio and resolution are ignored when size is provided). W:H format is equivalent to aspect_ratio. | T2V, I2V, V2V |
| aspect_ratio | string | Aspect ratio of the output video (e.g. "16:9", "1:1"). Values: auto, 9:16, 2:3, 3:4, 1:1, 4:3, 3:2, 16:9. Controls shape independently of scale. Use with resolution to control both. If size is also provided, size takes precedence. Any ratio is accepted and mapped to the nearest supported value. | T2V, I2V, V2V |
| resolution | string | Output resolution tier (e.g. "1K", "4K"). Values: auto, 1K. Controls scale independently of shape. Higher tiers produce larger videos and cost more. If size is also provided, size takes precedence for scale. Any tier is accepted and mapped to the nearest supported value. | T2V, I2V, V2V |
Flexible

| Output size | size | Notes |
|---|---|---|
| Auto | "auto" | Model chooses optimal dimensions |

1K (7 sizes)

| Output size | size | aspect_ratio + resolution |
|---|---|---|
| 1183 × 887 | "1183x887" | "4:3" + "1K" |
| 1024 × 1024 | "1024x1024" | "1:1" + "1K" |
| 887 × 1182 | "887x1182" | "3:4" + "1K" |
| 836 × 1254 | "836x1254" | "2:3" + "1K" |
| 1254 × 836 | "1254x836" | "3:2" + "1K" |
| 768 × 1365 | "768x1365" | "9:16" + "1K" |
| 1365 × 768 | "1365x768" | "16:9" + "1K" |
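
The equivalence between an aspect ratio at the 1K tier and an explicit `size` value can be sketched as a simple lookup. The function name `size_for_ratio` is hypothetical, and returning an error for unlisted ratios is an assumption of this sketch (the gateway itself maps any ratio to the nearest supported value).

```shell
# Map an aspect ratio to the documented 1K output size for grok-imagine-video.
size_for_ratio() {
  case "$1" in
    16:9) echo "1365x768" ;;
    9:16) echo "768x1365" ;;
    3:2)  echo "1254x836" ;;
    2:3)  echo "836x1254" ;;
    4:3)  echo "1183x887" ;;
    3:4)  echo "887x1182" ;;
    1:1)  echo "1024x1024" ;;
    *)    echo "unsupported ratio: $1" >&2; return 1 ;;  # sketch-only behavior
  esac
}
```

For example, sending `"size": "$(size_for_ratio 16:9)"` should request the same dimensions as `"aspect_ratio": "16:9"` together with `"resolution": "1K"`.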

How these parameters work

size – exact pixel dimensions, e.g. "1920x1080"
aspect_ratio – shape only, default scale, e.g. "16:9"
resolution – scale tier, preserves shape, e.g. "1K"

Priority when combined

size > aspect_ratio + resolution > aspect_ratio > resolution

size is most specific and always wins. aspect_ratio and resolution control shape and scale independently.
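
The precedence rules can be sketched as a small resolver that reports which parameters end up deciding the output dimensions. This is an illustration of the documented rules, not the gateway's code: the function name `resolve_dims` is hypothetical, and treating an unset parameter as "auto" is an assumption.

```shell
# Report which parameters decide the output dimensions.
# Args: size, aspect_ratio, resolution (pass "" for an unset parameter).
resolve_dims() {
  local size="$1" ratio="$2" res="$3"
  case "$size" in
    *x*) echo "size=$size"; return ;;  # WxH fixes both shape and scale
    *:*) ratio="$size" ;;              # W:H size is equivalent to aspect_ratio
  esac
  # Assumption: unset parameters fall back to "auto"
  echo "aspect_ratio=${ratio:-auto} resolution=${res:-auto}"
}
```

Calling `resolve_dims "1024x1024" "16:9" "1K"` prints `size=1024x1024`, because a pixel size overrides the other two parameters.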

How matching works

Shape matching – we pick the closest supported ratio. If you ask for 7:1 on a model that supports 4:1 and 8:1, you get 8:1.
Scale matching – providers use different tier formats: K tiers (0.5K 1K 2K 4K) or megapixel tiers (0.25 1). If the exact tier isn't available, you get the nearest one.
Dimension clamping – if a model has pixel limits, we clamp dimensions to fit and keep the aspect ratio intact.
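
Shape matching can be sketched as a nearest-neighbor search over the supported ratios. The text does not define "closest", so this sketch assumes it means the smallest difference between log(width/height) values. The function name `nearest_ratio` is hypothetical.

```shell
# Pick the supported aspect ratio closest to the requested one.
# Assumption: distance is measured between log(width/height) values.
# $1 = requested ratio as W:H; remaining args = supported ratios.
nearest_ratio() {
  local req="$1"
  shift
  printf '%s\n' "$@" | awk -v req="$req" '
    function logratio(r,   p) { split(r, p, ":"); return log(p[1] / p[2]) }
    BEGIN { target = logratio(req) }
    {
      d = logratio($0) - target
      if (d < 0) d = -d
      if (NR == 1 || d < best) { best = d; pick = $0 }
    }
    END { print pick }'
}
```

With the example from the text, `nearest_ratio 7:1 4:1 8:1` prints `8:1`. With this model's supported ratios, `nearest_ratio 21:9 9:16 2:3 3:4 1:1 4:3 3:2 16:9` prints `16:9`.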

Output & Format

| Parameter | Type | Description | Modes |
|---|---|---|---|
| n | integer | Number of videos to generate. Default: 1. The gateway generates multiple videos in parallel even if the provider supports only 1. | T2V, I2V, V2V |
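
One way to picture the emulation of `n` is as a fan-out of single-video jobs that run in parallel and are then collected in order. The sketch below only illustrates that idea and is not Lumenfall's implementation; `generate_one` is a hypothetical stand-in for a single provider call and `generate_n` is a hypothetical helper.

```shell
# Hypothetical stand-in for one single-video provider call.
generate_one() { echo "video-$1"; }

# Emulate n > 1 by running n single-video jobs in parallel.
generate_n() {
  local n="$1" i dir
  dir=$(mktemp -d)
  for i in $(seq 1 "$n"); do
    generate_one "$i" > "$dir/$i" &   # each job writes its own result file
  done
  wait                                # block until every job has finished
  for i in $(seq 1 "$n"); do
    cat "$dir/$i"                     # emit results in request order
  done
  rm -rf "$dir"
}
```

Writing each result to its own file keeps the output order stable regardless of which job finishes first.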

Additional Parameters

| Parameter | Type | Description | Modes |
|---|---|---|---|
| input_reference | array | Input image(s) to animate into video | T2V, I2V, V2V |
| input_video | string | Input video URL to transform | T2V, I2V, V2V |

Parameter Normalization

How we handle parameters across different providers

Not every provider speaks the same language. When you send a parameter, we handle it in one of four ways depending on what the model supports:

| Behavior | What happens | Example |
|---|---|---|
| passthrough | Sent as-is to the provider | style, quality |
| renamed | Same value, mapped to the field name the provider expects | prompt |
| converted | Transformed to the provider's native format | size |
| emulated | Works even if the provider has no concept of it | n, response_format |

Parameters we don't recognize pass straight through to the upstream API, so provider-specific options still work.