Generate videos with audio from text or images using xAI Grok Imagine Video on Wavespeed. Supports 6- or 10-second durations, 480p or 720p output, and three aspect ratios (16:9, 1:1, 9:16) for text-to-video.
Added Jan 29, 2026
Approx. Price
$0.300 per video
Model Type
Both (text-to-video and image-to-video)
Preview Examples
2
Generation controls available for this model.
Resolution/Aspect Options
2
Default Duration
6
2 duration options
Tunable Settings
3
Aspect Ratio
Default
16:9
Options (3)
Landscape (16:9), Square (1:1), Vertical (9:16)
Text-to-video supports multiple aspect ratios. Image-to-video uses the source image framing.
Duration
Default
6
Options (2)
6 seconds, 10 seconds
Length of the generated video in seconds
Resolution
Default
720p
Options (2)
480p, 720p
Output resolution
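The options above can be checked before a request is submitted. The following is a minimal sketch, assuming a hypothetical `VideoSettings` container and `to_settings_dict` helper; the field names are illustrative and are not taken from the provider's request schema. Only the option lists and defaults come from this page.

```python
from dataclasses import dataclass, asdict

# Option lists and defaults copied from the settings listed on this page.
ASPECT_RATIOS = ("16:9", "1:1", "9:16")
DURATIONS_S = (6, 10)
RESOLUTIONS = ("480p", "720p")


@dataclass(frozen=True)
class VideoSettings:
    """Hypothetical container; field names are illustrative, not an official schema."""

    aspect_ratio: str = "16:9"
    duration: int = 6
    resolution: str = "720p"

    def __post_init__(self) -> None:
        # Reject any value that is not among the documented options.
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if self.duration not in DURATIONS_S:
            raise ValueError(f"duration must be one of {DURATIONS_S}")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")


def to_settings_dict(settings: VideoSettings, image_to_video: bool = False) -> dict:
    """Return the settings as a plain dict.

    For image-to-video the aspect ratio is dropped, because framing
    follows the source image rather than a chosen ratio.
    """
    values = asdict(settings)
    if image_to_video:
        values.pop("aspect_ratio")
    return values
```

Calling `to_settings_dict(VideoSettings())` yields the page defaults (16:9, 6 seconds, 720p), while `VideoSettings(duration=8)` raises `ValueError` because 8 is not a listed duration.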
Human preference benchmarks sourced from Artificial Analysis.
Text to Video
#6 / 83
ELO
1232.0
Appearances
6,204
95% CI
-8/8
Image to Video
#3 / 76
ELO
1327.0
Appearances
6,671
95% CI
-10/10
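To read the intervals above as absolute score ranges, the offsets can be added to the ELO value. This is a small sketch, assuming the "95% CI" figures are lower/upper offsets from the listed ELO (the page does not state this explicitly); `ci_bounds` is a hypothetical helper name.

```python
def ci_bounds(elo: float, ci: str) -> tuple[float, float]:
    """Convert an offset string such as "-8/8" into (lower, upper) ELO bounds.

    Assumes the two numbers are signed offsets from the ELO score.
    """
    lower_text, upper_text = ci.split("/")
    return elo + float(lower_text), elo + float(upper_text)


# Values copied from the benchmark figures on this page.
text_to_video = ci_bounds(1232.0, "-8/8")     # (1224.0, 1240.0)
image_to_video = ci_bounds(1327.0, "-10/10")  # (1317.0, 1337.0)
```

The two scores come from separate leaderboards (83 and 76 ranked models respectively), so the ranges describe uncertainty within each leaderboard and are not directly comparable with each other.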
Release Date 2026-01 · Matched as grok-imagine-video
Artificial Analysis API
Anime schoolgirl bursting out of house door, cherry blossoms blowing, morning light, speed lines indicating rush, chibi-ready expressions, classic shojo aesthetic, vibrant colors
Text-to-Video | 6s | 720p | 16:9
Medieval knight in ornate armor walking through a mystical forest, bioluminescent plants pulsing with light, ancient stone ruins overgrown with glowing vines, over-the-shoulder camera, dark fantasy aesthetic, volumetric fog and Lumen lighting
Image-to-Video | 6s | 720p | 16:9