Generate a lip-synced music video from an audio track, optionally guided by 1-3 reference portrait images. Supports cinematic scene transitions and videos up to 10 minutes long, at 480p or 720p.
Added: Apr 14, 2026
Approx. Price: $0.150 per video
Model Type: both
Preview Examples: 0
Generation controls available for this model.
Resolution/Aspect Options: 2
Default Duration: N/A
Tunable Settings: 3
Aspect Ratio
Default: 16:9
Options (2): Landscape (16:9), Portrait (9:16)
Output aspect ratio.
Prompt
Default: N/A
Optional style/scene direction for the music video.
Resolution
Default: 480p
Options (2): 480p, 720p
Output resolution.
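The three tunable settings, together with the audio track and optional reference portraits, could be collected and checked before a request is sent. The following is a minimal sketch only: the class name MusicVideoRequest and the field names audio_url, reference_images, prompt, aspect_ratio, and resolution are hypothetical and are not taken from any documented API for this model. Only the allowed values and defaults come from the listing above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed values and defaults as listed on this page.
ASPECT_RATIOS = ("16:9", "9:16")   # Landscape, Portrait
RESOLUTIONS = ("480p", "720p")
MAX_REFERENCE_IMAGES = 3           # reference portraits are optional, 1-3 when given


@dataclass
class MusicVideoRequest:
    """Hypothetical request shape; field names are assumptions."""
    audio_url: str
    reference_images: List[str] = field(default_factory=list)
    prompt: Optional[str] = None   # optional style/scene direction
    aspect_ratio: str = "16:9"     # page default
    resolution: str = "480p"       # page default

    def validate(self) -> None:
        if not self.audio_url:
            raise ValueError("an audio track is required")
        if len(self.reference_images) > MAX_REFERENCE_IMAGES:
            raise ValueError("at most 3 reference portraits are supported")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")

    def to_payload(self) -> dict:
        """Validate, then build a plain dict, omitting unset optional fields."""
        self.validate()
        payload = {
            "audio_url": self.audio_url,
            "aspect_ratio": self.aspect_ratio,
            "resolution": self.resolution,
        }
        if self.reference_images:
            payload["reference_images"] = list(self.reference_images)
        if self.prompt:
            payload["prompt"] = self.prompt
        return payload
```

For example, MusicVideoRequest(audio_url="https://example.com/track.mp3", resolution="720p").to_payload() yields a dict with the default 16:9 aspect ratio, while an unsupported value such as resolution="1080p" raises ValueError before anything is sent.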
Human preference benchmarks sourced from Artificial Analysis.
No Artificial Analysis benchmark data is available yet for this model.