Kling 2.1 Standard is an image-to-video model that generates a video from an input image, guided by a text prompt. An input image is required.
Added May 29, 2025
Approx. Price
$0.140 per video
Model Type
image-to-video
Preview Examples
3
Generation controls available for this model.
Resolution/Aspect Options
3
Default Duration
5 seconds
2 duration options
Tunable Settings
4
Aspect Ratio
Default
16:9
Options (3)
Landscape (16:9), Portrait (9:16), Square (1:1)
Choose between landscape (16:9), portrait (9:16), or square (1:1) orientation
CFG Scale
Default
0.5
Controls how closely the generation follows the prompt (0.0-1.0)
Duration
Default
5 seconds
Options (2)
5 seconds, 10 seconds (double price)
Length of the generated video in seconds
Negative Prompt
Default
blur, distort, and low quality
What to avoid in the video (default: blur, distort, and low quality)
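The four settings above, together with the required input image and the listed pricing, can be sketched as a small validation helper. This is only an illustration under stated assumptions: the class name `KlingSettings`, the field names, and the payload keys are hypothetical and do not come from any documented API. Only the defaults, the allowed values, and the "double price" rule for 10-second videos are taken from this listing.

```python
from dataclasses import dataclass

# Values taken from the listing above.
BASE_PRICE_USD = 0.14                      # approximate price per video
ASPECT_RATIOS = ("16:9", "9:16", "1:1")    # landscape, portrait, square
DURATIONS = (5, 10)                        # seconds


@dataclass(frozen=True)
class KlingSettings:
    """Hypothetical container for the listed settings (names are assumptions)."""

    image_url: str                          # the model requires an input image
    prompt: str
    aspect_ratio: str = "16:9"
    cfg_scale: float = 0.5
    duration: int = 5
    negative_prompt: str = "blur, distort, and low quality"

    def __post_init__(self) -> None:
        # Reject values outside the options the listing describes.
        if not self.image_url:
            raise ValueError("an input image is required")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if not 0.0 <= self.cfg_scale <= 1.0:
            raise ValueError("cfg_scale must be between 0.0 and 1.0")
        if self.duration not in DURATIONS:
            raise ValueError(f"duration must be one of {DURATIONS}")

    def estimated_price_usd(self) -> float:
        # The listing states that 10-second videos cost double.
        multiplier = 2 if self.duration == 10 else 1
        return round(BASE_PRICE_USD * multiplier, 2)

    def to_payload(self) -> dict:
        # Payload keys are illustrative only, not a documented request schema.
        return {
            "image_url": self.image_url,
            "prompt": self.prompt,
            "aspect_ratio": self.aspect_ratio,
            "cfg_scale": self.cfg_scale,
            "duration": self.duration,
            "negative_prompt": self.negative_prompt,
        }
```

For example, `KlingSettings(image_url="https://example.com/knight.png", prompt="cape fluttering", duration=10)` passes validation, and its `estimated_price_usd()` reflects the doubled price for the longer duration.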
Human preference benchmarks sourced from Artificial Analysis.
Image to Video
#51 / 76
ELO
1174.0
Appearances
2,924
95% CI
-10 / +10
Release Date 2025-05 · Matched as Kling 2.1 Standard
Artificial Analysis API
A cyberpunk girl walking through neon-lit alleyways, rain falling, close-up of her eyes, camera pans to her silhouette, smoke in the background, futuristic music vibe
A fantasy knight standing on a cliff at sunset, cape fluttering, camera zooms out to reveal vast mountain range, slow motion, golden hour lighting
The petals fall slowly around her, she gently raises the umbrella and tilts her head, her sleeves sway with the wind, a poetic moment frozen in gentle motion