Kling 2.1 Standard

kling-v21-standard

Kling 2.1 Standard is an image-to-video model that generates a video from an input image guided by a text prompt. An input image is required.

Added May 29, 2025

Approx. Price: $0.140 per video

Model Type: image-to-video

Settings

Generation controls available for this model.

Tunable Settings

Aspect Ratio

Type: Select
Default: 16:9
Options (3): Landscape (16:9), Portrait (9:16), Square (1:1)

Sets the orientation of the generated video.

CFG Scale

Type: Number
Default: 0.5
Range: 0.0 to 1.0

Controls how closely the generation follows the prompt; higher values follow the prompt more closely.

Duration

Type: Select
Options (2): 5 seconds, 10 seconds (double price)

Length of the generated video in seconds.

Negative Prompt

Type: Text
Default: blur, distort, and low quality

Describes what to avoid in the generated video.
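The settings above can be collected and range-checked before a request is submitted. The following is a minimal sketch, not the provider's actual API: the class name `GenerationRequest`, the payload field names, and the helper `estimate_price_usd` are all hypothetical, and the sketch assumes the listed $0.140 price applies to the 5-second option. Because this page does not show a default duration, `duration` is a required field here.

```python
from dataclasses import dataclass, asdict

MODEL_ID = "kling-v21-standard"
BASE_PRICE_USD = 0.14  # approx. price per video listed on this page (assumed to be the 5-second price)
ASPECT_RATIOS = ("16:9", "9:16", "1:1")
DURATIONS_SECONDS = (5, 10)


@dataclass
class GenerationRequest:
    """Hypothetical request shape; field names are illustrative only."""

    image_url: str  # the model requires an input image
    prompt: str
    duration: int  # no default is shown on the page, so it is required here
    aspect_ratio: str = "16:9"
    cfg_scale: float = 0.5
    negative_prompt: str = "blur, distort, and low quality"

    def validate(self) -> None:
        """Raise ValueError if any setting falls outside the documented options."""
        if not self.image_url:
            raise ValueError("an input image is required")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if not 0.0 <= self.cfg_scale <= 1.0:
            raise ValueError("cfg_scale must be between 0.0 and 1.0")
        if self.duration not in DURATIONS_SECONDS:
            raise ValueError(f"duration must be one of {DURATIONS_SECONDS}")

    def to_payload(self) -> dict:
        """Validate, then return a plain dict that includes the model ID."""
        self.validate()
        return {"model": MODEL_ID, **asdict(self)}


def estimate_price_usd(duration: int) -> float:
    """Approximate price: the 10-second option is listed at double price."""
    if duration not in DURATIONS_SECONDS:
        raise ValueError(f"duration must be one of {DURATIONS_SECONDS}")
    return BASE_PRICE_USD * (2 if duration == 10 else 1)
```

For example, `GenerationRequest(image_url="https://example.com/knight.png", prompt="cape fluttering", duration=10).to_payload()` returns a dict with the default aspect ratio, CFG scale, and negative prompt filled in, and `estimate_price_usd(10)` gives roughly twice the base price.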

Benchmarks

Human preference benchmarks sourced from Artificial Analysis.

Image to Video

Rank: #51 of 76
ELO: 1174.0
Appearances: 2,924
95% CI: -10/+10

Release Date 2025-05 · Matched as Kling 2.1 Standard
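To give the ELO figure some intuition, the sketch below applies the standard Elo expected-score formula, which estimates how often one model would be preferred over another in a head-to-head comparison. This assumes the conventional logistic curve with a 400-point scale; the exact rating method used by Artificial Analysis may differ, and the function name `expected_win_rate` and the opponent rating in the usage note are hypothetical.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Rating listed on this page for Kling 2.1 Standard.
KLING_21_STANDARD_ELO = 1174.0

# Two models with equal ratings are each expected to win half the time.
even_match = expected_win_rate(KLING_21_STANDARD_ELO, KLING_21_STANDARD_ELO)
```

Under this formula, a hypothetical opponent rated 100 points higher would be expected to win clearly more than half of the comparisons, while the listed 95% CI of about 10 points corresponds to only a small shift in expected win rate.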

Artificial Analysis API

Examples

A cyberpunk girl walking through neon-lit alleyways, rain falling, close-up of her eyes, camera pans to her silhouette, smoke in the background, futuristic music vibe

A fantasy knight standing on a cliff at sunset, cape fluttering, camera zooms out to reveal vast mountain range, slow motion, golden hour lighting

The petals fall slowly around her, she gently raises the umbrella and tilts her head, her sleeves sway with the wind, a poetic moment frozen in gentle motion