DeepSeek V4 Flash Thinking runs DeepSeek's efficiency-optimized Mixture-of-Experts model in reasoning mode. It has a 1M-token context window and is built for fast inference and high-throughput workloads across reasoning, coding, and agent workflows.
Added Apr 24, 2026
Context Window: 1.0M tokens
Max Output: 384.0K tokens
Input Price: $0.14 per 1M tokens
Output Price: $0.28 per 1M tokens
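To make the pricing concrete, here is a minimal sketch of a per-request cost estimate. It assumes the listed prices are USD per 1M tokens and that billing is linear in token count. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the listing does not say how reasoning tokens are billed, so `output_tokens` here means whatever the provider counts as output.

```python
# Assumed rates, read from the listing as USD per 1M tokens.
INPUT_USD_PER_MTOK = 0.14
OUTPUT_USD_PER_MTOK = 0.28


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: 200K prompt tokens and 20K output tokens.
print(f"${estimate_cost_usd(200_000, 20_000):.4f}")  # prints $0.0336
```

Because the output rate is twice the input rate, long reasoning traces can dominate the bill even when prompts are large.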
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index: 46.5
Coding Index: 38.7
Agentic Index: 61.3
GPQA Diamond (graduate-level scientific reasoning): 89.4%, better than 96% of models compared
HLE (Humanity's Last Exam): 32.1%, better than 96% of models compared
IFBench (instruction-following benchmark): 79.2%, better than 98% of models compared
τ²-Bench Telecom (conversational AI agents in dual-control scenarios): 95.0%, better than 95% of models compared
AA-LCR (long context reasoning evaluation): 63.0%, better than 83% of models compared
GDPval-AA (economically valuable tasks): 45.7%, better than 91% of models compared
CritPt (research-level physics reasoning): 7.1%, better than 93% of models compared
SciCode (Python programming for scientific computing): 44.9%, better than 92% of models compared
Terminal-Bench Hard (agentic coding and terminal use): 35.6%, better than 87% of models compared
AA-Omniscience Accuracy (proportion of correctly answered questions): 37.2%, better than 91% of models compared
AA-Omniscience Hallucination Rate (rate of incorrect answers among non-correct responses): 89.7%, better than 25% of models compared
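The two AA-Omniscience figures can be combined into shares of all questions. The sketch below assumes that every non-correct response is either an incorrect answer or a non-answer (an abstention or partial answer), which is one reading of "incorrect answers among non-correct responses". The function name `omniscience_breakdown` is hypothetical.

```python
def omniscience_breakdown(accuracy: float, hallucination_rate: float) -> dict:
    """Split all questions into correct, incorrect, and not-answered shares.

    Assumes hallucination_rate = incorrect / (1 - accuracy), i.e. the
    fraction of non-correct responses that were wrong answers.
    """
    non_correct = 1.0 - accuracy
    incorrect = hallucination_rate * non_correct
    not_answered = non_correct - incorrect
    return {
        "correct": accuracy,
        "incorrect": incorrect,
        "not_answered": not_answered,
    }


# Figures from the listing: 37.2% accuracy, 89.7% hallucination rate.
for label, share in omniscience_breakdown(0.372, 0.897).items():
    print(f"{label}: {share:.1%}")
```

Under that assumption, roughly 56% of all questions receive an incorrect answer and about 6% go unanswered, which is why the hallucination rate ranks low despite the comparatively strong accuracy.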
Last updated May 11, 2026