Claude Opus 4.5 is the top model for coding, agentic workflows, and computer use, and is now stronger at deep research, slides, and spreadsheets. It pushes the state of the art in software engineering and improves vision, reasoning, and math across the board.
Added Nov 1, 2025
Context Window
200K tokens
Max Output
32K tokens
Input Price
$5.00 / 1M tokens
Output Price
$25.01 / 1M tokens
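To make the pricing concrete, here is a minimal Python sketch that estimates per-request cost from the listed rates and checks the listed token limits. The model identifier and the assumption that input and output share the 200K context window are illustrative guesses, not part of the listing.

```python
# Minimal cost sketch for a single Claude Opus 4.5 request, using the rates and
# limits listed above. The model ID string is an assumption for illustration
# (check Anthropic's documentation for the exact identifier); the prices and
# limits come from the spec.

MODEL_ID = "claude-opus-4-5"      # assumed identifier, not confirmed by the listing
CONTEXT_WINDOW = 200_000          # tokens; assumed to cover input + output combined
MAX_OUTPUT_TOKENS = 32_000        # maximum tokens the model may generate
INPUT_USD_PER_MTOK = 5.00         # $ per 1M input tokens
OUTPUT_USD_PER_MTOK = 25.01       # $ per 1M output tokens, as listed

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-token prices."""
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 32K output limit")
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the 200K context window")
    return (input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000

# Example: a 50K-token prompt with a 4K-token reply costs roughly $0.35.
print(f"{MODEL_ID}: ${estimate_cost(50_000, 4_000):.2f}")
```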
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
43.1
Coding Index
42.9
Agentic Index
59.2
GPQA Diamond
Graduate-level scientific reasoning
81.0%
Better than 81% of models compared
HLE
Humanity's Last Exam
12.9%
Better than 78% of models compared
IFBench
Instruction-following benchmark
43.0%
Better than 50% of models compared
τ²-Bench Telecom
Conversational AI agents in dual-control scenarios
86.3%
Better than 81% of models compared
AA-LCR
Long context reasoning evaluation
65.3%
Better than 87% of models compared
GDPval-AA
Economically valuable tasks
46.1%
Better than 92% of models compared
CritPt
Research-level physics reasoning
0.3%
Better than 64% of models compared
SciCode
Python programming for scientific computing
47.0%
Better than 94% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
40.9%
Better than 91% of models compared
AIME 2025
American Invitational Mathematics Examination 2025
62.7%
Better than 59% of models compared
MMLU-Pro
Professional and academic subject knowledge
88.9%
Better than 99% of models compared
AA-Omniscience Accuracy
Proportion of correctly answered questions
40.7%
Better than 94% of models compared
LiveCodeBench
Contamination-free coding benchmark
73.8%
Better than 85% of models compared
AA-Omniscience Hallucination Rate
Rate of incorrect answers among non-correct responses (lower is better)
75.4%
Better than 70% of models compared
Last updated May 11, 2026
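Read as described above, the hallucination rate is the share of wrong answers among questions not answered correctly (wrong answers plus abstentions). A small sketch under that reading, which is my interpretation of the listing's wording and not an official Artificial Analysis formula:

```python
# AA-Omniscience Hallucination Rate as read from the description above: among
# questions NOT answered correctly, the share that got a wrong answer rather
# than an abstention. This interpretation is mine, not an official formula.

def hallucination_rate(incorrect: int, abstained: int) -> float:
    """incorrect: wrong answers given; abstained: questions declined or left unanswered."""
    non_correct = incorrect + abstained
    return incorrect / non_correct if non_correct else 0.0

# Hypothetical counts: 151 wrong answers and 49 abstentions among the non-correct
# questions give a rate of 0.755, in the ballpark of the 75.4% shown above.
print(round(hallucination_rate(151, 49), 3))   # 0.755
```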