Gemini 3.1 Pro preview is built for tasks where simple answers are not enough. It offers stronger core reasoning for complex coding, math, and long-context workflows, with multimodal support and a reported 77.1% verified score on ARC-AGI-2. NOTE: Inputs over 200k tokens are charged at 2x the input rate and 1.5x the output rate.
Added Feb 19, 2026
Context Window
1.0M
Max Output
65.5K
Input Price
$2.00/1M
Output Price
$12.00/1M
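The long-context surcharge noted above can be sketched as a cost estimate. This is an illustrative calculation only, not an official billing API; the function name and the assumption that the surcharge applies to the entire request once the input exceeds 200k tokens are both hypothetical.

```python
# Hypothetical cost sketch for the listed rates. Assumes (not confirmed by
# the listing) that the multipliers apply to the whole request once the
# input exceeds the 200k-token threshold.

INPUT_PER_M = 2.00            # USD per 1M input tokens
OUTPUT_PER_M = 12.00          # USD per 1M output tokens
LONG_CONTEXT_THRESHOLD = 200_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the listed rates."""
    in_rate, out_rate = INPUT_PER_M, OUTPUT_PER_M
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        in_rate *= 2.0        # 2x input rate beyond 200k input tokens
        out_rate *= 1.5       # 1.5x output rate
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# 100k in / 10k out: 0.1 * 2.00 + 0.01 * 12.00 = 0.32
print(round(request_cost(100_000, 10_000), 2))   # → 0.32
# 300k in / 10k out: 0.3 * 4.00 + 0.01 * 18.00 = 1.38
print(round(request_cost(300_000, 10_000), 2))   # → 1.38
```

Under this reading, a 300k-token prompt costs more than four times a 100k-token one, so splitting work into sub-200k requests can be materially cheaper where the task allows it.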
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
57.2
Coding Index
55.5
Agentic Index
59.1
GPQA Diamond
Graduate-level scientific reasoning
94.1%
Better than 99% of models compared
HLE
Humanity's Last Exam
44.7%
Better than 99% of models compared
IFBench
Instruction-following benchmark
77.1%
Better than 97% of models compared
τ²-Bench Telecom
Conversational AI agents in dual-control scenarios
95.6%
Better than 96% of models compared
AA-LCR
Long context reasoning evaluation
72.7%
Better than 97% of models compared
GDPval-AA
Economically valuable tasks
40.7%
Better than 86% of models compared
CritPt
Research-level physics reasoning
17.7%
Better than 98% of models compared
SciCode
Python programming for scientific computing
58.9%
Better than 99% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
53.8%
Better than 98% of models compared
AA-Omniscience Accuracy
Proportion of correctly answered questions
55.2%
Better than 99% of models compared
AA-Omniscience Hallucination Rate
Share of non-correct responses that are incorrect answers rather than abstentions (lower is better)
49.9%
Better than 89% of models compared
Last updated May 11, 2026