Evaluation record · gemini-2-5-pro

Gemini 2.5 Pro

vgemini-2.5-pro-002

Google

Modeldeprecatedlong-context1m-tokensdeep-think

Strong

About This Model

DEPRECATED: Google has scheduled Gemini 2.5 Pro for shutdown on 2026-10-16; the designated replacement is Gemini 3.1 Pro. Formerly Google's flagship (superseded by the Gemini 3.x line since late 2025), with 1M token context window (2M on select Vertex AI enterprise tiers), Deep Think mode for complex reasoning, and native multimodal capabilities. Do not start new projects on this model.

Last Evaluated: July 9, 2026

Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability

Strong performance for its generation with 1M context window and Deep Think mode, but superseded by the Gemini 3.x line. DEPRECATED: shutdown scheduled 2026-10-16.

task accuracy code

Standard coding benchmarks

Evidence

HumanEval — 88.4% on HumanEval

highVerified: 2026-07-09

task accuracy reasoning

PhD-level reasoning benchmarks

Evidence

GPQA Diamond — 69.8% with Deep Think mode

highVerified: 2026-07-09

task accuracy general

Comprehensive knowledge testing

Evidence

MMLU-Pro — 79.1% on graduate knowledge

highVerified: 2026-07-09

output consistency

Internal consistency testing

Evidence

Google AI Documentation — Consistent outputs across requests

mediumVerified: 2026-07-09

latency p50

API latency measurements

Evidence

Community benchmarking — Median latency ~1.5s

mediumVerified: 2026-07-09

latency p95

95th percentile measurements

Evidence

Community benchmarking — p95 latency ~3.5s

mediumVerified: 2026-07-09

context window

Official specification

Evidence

Google AI Documentation — 1M token context window on the standard Gemini API; 2M extension limited to select Vertex AI enterprise tiers and never rolled out to the standard API

highVerified: 2026-07-09

uptime

Historical uptime data

Evidence

Google Cloud Status — 99.9% uptime (last 90 days)

highVerified: 2026-07-09

🛡️Security

Strong security leveraging Google Cloud infrastructure. Configurable safety filters provide flexibility.

prompt injection resistance

OWASP LLM security testing

Evidence

Google AI Safety — Strong prompt injection defenses

mediumVerified: 2026-07-09

jailbreak resistance

Adversarial prompt testing

Evidence

Community testing — Good resistance to jailbreak attempts

mediumVerified: 2026-07-09

data leakage prevention

Evidence

Google Privacy Policy — Data handling policies for Gemini API

mediumVerified: 2026-07-09

output safety

Safety testing

Evidence

Google Safety Filters — Configurable safety filters across categories

highVerified: 2026-07-09

api security

API security review

Evidence

Google Cloud Security — Google Cloud security standards

highVerified: 2026-07-09

🔒Privacy & Compliance

Good privacy with Google Cloud infrastructure. Enterprise options provide enhanced controls. HIPAA compliance available through Google Cloud.

data residency

Cloud infrastructure review

Evidence

Google Cloud Regions — Multiple region options via Google Cloud

highVerified: 2026-07-09

training data optout

Terms review

Evidence

Gemini API Terms — API data not used for training

highVerified: 2026-07-09

data retention

Data retention policy review

Evidence

Google Cloud Data Handling — Retention policies vary by service tier

mediumVerified: 2026-07-09

pii handling

Data protection review

Evidence

Google AI Safety — Customer responsible for PII handling

mediumVerified: 2026-07-09

compliance certifications

Certification verification

Evidence

Google Cloud Compliance — SOC 2, ISO 27001, GDPR compliant

highVerified: 2026-07-09

zero data retention

Enterprise feature review

Evidence

Enterprise Options — Available for enterprise customers

mediumVerified: 2026-07-09

👁️Trust & Transparency

Strong transparency with Deep Think mode and comprehensive documentation. Configurable guardrails provide flexibility.

explainability

Reasoning transparency evaluation

Evidence

Deep Think Feature — Deep Think mode exposes reasoning process

highVerified: 2026-07-09

hallucination rate

Factual QA testing

Evidence

Google AI Testing — Improved factual accuracy over Gemini 1.5

mediumVerified: 2026-07-09

bias fairness

Bias benchmark evaluation

Evidence

Google AI Principles — Regular bias testing and mitigation

mediumVerified: 2026-07-09

uncertainty quantification

Qualitative assessment

Evidence

Model behavior — Expresses uncertainty appropriately

mediumVerified: 2026-07-09

model card quality

Documentation review

Evidence

Gemini Model Card — Comprehensive model documentation

highVerified: 2026-07-09

training data transparency

Public disclosure review

Evidence

Google AI Blog — General training data description

mediumVerified: 2026-07-09

guardrails

Safety mechanism review

Evidence

Safety Settings — Configurable multi-category safety filters

highVerified: 2026-07-09

⚙️Operational Excellence

Excellent operational maturity backed by Google Cloud infrastructure, but the model is deprecated with a 2026-10-16 shutdown date. Versioning score reduced to reflect the pending retirement; migrate to Gemini 3.1 Pro.

api design quality

API design review

Evidence

Gemini API — RESTful API with streaming, function calling, native multimodal

highVerified: 2026-07-09

sdk quality

SDK quality assessment

Evidence

Google AI SDKs — SDKs for Python, Node.js, Go, Swift, Kotlin

highVerified: 2026-07-09

versioning policy

Versioning policy review

Evidence

Google Cloud Versioning — Clear versioning with migration guides

Gemini API Deprecations — gemini-2.5-pro scheduled for shutdown 2026-10-16 (extended from an original June 2026 date); designated replacement is gemini-3.1-pro

highVerified: 2026-07-09

monitoring observability

Observability tools review

Evidence

Google Cloud Console — Comprehensive Cloud Console monitoring

highVerified: 2026-07-09

support quality

Support assessment

Evidence

Google Cloud Support — Enterprise support with SLAs

highVerified: 2026-07-09

ecosystem maturity

Ecosystem analysis

Evidence

Google AI Ecosystem — Growing ecosystem with Google Cloud integration

highVerified: 2026-07-09

license terms

License review

Evidence

Google Cloud Terms — Standard commercial terms

highVerified: 2026-07-09

Strengths

+1M token context window (2M on select Vertex AI enterprise tiers)
+Deep Think mode for enhanced reasoning on complex problems
+Native multimodal capabilities (text, image, video, audio)
+Google Cloud infrastructure with enterprise-grade reliability
+Excellent for massive document analysis and research
+Competitive pricing with strong performance

Limitations

!Slightly behind Claude/GPT on specialized benchmarks
!Deep Think mode increases latency significantly
!Data retention policies less transparent than Anthropic
!Smaller ecosystem than OpenAI
!Superseded by the Gemini 3.x line (Gemini 3.1 Pro is the current Pro tier)
!DEPRECATED: shutdown scheduled 2026-10-16; migrate to Gemini 3.1 Pro

Metadata

pricing

input: $1.25 per 1M tokens (<200k context), $2.50 per 1M tokens (>200k context)

output: $10.00 per 1M tokens (<200k context), $15.00 per 1M tokens (>200k context)

notes: Tiered pricing based on context length. Confirmed unchanged on the official pricing page as of 2026-07-09; billing ends when the model shuts down 2026-10-16.

last verified: 2026-07-09

context window: 1000000

languages

0: English

1: 100+ languages

modalities

0: text

1: vision

2: audio

3: video

api endpoint: https://generativelanguage.googleapis.com/v1beta/models

open source: false

architecture: Multimodal transformer with Deep Think reasoning

parameters: Not disclosed

Use Case Ratings

code generation

Strong coding capabilities. Excellent for code explanation and documentation with long context.

customer support

Good for customer support with multimodal capabilities. Can process images and documents natively.

content creation

Excellent for content creation with good creativity and natural writing.

data analysis

Outstanding for data analysis with 1M context enabling analysis of massive datasets.

research assistant

Exceptional for research with 1M context. Can process entire books, papers, and repositories.

legal compliance

Good for legal work with massive context enabling full contract analysis.

healthcare

Good capabilities with HIPAA compliance available via Google Cloud. Large context useful for medical records.

financial analysis

Strong for financial analysis with ability to process large financial documents.

education

Excellent for education with multimodal capabilities and patient explanations.

creative writing

Good for creative writing with strong narrative capabilities.

Similar Models

Gemini 3.1 Pro

Google

Gemini 3.5 Flash

Google

Claude Opus 4.1

Anthropic

GPT-5

OpenAI

Claude Sonnet 4.5

Anthropic