Evaluation record · claude-sonnet-4

Claude Sonnet 4

v20250514

Anthropic

Modelretiredcodinghipaa-eligibleprivacy

Strong

About This Model

RETIRED: Anthropic retired Claude Sonnet 4 (claude-sonnet-4-20250514) on 2026-06-15 (deprecated 2026-04-14); API requests now fail. Recommended replacement: Claude Sonnet 4.6. Historically a May 2025 hybrid model with exceptional coding capabilities, advanced reasoning, and extended thinking mode.

Last Evaluated: July 9, 2026

Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability

Historical evaluation: exceptional coding performance at release (72.7% SWE-bench) with extended thinking and 200K context. Model retired 2026-06-15 and is no longer served.

task accuracy code

Industry-standard coding benchmarks

Evidence

SWE-bench Verified — 72.7% resolution rate

highVerified: 2026-07-09

task accuracy reasoning

Graduate-level reasoning benchmarks

Evidence

GPQA Diamond — PhD-level reasoning capabilities

highVerified: 2026-07-09

task accuracy general

Knowledge testing benchmarks

Evidence

MMLU — Strong comprehensive knowledge

highVerified: 2026-07-09

output consistency

Internal testing with repeated prompts

Evidence

Anthropic Documentation — Consistent outputs with hybrid reasoning

mediumVerified: 2026-07-09

latency p50

Median latency for API requests

Evidence

Anthropic API Documentation — Fast response time ~1.5s

highVerified: 2026-07-09

latency p95

95th percentile response time

Evidence

Community benchmarking — p95 latency ~3.2s

highVerified: 2026-07-09

context window

Official specification

Evidence

Anthropic API Documentation — 200K token context window

highVerified: 2026-07-09

uptime

Historical uptime data

Evidence

Anthropic Status Page — 99.95% uptime (last 90 days)

highVerified: 2026-07-09

🛡️Security

Excellent security with Constitutional AI providing strong guardrails. Best-in-class safety for enterprise use.

prompt injection resistance

Testing against OWASP LLM01 attacks

Evidence

Anthropic Safety Research — Strong resistance via Constitutional AI

highVerified: 2026-07-09

jailbreak resistance

Testing against adversarial prompts

Evidence

Anthropic Constitutional AI — Robust jailbreak resistance

highVerified: 2026-07-09

data leakage prevention

Analysis of privacy policies

Evidence

Anthropic Privacy Statement — No training on user data without consent

mediumVerified: 2026-07-09

output safety

Safety testing across harmful content categories

Evidence

Anthropic Safety Evaluations — Comprehensive safety testing

highVerified: 2026-07-09

api security

Review of API security features

Evidence

Anthropic API Documentation — API key authentication, HTTPS, rate limiting

highVerified: 2026-07-09

🔒Privacy & Compliance

Exceptional privacy with ephemeral data handling. HIPAA eligible. Strong compliance posture for regulated industries.

data residency

Review of enterprise documentation

Evidence

Anthropic Enterprise Documentation — Data residency options available

highVerified: 2026-07-09

training data optout

Analysis of privacy policy

Evidence

Anthropic Privacy Policy — No training on API data by default

highVerified: 2026-07-09

data retention

Review of terms of service

Evidence

Anthropic Terms of Service — Ephemeral processing, no storage

highVerified: 2026-07-09

pii handling

Review of data protection capabilities

Evidence

Anthropic Privacy Documentation — Customer responsible for PII redaction

mediumVerified: 2026-07-09

compliance certifications

Verification of compliance certifications

Evidence

Anthropic Trust Center — SOC 2 Type II, GDPR compliant, HIPAA eligible

highVerified: 2026-07-09

zero data retention

Review of data handling practices

Evidence

Anthropic API Documentation — Ephemeral data processing

highVerified: 2026-07-09

👁️Trust & Transparency

Strong transparency with Constitutional AI and extended thinking feature. Comprehensive model card available.

explainability

Evaluation of reasoning transparency

Evidence

Extended Thinking Feature — Extended thinking mode shows reasoning process

highVerified: 2026-07-09

hallucination rate

Testing on factual QA datasets

Evidence

SimpleQA Benchmark — Good factual accuracy

mediumVerified: 2026-07-09

bias fairness

Evaluation on bias benchmarks

Evidence

Anthropic Responsible Scaling Policy — Regular bias testing and mitigation

mediumVerified: 2026-07-09

uncertainty quantification

Qualitative assessment

Evidence

Model Behavior — Good uncertainty expression

mediumVerified: 2026-07-09

model card quality

Review of documentation

Evidence

Anthropic Model Card — Comprehensive system card with detailed evaluations

highVerified: 2026-07-09

training data transparency

Review of public disclosures

Evidence

Anthropic Public Statements — General description provided, cutoff March 2025

mediumVerified: 2026-07-09

guardrails

Analysis of safety mechanisms

Evidence

Constitutional AI — Built-in Constitutional AI guardrails

highVerified: 2026-07-09

⚙️Operational Excellence

Model retired 2026-06-15 on Anthropic-operated platforms; API requests fail. Migration target is Claude Sonnet 4.6. Versioning, ecosystem, and overall scores reduced to reflect retirement.

api design quality

Review of API design

Evidence

Anthropic API Documentation — Well-designed RESTful API

highVerified: 2026-07-09

sdk quality

Review of SDK quality

Evidence

Anthropic SDKs — Official SDKs for Python, TypeScript

highVerified: 2026-07-09

versioning policy

Review of versioning

Evidence

Anthropic API Versioning — 6-month deprecation notice

Anthropic Model Deprecations — claude-sonnet-4-20250514 deprecated 2026-04-14 and retired 2026-06-15; requests fail; recommended replacement claude-sonnet-4-6

highVerified: 2026-07-09

monitoring observability

Review of monitoring tools

Evidence

Anthropic Console — Usage dashboard with metrics

mediumVerified: 2026-07-09

support quality

Assessment of support

Evidence

Anthropic Support — Email support, Discord, comprehensive docs

highVerified: 2026-07-09

ecosystem maturity

Analysis of ecosystem

Evidence

GitHub Ecosystem — Mature ecosystem with integrations

highVerified: 2026-07-09

license terms

Review of licensing

Evidence

Anthropic Terms of Service — Clear commercial terms

highVerified: 2026-07-09

Strengths

+Exceptional coding performance (72.7% SWE-bench)
+Hybrid model with extended thinking for complex reasoning
+Excellent privacy posture with ephemeral data handling
+HIPAA eligible for healthcare applications
+Large 200K context window for document processing
+Constitutional AI provides robust safety

Limitations

!RETIRED 2026-06-15 — no longer available on the Claude API; requests fail (migrate to Claude Sonnet 4.6)
!Higher latency in extended thinking mode
!Training data cutoff March 2025
!No built-in PII detection
!Premium pricing ($3/$15 per 1M tokens)
!Superseded by Sonnet 4.5, 4.6, and Claude Sonnet 5

Metadata

pricing

input: $3.00 per 1M tokens

output: $15.00 per 1M tokens

notes: Historical pricing. Model retired 2026-06-15 — no longer purchasable on Anthropic-operated platforms.

last verified: 2026-07-09

context window: 200000

max output tokens: 64000

languages

0: English

1: Spanish

2: French

3: German

4: Italian

5: Portuguese

6: Japanese

7: Korean

8: Chinese

9: Arabic

10: Hindi

modalities

0: text

1: image (input)

2: document

api endpoint: https://api.anthropic.com/v1/messages

open source: false

architecture: Transformer-based with Constitutional AI and extended thinking

parameters: Not disclosed

training cutoff: March 2025

Use Case Ratings

code generation

Historically exceptional coding (72.7% SWE-bench). Retired — use Sonnet 4.6 or newer.

customer support

Excellent for customer support with empathetic responses and fast latency.

content creation

Strong content creation with natural writing style. Large context window helpful.

data analysis

Strong analytical capabilities with extended thinking for complex analysis.

research assistant

Excellent for research with strong summarization and 200K context window.

legal compliance

Strong privacy posture (HIPAA eligible) and careful reasoning for legal tasks.

healthcare

HIPAA eligible with strong privacy. Good for clinical documentation with oversight.

financial analysis

Strong financial analysis with extended thinking for complex modeling.

education

Good for educational content with patient explanations and strong knowledge.

creative writing

Strong creative writing with natural style. Good dialogue and character development.

Similar Models

Claude Sonnet 4.6

Anthropic

Claude Sonnet 4.5

Anthropic

Claude Opus 4.8

Anthropic

Claude Haiku 4.5

Anthropic