GLM-5

v20260211

Z.ai (Zhipu AI)

Modelcodingreasoningopen-sourcemit-license
82
Strong
About This Model

Z.ai's MIT-licensed 744B-parameter MoE (40B active) with 77.8% SWE-bench Verified, 92.7% AIME 2026, and open-source leadership on BrowseComp and agentic benchmarks. Trained on 28.5T tokens with DeepSeek Sparse Attention.

Last Evaluated: June 10, 2026
Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability
+

Among the strongest open-weight models released to date: 77.8% SWE-bench Verified, 92.7% AIME 2026, 86.0% GPQA-Diamond, with independent confirmation of open-source leadership on BrowseComp, Vending Bench 2, and MCP-Atlas.

task accuracy code

Industry-standard coding benchmarks with independent third-party verification

Evidence
SWE-bench Verified77.8% resolution rate
Artificial Analysis independent evaluationTop-tier open-weight coding performance, confirmed by independent harness
highVerified: 2026-06-10
task accuracy reasoning

Graduate and competition-level reasoning benchmarks requiring multi-step problem solving

Evidence
AIME 202692.7% on competition mathematics
GPQA-Diamond86.0% on PhD-level science questions
highVerified: 2026-06-10
task accuracy general

Independent agentic and general-capability benchmarking across domains

Evidence
Artificial AnalysisOpen-source leader on BrowseComp, Vending Bench 2, and MCP-Atlas agentic benchmarks
highVerified: 2026-06-10
output consistency

Community testing with repeated prompts and long agent runs

Evidence
GLM-5 GitHub repositoryStable behavior across agentic trajectories; reproducible deployment recipes published
mediumVerified: 2026-06-10
latency p50

Median latency for API requests with standard prompt sizes

Evidence
Artificial AnalysisTypical first-party API response ~2.6s; DeepSeek Sparse Attention keeps long-context latency manageable
mediumVerified: 2026-06-10
latency p95

95th percentile response time across diverse workloads

Evidence
Artificial Analysisp95 ~6.0s across diverse workloads
mediumVerified: 2026-06-10
context window

Official specification from model card

Evidence
GLM-5 Model Card200K token context window
highVerified: 2026-06-10
uptime

Review of platform availability and self-hosting fallback options

Evidence
Z.ai PlatformFirst-party API generally stable; open weights enable self-hosted redundancy
mediumVerified: 2026-06-10
🛡️Security
+

Solid for an open model; no published third-party security audit. Self-hosting shifts responsibility to the deployer.

prompt injection resistance

Review of safety documentation and community testing against OWASP LLM01 patterns

Evidence
GLM-5 documentationSafety tuning applied; resilient in agentic browsing benchmarks but no dedicated third-party injection audit published
mediumVerified: 2026-06-10
jailbreak resistance

Testing against adversarial prompt datasets; deployer-dependent for self-hosted use

Evidence
Community red-teamingStandard alignment; open weights allow guardrail removal in derivatives
mediumVerified: 2026-06-10
data leakage prevention

Analysis of privacy policies and self-hosting data-control options

Evidence
Z.ai Privacy PolicyStandard data handling on first-party API; full control when self-hosted
mediumVerified: 2026-06-10
output safety

Safety testing across harmful content categories

Evidence
GLM-5 Model CardSafety post-training; refusal behavior comparable to peer open frontier models
mediumVerified: 2026-06-10
api security

Review of API security features and best practices

Evidence
Z.ai API DocumentationAPI key authentication, HTTPS only, rate limiting; OpenAI-compatible endpoints
mediumVerified: 2026-06-10
🔒Privacy & Compliance
+

First-party Z.ai API operates under Chinese jurisdiction — a material caveat for Western regulated industries. The unencumbered MIT license makes self-hosting or Western-host deployment a clean mitigation.

data residency

Review of provider jurisdiction and third-party hosting options

Evidence
Z.ai Platform DocumentationZhipu/Z.ai is a China-based provider; first-party API data processed under Chinese jurisdiction
OpenRouter availabilityMIT weights hosted by Western inference providers, enabling non-China residency
mediumVerified: 2026-06-10
training data optout

Analysis of privacy policy and data usage terms

Evidence
Z.ai Privacy PolicyStandard API data terms; self-hosting removes the concern entirely
mediumVerified: 2026-06-10
data retention

Review of terms of service and deployment-dependent retention

Evidence
Z.ai Terms of ServiceFirst-party retention governed by Chinese data regulations; self-hosted deployments retain nothing externally
mediumVerified: 2026-06-10
pii handling

Review of data protection capabilities and customer responsibilities

Evidence
Z.ai DocumentationCustomer responsible for PII redaction; no managed PII tooling
mediumVerified: 2026-06-10
compliance certifications

Verification of compliance certifications and audit reports

Evidence
Z.ai public materialsNo published SOC 2 / HIPAA / GDPR attestations for the first-party API; Western hosts may carry their own certifications
mediumVerified: 2026-06-10
zero data retention

Review of self-hosting deployment options enabling zero retention

Evidence
Open weights on Hugging FaceMIT-licensed self-hosting gives complete data control and zero external retention
mediumVerified: 2026-06-10
👁️Trust & Transparency
+

Above-average transparency for an open frontier model: architecture, training scale, and benchmarks well documented with independent verification. Bias and safety evaluations remain thin.

explainability

Evaluation of reasoning transparency and trajectory inspectability

Evidence
GLM-5 documentationReasoning traces and agentic tool-call logs inspectable; strong MCP-Atlas results reflect transparent tool use
mediumVerified: 2026-06-10
hallucination rate

Testing on factual QA and grounded research benchmarks

Evidence
BrowseComp resultsOpen-source leader on grounded web-research benchmark, indicating disciplined sourcing behavior
mediumVerified: 2026-06-10
bias fairness

Review of published bias benchmarks and community evaluations

Evidence
GLM-5 Model CardLimited published bias evaluation
lowVerified: 2026-06-10
uncertainty quantification

Qualitative assessment of confidence expression in outputs

Evidence
Model behavior testingExpresses uncertainty adequately; no calibrated confidence outputs
mediumVerified: 2026-06-10
model card quality

Review of documentation completeness and clarity

Evidence
Hugging Face model card and GitHub repoDetailed disclosure: 744B/40B MoE, 256 experts, DeepSeek Sparse Attention, 28.5T pretraining tokens, full benchmark tables
highVerified: 2026-06-10
training data transparency

Review of public disclosures about training data

Evidence
GLM-5 technical disclosurePretraining scale (28.5T tokens) and recipe outlined; detailed data sources not disclosed
mediumVerified: 2026-06-10
guardrails

Analysis of built-in safety mechanisms

Evidence
GLM-5 Model CardBuilt-in safety tuning; deployers of open weights must layer their own guardrails
mediumVerified: 2026-06-10
⚙️Operational Excellence
+

Clean MIT licensing and strong ecosystem support. Note the rapid successor cadence: GLM-5.1 (same architecture) shipped within two months; evaluate which version your hosts actually serve.

api design quality

Review of API design, consistency, and feature completeness

Evidence
Z.ai API DocumentationOpenAI-compatible API with streaming, tool calling, structured output
highVerified: 2026-06-10
sdk quality

Review of SDK quality, documentation, and maintenance

Evidence
Z.ai GitHub organizationOfficial repos with deployment recipes; OpenAI-compatible so mainstream SDKs work
mediumVerified: 2026-06-10
versioning policy

Review of versioning practices and weight availability across releases

Evidence
GLM release historyFast cadence: GLM-5.1 API launched 2026-03-27 with weights on 2026-04-07, same architecture; GLM-5 weights remain available
mediumVerified: 2026-06-10
monitoring observability

Review of available monitoring tools and metrics

Evidence
Z.ai PlatformBasic usage dashboard; self-hosted observability is deployer-built
mediumVerified: 2026-06-10
support quality

Assessment of documentation, community, and support responsiveness

Evidence
Z.ai community channelsActive GitHub support and documentation; limited English-language enterprise support
mediumVerified: 2026-06-10
ecosystem maturity

Analysis of third-party hosting, integrations, and tooling

Evidence
Inference ecosystemDay-one vLLM/SGLang support, OpenRouter and major Western host availability
highVerified: 2026-06-10
license terms

Review of licensing terms and restrictions

Evidence
MIT LicenseUnencumbered MIT license, unrestricted commercial use and derivatives
highVerified: 2026-06-10
Strengths
  • +Frontier-competitive results: 77.8% SWE-bench Verified, 92.7% AIME 2026, 86.0% GPQA-Diamond
  • +Independently verified open-source leadership on BrowseComp, Vending Bench 2, and MCP-Atlas
  • +Unencumbered MIT license with full self-hosting rights
  • +Efficient inference: 40B active of 744B total with DeepSeek Sparse Attention
  • +Very competitive API pricing (~$0.60/$1.92 per 1M tokens)
  • +Well-documented training scale (28.5T tokens) and architecture
Limitations
  • !First-party Z.ai API processes data under Chinese jurisdiction with limited Western compliance certifications
  • !Text-only — no vision or audio modalities
  • !Rapid successor cadence (GLM-5.1 within two months) creates version-tracking overhead
  • !Limited published bias, safety, and red-team evaluations
  • !Self-hosting a 744B MoE requires substantial GPU infrastructure
  • !English-language enterprise support is thin compared to Western providers
Metadata
pricing
input: $0.60 per 1M tokens (approx.)
output: $1.92 per 1M tokens (approx.)
notes: First-party Z.ai API pricing; third-party hosts vary. Successor GLM-5.1 priced similarly.
last verified: 2026-06-10
context window: 200000
languages
0: English
1: Chinese
2: Japanese
3: Korean
4: Spanish
5: French
6: German
modalities
0: text
api endpoint: https://api.z.ai/api/paas/v4/chat/completions
open source: true
license: MIT
architecture: Mixture-of-Experts: 744B total / 40B active parameters, 256 experts, DeepSeek Sparse Attention; 28.5T pretraining tokens
parameters: 744B total / 40B active
release date: 2026-02-11

Use Case Ratings

code generation

77.8% SWE-bench Verified with independent verification; best-in-class open-weight coding at ~$0.60/$1.92 per 1M.

customer support

Capable and inexpensive, but not specialized for support workflows.

content creation

Strong long-form generation at very low cost.

data analysis

Excellent mathematical reasoning (92.7% AIME 2026) and agentic tool use for analysis pipelines.

research assistant

Open-source leader on BrowseComp and MCP-Atlas; strong grounded web research.

legal compliance

China-jurisdiction first-party API and absent Western certifications are blockers unless self-hosted.

healthcare

Not recommended via first-party API; self-hosted deployment in a compliant environment is the only viable path.

financial analysis

Top-tier quantitative reasoning; regulated firms should self-host or use certified Western hosts.

education

Outstanding math and science tutoring (92.7% AIME, 86.0% GPQA-Diamond) at budget pricing.

creative writing

Competent prose; optimized for reasoning and agentic work rather than creative style.