GPT-OSS-120B
OpenAI
Overall Trust Score
OpenAI's first open-weight model release since GPT-2, launched in August 2025. 117B total parameters (5.1B active per token), Apache 2.0 license. Matches o4-mini on many benchmarks and runs on a single 80GB GPU.
Trust Vector
Performance & Reliability
Flagship open-weight performance. The MoE architecture activates only 5.1B of the 117B total parameters per token. Matches or beats o4-mini on most benchmarks.
task accuracy (code): 92
task accuracy (reasoning): 93
task accuracy (general): 89
output consistency: 87
latency p50: 1.0s
latency p95: 2.2s
context window: 128,000 tokens
uptime: 99
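The sparse activation behind these numbers can be sketched in miniature: a router scores every expert per token, and only the top-k experts actually run. This is a toy illustration, not the real gpt-oss-120b configuration; the expert count, top-k, and dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: num_experts feed-forward experts, only top_k run per token.
# Sizes are illustrative, not gpt-oss-120b's actual configuration.
num_experts, top_k, d_model = 8, 2, 16

router_w = rng.standard_normal((d_model, num_experts))
experts_w = rng.standard_normal((num_experts, d_model, d_model))

def moe_forward(x):
    """x: (d_model,) single token embedding. Mixes the outputs of the top_k experts."""
    logits = x @ router_w                    # score every expert for this token
    top = np.argsort(logits)[-top_k:]        # indices of the k highest-scoring experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over the winners
    # Only the top_k expert matmuls execute; the other experts' weights sit idle.
    # This is how a 117B-parameter model does only ~5.1B params of work per token.
    return sum(g * (x @ experts_w[i]) for g, i in zip(gates, top))

out = moe_forward(rng.standard_normal(d_model))
print(out.shape)  # (16,)
```

The memory cost still scales with total parameters (all experts must be resident), while compute scales with active parameters, which is why the model needs 80GB of memory but serves at small-model latency.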
Security
Good base security. Self-hosting provides complete control over safety guardrails and data handling.
prompt injection resistance: 82
jailbreak resistance: 81
data leakage prevention: 95
output safety: 83
API security: 90
Privacy & Compliance
Perfect privacy when self-hosted. No data sent to OpenAI. Full compliance control. Ideal for regulated industries.
data residency: Anywhere (self-hosted)
training data opt-out: 100
data retention: 0 days (self-controlled)
PII handling: 100
compliance certifications: 95
zero data retention: 100
Trust & Transparency
Exceptional transparency. Full chain-of-thought access. Complete model weights and architecture disclosed. Open-source enables auditing.
explainability: 96
hallucination rate: 85
bias & fairness: 83
uncertainty quantification: 86
model card quality: 98
training data transparency: 90
guardrails: 85
Operational Excellence
Exceptional operational flexibility. Apache 2.0 enables commercial use. Massive deployment ecosystem. Self-host or use managed platforms.
API design quality: 92
SDK quality: 96
versioning policy: 98
monitoring & observability: 94
support quality: 88
ecosystem maturity: 97
license terms: 100
✨ Strengths
- Apache 2.0 open-weight license enables commercial use without restrictions
- Matches or beats o4-mini on coding, math, and health benchmarks
- Complete data privacy when self-hosted (zero external data transmission)
- Full chain-of-thought reasoning access for transparency and debugging
- MoE architecture: 5.1B active of 117B total params, runs in 80GB memory
- Massive deployment ecosystem (Azure, AWS, Hugging Face, vLLM, Ollama)
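The self-hosting paths above can be exercised in a couple of commands. The model identifiers shown (`gpt-oss:120b` on Ollama, `openai/gpt-oss-120b` on Hugging Face/vLLM) are the names used at release; treat them as assumptions to verify against the current registries.

```shell
# Option 1: Ollama (quantized, single machine)
ollama pull gpt-oss:120b
ollama run gpt-oss:120b "Summarize the tradeoffs of MoE models."

# Option 2: vLLM serving an OpenAI-compatible API (80GB-class GPU assumed)
pip install vllm
vllm serve openai/gpt-oss-120b
```

Both options expose a local endpoint, so existing OpenAI-client code can be pointed at it with only a base-URL change.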
⚠️ Limitations
- Requires 80GB of GPU memory (H100 or equivalent)
- Self-hosting complexity and infrastructure costs
- Community support rather than an enterprise SLA
- Slightly lower performance than flagship closed models
- No built-in safety guardrails (customizable, but requires setup)
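The 80GB figure in the first limitation follows from the quantization the weights ship in. A rough back-of-envelope check; the ~4.25 bits/parameter figure for MXFP4 is an assumption taken from the release notes, not a measured value:

```python
# Rough memory estimate for gpt-oss-120b weights under MXFP4 quantization.
# Assumption: ~4.25 bits/param (4-bit values plus shared block scales) for the
# dominant MoE expert weights; higher-precision layers add a few more GB.
total_params = 117e9
bits_per_param = 4.25
weights_gb = total_params * bits_per_param / 8 / 1e9
print(f"{weights_gb:.0f} GB")  # ~62 GB of weights, leaving headroom for the
                               # KV cache and activations on an 80 GB GPU
```

At 16-bit precision the same weights would need roughly 234GB, so the quantized release is what makes single-GPU deployment feasible.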
📊 Metadata
Use Case Ratings
code generation
Excellent coding. Matches o4-mini. Configurable reasoning effort. Full chain-of-thought debugging.
customer support
Good for customer support. Self-host for complete data privacy. Configurable reasoning for cost control.
content creation
Strong content creation. Self-hosting enables unlimited generation without API costs.
data analysis
Excellent for data analysis. Keep sensitive data on-premises. Full chain-of-thought for transparency.
research assistant
Outstanding for research. 128K context. Self-host proprietary research data. Full reasoning transparency.
legal compliance
Perfect for legal. Self-host for complete compliance. No data leaves premises. Apache 2.0 license clarity.
healthcare
Ideal for healthcare. Self-host for HIPAA. Complete PHI privacy. No external data transmission.
financial analysis
Excellent for finance. Outperforms o3-mini on math. Self-host proprietary financial data.
education
Great for education. Full chain-of-thought shows reasoning steps. Self-host for institutional control.
creative writing
Good creative writing. Unlimited generation when self-hosted. No API costs for iteration.
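Several of the ratings above mention configurable reasoning effort. In the harmony chat format, gpt-oss reads the effort level from the system message; a minimal sketch of building such a request follows. The `Reasoning: low|medium|high` system-prompt convention is taken from the gpt-oss release materials, and the model name assumes a locally served OpenAI-compatible endpoint.

```python
import json

def build_request(prompt, effort="high"):
    """Build a chat-completions request body selecting gpt-oss reasoning effort.

    Assumption: the server applies the harmony chat template, which picks up
    the "Reasoning: <level>" line from the system message.
    """
    assert effort in ("low", "medium", "high")
    return {
        "model": "openai/gpt-oss-120b",
        "messages": [
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": prompt},
        ],
    }

body = build_request("Prove that sqrt(2) is irrational.", effort="medium")
print(json.dumps(body, indent=2))
```

Lower effort trades chain-of-thought depth for latency and token cost, which is the "cost control" knob referenced in the customer-support and code-generation ratings.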