Gemma 4

v4.0

Google

Modelopen-sourceapache-2-0open-weightsedge
81
Strong
About This Model

Google's open-weight family released April 2026 under Apache 2.0 (a shift from the custom Gemma license). Spans E2B/E4B edge models with 128K context and native audio up to a 31B dense model with 256K context. The 31B scores ~1452 on LMArena, No. 3 among open models.

Last Evaluated: June 10, 2026
Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability
+

Strongest open-weight showing from Google to date: 31B at ~1452 LMArena (No. 3 open). MoE 26B-A4B offers near-dense quality at 4B active params. Performance below proprietary frontier but excellent per-parameter efficiency.

task accuracy code

Vendor-reported coding benchmarks compared against open-weight peer class

Evidence
Google Gemma 4 AnnouncementSubstantial coding gains over Gemma 3 across family; strong for open-weight class, below frontier proprietary models
mediumVerified: 2026-06-10
task accuracy reasoning

Reasoning benchmark review from launch materials and open-model leaderboards

Evidence
Hugging Face Gemma 4 BlogReasoning improvements over Gemma 3; 26B-A4B MoE delivers near-dense quality with 4B active parameters
mediumVerified: 2026-06-10
task accuracy general

Crowdsourced human preference rankings on LMArena

Evidence
LMArena LeaderboardGemma 4 31B ~1452 Elo (No. 3 among open models); 26B-A4B MoE ~1441 with only 4B active params
highVerified: 2026-06-10
output consistency

Community reports across deployment stacks; high variance by quantization level

Evidence
Community testingConsistency depends on quantization and inference stack chosen by deployer
lowVerified: 2026-06-10
latency p50

Self-hosted model; latency is a function of deployer infrastructure

Evidence
Hugging Face Gemma 4 BlogE2B (2.3B effective) and E4B (4.5B) run on-device; latency depends entirely on hardware and serving stack
lowVerified: 2026-06-10
context window

Official specification from launch announcement

Evidence
Google Gemma 4 Announcement256K context on 12B/26B-A4B/31B; 128K on E2B/E4B edge variants
highVerified: 2026-06-10
uptime

No single provider SLA; assessed as deployment-dependent

Evidence
Hugging Face Model HubOpen weights; availability depends on chosen hosting (self-hosted, Vertex AI, or third-party providers)
highVerified: 2026-06-10
🛡️Security
+

Security profile is deployment-dependent: excellent data isolation when self-hosted, but guardrails are removable and there is no managed abuse filtering unless the deployer adds it (e.g., ShieldGemma, Vertex AI).

prompt injection resistance

OWASP LLM01 assessment relative to model class; deployer must add input filtering

Evidence
Gemma Responsible AI ToolkitSafety tuning applied, but smaller open models are generally more susceptible than frontier hosted models; no managed input filtering by default
lowVerified: 2026-06-10
jailbreak resistance

Adversarial testing of instruction-tuned checkpoints; open weights inherently allow guardrail removal

Evidence
Gemma Safety DocumentationInstruction-tuned variants include safety alignment, but open weights permit fine-tuning that removes guardrails
lowVerified: 2026-06-10
data leakage prevention

Architectural assessment: no third-party data flow when self-hosted

Evidence
Self-hosted deployment modelSelf-hosting means no prompts or outputs leave deployer infrastructure
highVerified: 2026-06-10
output safety

Safety testing of released checkpoints and available companion classifiers

Evidence
Gemma Responsible AI ToolkitSafety-tuned checkpoints plus companion safety classifiers (ShieldGemma line) available
mediumVerified: 2026-06-10
api security

Assessment of typical self-hosted serving stacks vs managed alternatives

Evidence
Deployment optionsNo first-party managed API security; depends on serving stack (Vertex AI managed endpoints inherit GCP controls)
lowVerified: 2026-06-10
🔒Privacy & Compliance
+

Best-in-class data sovereignty: nothing leaves deployer infrastructure. The trade-off is that compliance certifications are not inherited from the model and must be built or bought by the deployer.

data residency

Architectural assessment of self-hosted deployment

Evidence
Open weights distributionWeights run anywhere: on-premises, air-gapped, or any cloud region
highVerified: 2026-06-10
training data optout

Architectural assessment: inference data never leaves deployer

Evidence
Self-hosted deployment modelNo user data ever transmitted to Google during inference; opt-out concern does not apply
highVerified: 2026-06-10
data retention

Architectural assessment

Evidence
Self-hosted deployment modelRetention policy is entirely the deployer's choice
highVerified: 2026-06-10
pii handling

Data flow analysis for self-hosted inference

Evidence
Self-hosted deployment modelPII never leaves deployer infrastructure, but redaction tooling must be self-implemented
mediumVerified: 2026-06-10
compliance certifications

Review of certification inheritance paths for open-weight deployments

Evidence
Deployment-dependent complianceModel itself carries no certifications; compliance (SOC 2, HIPAA, GDPR) must be achieved by the deployer's stack or inherited from a managed host like Vertex AI
mediumVerified: 2026-06-10
zero data retention

Architectural assessment

Evidence
Self-hosted deployment modelInherently zero retention when self-hosted
highVerified: 2026-06-10
👁️Trust & Transparency
+

High transparency by open-model standards: published technical report, architecture disclosure (including MoE active-parameter counts), and fully auditable weights. Apache 2.0 relicensing further reduces legal opacity.

explainability

Assessment of inspection capabilities afforded by open weights

Evidence
Open weights accessFull weight access enables interpretability research, logit inspection, and custom probing
mediumVerified: 2026-06-10
hallucination rate

Factual QA testing relative to model size class

Evidence
Community evaluationSmaller open models hallucinate more than frontier hosted models, especially E2B/E4B variants
lowVerified: 2026-06-10
bias fairness

Model card review and independent audit availability

Evidence
Gemma Model CardBias evaluations published in model card; open weights allow independent auditing
mediumVerified: 2026-06-10
uncertainty quantification

Calibration assessment; logprob access partially offsets weaker verbal uncertainty

Evidence
Open weights accessRaw logprobs fully accessible for custom calibration, but model self-expression of uncertainty is weaker than frontier tier
lowVerified: 2026-06-10
model card quality

Documentation completeness review

Evidence
Gemma 4 Technical Report and Model CardsDetailed technical report, per-size model cards, architecture details (MoE config, effective params), and evaluation suite published
highVerified: 2026-06-10
training data transparency

Public disclosure review against open-model norms

Evidence
Gemma 4 Technical ReportTraining data composition described at category level (better than most proprietary models); exact corpus not released
mediumVerified: 2026-06-10
guardrails

Analysis of built-in and companion safety mechanisms

Evidence
Responsible AI ToolkitSafety-tuned checkpoints plus optional classifier models; guardrails are removable by design in open weights
mediumVerified: 2026-06-10
⚙️Operational Excellence
+

Apache 2.0 relicensing is the headline trust improvement: prior Gemma generations carried custom-license use restrictions. Operational burden (monitoring, scaling, support) falls on the deployer, as with any open-weight model.

api design quality

Review of available serving interfaces and their consistency

Evidence
Deployment optionsNo single first-party API; served via Vertex AI, Hugging Face TGI, vLLM, Ollama, llama.cpp with varying interfaces
mediumVerified: 2026-06-10
sdk quality

Ecosystem tooling support assessment

Evidence
Hugging Face TransformersDay-one support in Transformers, vLLM, llama.cpp, Ollama, MLX, and Keras
highVerified: 2026-06-10
versioning policy

Release cadence and immutability review

Evidence
Gemma release historyClear generational releases (supersedes Gemma 3); pinned weights never change once published
mediumVerified: 2026-06-10
monitoring observability

Assessment of out-of-box observability versus managed APIs

Evidence
Self-hosted deployment modelNo built-in monitoring; deployer must assemble observability from serving-stack tooling
mediumVerified: 2026-06-10
support quality

Support channel assessment for open-weight distribution

Evidence
Community channelsCommunity support (Hugging Face, GitHub, Discord); no SLA unless deployed via managed platforms
mediumVerified: 2026-06-10
ecosystem maturity

Third-party integration and adoption analysis

Evidence
Hugging Face Gemma 4 LaunchDay-one integration across the open-model ecosystem; Gemma family has hundreds of millions of cumulative downloads
highVerified: 2026-06-10
license terms

License analysis; Apache 2.0 is OSI-approved with no usage restrictions

Evidence
Google Gemma 4 AnnouncementApache 2.0 — a shift from the custom Gemma license, removing use-restriction ambiguity for commercial deployment
highVerified: 2026-06-10
Strengths
  • +Apache 2.0 license — removes custom-license restrictions of prior Gemma generations
  • +Top-3 open model: 31B at ~1452 LMArena Elo
  • +Efficient MoE: 26B-A4B reaches ~1441 Elo with only 4B active parameters
  • +Full data sovereignty: self-hosted inference, zero data leaves deployer
  • +Edge-capable E2B/E4B variants with 128K context and native audio
  • +256K context on 12B/26B/31B variants — large for open weights
  • +Multimodal input: text, image, and video
Limitations
  • !No inherited compliance certifications; deployer builds or buys SOC 2/HIPAA posture
  • !Safety guardrails removable via fine-tuning (inherent to open weights)
  • !No first-party SLA or managed support outside Vertex AI hosting
  • !Hallucination and reasoning depth below frontier hosted models, especially E2B/E4B
  • !Operational burden (serving, scaling, monitoring) falls on deployer
  • !Performance varies significantly with quantization choices
Metadata
pricing
input: Free (open weights; compute costs only)
output: Free (open weights; compute costs only)
notes: Apache 2.0. Self-hosting compute is the only cost; managed hosting available via Vertex AI and third-party providers.
last verified: 2026-06-10
context window: 262144
max output: 32768
languages
0: English
1: 140+ languages
modalities
0: text
1: image (input)
2: video (input)
3: audio (input, E2B/E4B)
api endpoint: https://huggingface.co/google
open source: true
architecture: Family: E2B (2.3B effective) and E4B (4.5B) edge models; 12B dense; 26B-A4B MoE (4B active); 31B dense
parameters: 2.3B effective (E2B) to 31B dense; 26B MoE with 4B active
knowledge cutoff: Late 2025 (not officially confirmed)
release date: 2026-04-02

Use Case Ratings

code generation

Capable for an open model, especially 31B with 256K context, but well below frontier proprietary coding models.

customer support

26B-A4B MoE (4B active) gives strong quality at low serving cost for high-volume support; E4B enables on-device assistants.

content creation

Solid drafting quality at 31B (~1452 LMArena); fully private content pipelines possible.

education

E2B/E4B with native audio enable offline, on-device tutoring in low-connectivity settings.

healthcare

Self-hosting suits strict data sovereignty (PHI never leaves infrastructure), but deployer carries the full compliance and accuracy-validation burden.

research assistant

256K context on 31B handles long documents; auditable weights suit reproducible research. Reasoning depth below frontier.