Qwen3.5

v20260216

Alibaba

Modelopen-sourceapache-2-0multimodalmultilingual
85
Strong
About This Model

Alibaba's Apache-2.0 flagship open model: Qwen3.5-397B-A17B, a hybrid MoE with 512 experts (397B total / 17B active) that is natively multimodal, supports 262K context (1M on hosted Qwen3.5-Plus) and 201 languages, and beats Alibaba's own API-only 1T-parameter Qwen3-Max while decoding up to 19x faster at long context.

Last Evaluated: June 10, 2026
Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability
+

Beats Alibaba's own 1T-parameter API-only Qwen3-Max with only 17B active parameters, with up to 19x faster decode at 256K context. Native multimodality and 201-language coverage are unmatched among open models.

task accuracy code

Vendor benchmarks corroborated by independent press coverage and community leaderboards

Evidence
Qwen3.5 Release BlogStrong agentic coding results; flagship 397B-A17B surpasses Qwen3-Max on coding benchmarks
VentureBeatIndependent reporting confirms 397B-A17B beats the 1T-parameter, API-only Qwen3-Max
highVerified: 2026-06-10
task accuracy reasoning

Mathematical and agentic reasoning benchmarks from the model card and release blog, cross-checked against community evaluations

Evidence
Qwen3.5 Release BlogFrontier-class math and agentic reasoning under the 'Towards Native Multimodal Agents' positioning
Hugging Face Model CardDetailed benchmark tables across reasoning suites on the official model card
highVerified: 2026-06-10
task accuracy general

Comprehensive knowledge and multimodal benchmark review including multilingual coverage

Evidence
Qwen3.5 Release BlogNatively multimodal (text + vision) with strong general knowledge across 201 languages
highVerified: 2026-06-10
output consistency

Repeated-prompt testing across temperature settings, supplemented by community reports

Evidence
Hugging Face Model CardStable hybrid-MoE routing; consistent outputs across repeated runs in community testing
mediumVerified: 2026-06-10
latency p50

Median latency on hosted endpoints and decode-throughput comparisons from independent reporting

Evidence
VentureBeatUp to 19x faster decode than Qwen3-Max at 256K context thanks to the sparse 17B-active design
mediumVerified: 2026-06-10
latency p95

95th percentile response time across diverse workloads from independent benchmarking

Evidence
Artificial Analysisp95 ~3.8s on hosted endpoints for standard workloads
mediumVerified: 2026-06-10
context window

Official specification from model card and Alibaba Cloud documentation

Evidence
Hugging Face Model Card262K native context for open weights; hosted Qwen3.5-Plus extends to 1M tokens
highVerified: 2026-06-10
uptime

Hosted-platform availability history plus redundancy across third-party hosts

Evidence
Alibaba Cloud Model StudioStable availability on Alibaba Cloud (incl. Singapore region) plus many third-party hosts and self-hosting
mediumVerified: 2026-06-10
🛡️Security
+

Solid default guardrails with notably broad multilingual safety coverage. Multimodal inputs widen the attack surface; open weights shift responsibility to deployers who fine-tune.

prompt injection resistance

Testing against OWASP LLM01 prompt injection patterns, including image-borne injection for multimodal inputs

Evidence
Community red-team evaluationsGood resistance to common injection patterns; multimodal inputs add an image-based injection surface
mediumVerified: 2026-06-10
jailbreak resistance

Adversarial prompt testing; assessment accounts for open-weight modifiability

Evidence
Qwen Safety DocumentationSafety post-training across the family; open weights mean alignment is removable downstream
mediumVerified: 2026-06-10
data leakage prevention

Analysis of hosted-platform policies plus the self-hosting option for full data isolation

Evidence
Alibaba Cloud Privacy DocumentationStandard data handling on hosted endpoints; self-hosting gives complete data control
mediumVerified: 2026-06-10
output safety

Safety testing across harmful content categories and multiple languages on default weights

Evidence
Qwen3.5 Release BlogMultilingual safety filtering across 201 languages; refusal behavior consistent in community testing
mediumVerified: 2026-06-10
api security

Review of API security features on the first-party hosted platform

Evidence
Alibaba Cloud Model StudioAPI key authentication, HTTPS, RAM-based access control, and rate limiting on Alibaba Cloud
highVerified: 2026-06-10
🔒Privacy & Compliance
+

Alibaba's first-party API is China-jurisdiction (Singapore region available), which concerns Western regulated buyers; Apache-2.0 self-hosting or Western third-party hosting fully avoids that. Alibaba Cloud's infrastructure certifications are stronger than DeepSeek's platform but still lack HIPAA/FedRAMP for the model service.

data residency

Review of hosting regions and licensing; China-jurisdiction caveat applies to Alibaba's first-party API, not self-hosted or Western-hosted deployments

Evidence
Alibaba Cloud RegionsFirst-party hosting on Alibaba Cloud is China-jurisdiction (with a Singapore international region); Apache-2.0 weights allow deployment in any jurisdiction
highVerified: 2026-06-10
training data optout

Analysis of hosted-platform data usage terms

Evidence
Alibaba Cloud Model Studio TermsEnterprise tier does not train on customer data; self-hosting removes the concern entirely
mediumVerified: 2026-06-10
data retention

Review of hosted-platform retention policies; retention is deployment-dependent for open-weight models

Evidence
Alibaba Cloud Trust CenterHosted retention follows Alibaba Cloud regional policies; self-hosted deployments retain nothing externally
mediumVerified: 2026-06-10
pii handling

Review of data protection capabilities and customer responsibilities

Evidence
Alibaba Cloud DocumentationCustomer responsible for PII redaction; Alibaba Cloud provides surrounding data-governance tooling
mediumVerified: 2026-06-10
compliance certifications

Verification of infrastructure certifications versus model-service-level compliance for Western regulated markets

Evidence
Alibaba Cloud Trust CenterAlibaba Cloud holds ISO 27001/SOC reports for its infrastructure, but no HIPAA/FedRAMP path for the model service; Western-host deployments inherit those hosts' certifications
mediumVerified: 2026-06-10
zero data retention

Review of data handling across first-party API, third-party hosts, and self-hosting

Evidence
Open-weight deployment optionsNo zero-retention guarantee on first-party hosting; self-hosting provides true zero external retention
mediumVerified: 2026-06-10
👁️Trust & Transparency
+

Strong open documentation and inspectable reasoning. Typical open-model gaps remain: limited training-data detail and topic-avoidance on politically sensitive subjects in default weights.

explainability

Evaluation of reasoning transparency and trace accessibility

Evidence
Qwen3.5 Release BlogHybrid thinking modes expose reasoning traces; fully inspectable when self-hosted
mediumVerified: 2026-06-10
hallucination rate

Testing on factual QA and multimodal grounding datasets

Evidence
Community factuality testingModerate hallucination rate, improved over Qwen3; grounding quality on vision inputs is strong
mediumVerified: 2026-06-10
bias fairness

Evaluation on bias benchmarks across languages and politically sensitive topic probes

Evidence
Independent bias evaluationsBroad multilingual fairness work; topic-avoidance on China-politically-sensitive subjects persists in default weights
mediumVerified: 2026-06-10
uncertainty quantification

Qualitative assessment of confidence expression in outputs

Evidence
Model behavior assessmentExpresses uncertainty in thinking mode; final-answer calibration is adequate but not exceptional
mediumVerified: 2026-06-10
model card quality

Review of model card and technical documentation completeness

Evidence
Hugging Face Model CardThorough model card with architecture details (512-expert hybrid MoE), benchmarks, usage guidance, and deployment recipes
highVerified: 2026-06-10
training data transparency

Review of public disclosures about training data

Evidence
Qwen3.5 Release BlogTraining methodology and multilingual/multimodal data strategy described at a high level; detailed composition not disclosed
mediumVerified: 2026-06-10
guardrails

Analysis of built-in safety mechanisms in default weights

Evidence
Qwen Safety DocumentationMultilingual safety alignment in released weights; removable by downstream fine-tuning
mediumVerified: 2026-06-10
⚙️Operational Excellence
+

Best-in-class open-model ecosystem: Apache 2.0 with patent grant, day-one inference-framework support, and a complete size ladder (0.8B to 397B-A17B) for matching capability to hardware. Supersedes the Qwen3 family and Qwen2.5-VL.

api design quality

Review of API design, consistency, and feature completeness

Evidence
Alibaba Cloud Model StudioOpenAI-compatible API with function calling, multimodal inputs, and hybrid thinking-mode controls
highVerified: 2026-06-10
sdk quality

Review of SDK and inference-framework support

Evidence
QwenLM GitHubDay-one support in vLLM, SGLang, and transformers; actively maintained official repos
highVerified: 2026-06-10
versioning policy

Review of release cadence and weight-availability guarantees

Evidence
Qwen Release HistoryFast release cadence (supersedes Qwen3 family and Qwen2.5-VL); open weights remain permanently available, softening deprecation impact
mediumVerified: 2026-06-10
monitoring observability

Review of monitoring tools across deployment options

Evidence
Alibaba Cloud Model StudioUsage dashboards and logging on Alibaba Cloud; full observability when self-hosting
mediumVerified: 2026-06-10
support quality

Assessment of support tiers, documentation, and community responsiveness

Evidence
Alibaba Cloud SupportAlibaba Cloud offers paid enterprise support tiers; Western-market support depth lags US hyperscalers; strong community channels
mediumVerified: 2026-06-10
ecosystem maturity

Analysis of derivative models, third-party hosting, and tooling integrations

Evidence
Hugging Face Qwen OrganizationLargest open-model ecosystem by derivative count; full size ladder from 397B-A17B and 122B-A10B down to 0.8B released Feb-Mar 2026
highVerified: 2026-06-10
license terms

Review of licensing terms and restrictions

Evidence
Hugging Face Model CardApache 2.0 across the family: unrestricted commercial use with explicit patent grant
highVerified: 2026-06-10
Strengths
  • +Beats Alibaba's own 1T-parameter API-only Qwen3-Max with just 17B active parameters (397B total)
  • +Natively multimodal open model: text + vision under 'Towards Native Multimodal Agents'
  • +Up to 19x faster decode than Qwen3-Max at 256K context; 262K native context (1M on hosted Qwen3.5-Plus)
  • +201-language coverage, the broadest of any open model
  • +Apache 2.0 license with patent grant across the entire family
  • +Complete size ladder (0.8B to 397B-A17B, Feb-Mar 2026) for matching capability to hardware
  • +Day-one vLLM/SGLang/transformers support and the largest open-model derivative ecosystem
Limitations
  • !First-party Alibaba Cloud hosting is China-jurisdiction (Singapore region available); no HIPAA/FedRAMP path for the model service — self-hosting or Western hosts avoid this
  • !1M context requires the hosted Qwen3.5-Plus; open weights cap at 262K
  • !Topic-avoidance on politically sensitive subjects in default weights
  • !Training-data composition disclosed only at a high level
  • !397B total parameters still require multi-GPU infrastructure to self-host despite the sparse 17B-active design
  • !Western-market enterprise support depth lags US hyperscalers
Metadata
pricing
input: Free weights (Apache 2.0); hosted from ~$0.40 per 1M tokens on Alibaba Cloud Model Studio
output: Hosted from ~$1.20 per 1M tokens; third-party hosts vary
notes: Self-hosting is infrastructure-cost-only; the 17B-active design keeps serving costs low for its capability class. Hosted Qwen3.5-Plus (1M context) priced separately.
last verified: 2026-06-10
context window: 262144
max output: 65536
languages
0: English
1: Chinese
2: Japanese
3: Korean
4: Spanish
5: French
6: German
7: Portuguese
8: Russian
9: Arabic
10: Hindi
11: Indonesian
12: Vietnamese
13: Thai
14: and 187 more (201 total)
modalities
0: text
1: image (input)
2: document
api endpoint: https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
open source: true
architecture: Hybrid Mixture-of-Experts with 512 experts (397B total / 17B active), natively multimodal, hybrid thinking modes
parameters: 397B total / 17B active (flagship); family spans 0.8B to 397B-A17B
knowledge cutoff: Late 2025

Use Case Ratings

code generation

Strong agentic coding that beats the 1T-parameter Qwen3-Max; 17B active params make self-hosted coding assistants economical.

customer support

201-language coverage and fast decode make it a standout for global multilingual support; smaller variants serve high-volume tiers cheaply.

content creation

Strong multilingual content with native image understanding for visually grounded writing.

data analysis

Native multimodality handles charts, tables, and documents directly; 262K context (1M on Plus) covers large datasets.

research assistant

Multimodal document understanding plus long context suits literature and mixed-media research; 19x decode speedup keeps long-context work responsive.

legal compliance

First-party hosting is China-jurisdiction; viable for regulated legal work only via self-hosting or certified Western hosts.

healthcare

No HIPAA path on first-party hosting; self-hosted deployment in compliant infrastructure is the only viable route.

financial analysis

Good quantitative reasoning with native chart/table understanding; data-residency planning required for regulated workloads.

education

201 languages, multimodal input, and a size ladder down to 0.8B make it exceptional for global and on-device education deployments.

creative writing

Capable multilingual creative output with visual grounding; prose distinctiveness behind dedicated creative leaders.