
February 17, 2026

Qwen3.5-397B-A17B: Everything You Need to Know

Alibaba's new Qwen3.5-397B-A17B is the #3 open weights model in the Artificial Analysis Intelligence Index - a significant upgrade from Qwen3-235B-A22B-2507

Qwen3.5-397B-A17B is the first model released by Alibaba under the new Qwen3.5 family. It scores 45 on the Artificial Analysis Intelligence Index, ranking #3 among open weights models behind GLM-5 (Reasoning, 50) and Kimi K2.5 (Reasoning, 47). It has 397B/17B total/active parameters, significantly lower than peer models such as Kimi K2.5 (1T/32B), GLM-5 (744B/40B) and DeepSeek V3.2 (671B/37B).

Qwen3.5 397B is also the first Qwen open weights model with native vision input, supporting image and video inputs. Previously, Alibaba maintained separate model lines for vision (Qwen3-VL) and text-only (Qwen3). Qwen3.5 397B unifies these into a single model, following the broader industry trend toward natively multimodal foundation models.

Additionally, Qwen3.5 397B supports both reasoning and non-reasoning modes within a single model - a reversal from the Qwen3 family, where Alibaba released separate instruct and thinking variants.

Key takeaways from our independent benchmarking:

➤ 🧠 Intelligence gains driven by improved agentic performance: Qwen3.5 397B scores 45 on our Intelligence Index, a +16 point gain over the previous open weights Qwen3 235B (Reasoning, 29). Qwen3.5 397B achieves a GDPval-AA ELO of 1,221, a significant increase of 361 points over Qwen3 235B (860). GDPval-AA is a frontier agentic eval that compares model outputs on realistic knowledge work tasks such as preparing presentations and analyses. Qwen3.5 397B also improves over Qwen3 235B across agentic coding (+27 p.p. on TerminalBench Hard), scientific reasoning (+12 p.p. on HLE) and instruction following (+28 p.p. on IFBench).

➤ 📉 Hallucination remains higher than peers: Qwen3.5 397B's AA-Omniscience Index is -32, a 16-point improvement over Qwen3 235B (-48), driven primarily by higher accuracy (30% vs 22%) rather than a reduction in hallucination rate (88% vs 90%). The model still has a high hallucination rate relative to leading open weights models. We measure hallucination rate as how often the model answers a question when it should have refused or admitted to not knowing the answer. Kimi K2.5 and GLM-5 achieve an AA-Omniscience Index of -11 and -1 respectively.

➤ đŸĒ™ Slightly more token efficient than peers: Qwen3.5 397B used ~86M output tokens (including ~80M reasoning tokens) to run the Intelligence Index - more than Qwen3 235B (63M), but fewer than Kimi K2.5 (89M) and GLM-5 (110M).

Key Model Details:

  • 📏 Context window: 262K tokens

  • âš™ī¸ Size: 397B total / 17B active parameters (MoE). Fewer active parameters than Qwen3 235B (22B active), GLM-4.7 (32B active) and Kimi K2.5 (32B active).

  • ÂŠī¸ License: Apache 2.0.

  • 🌐 Availability: Qwen3.5 397B is available in Qwen Chat and via Alibaba's first-party API. Alibaba also offers Qwen3.5-Plus, a hosted variant with a 1M token context window and built-in tool use. No third-party API providers serve the model at the time of publishing. Weights are available on Hugging Face.

![Intelligence Index](Intelligence Index.png)

At 17B active parameters, Qwen3.5 397B is on the frontier of the Intelligence vs. Active Parameters chart

![Intelligence vs Active Parameters](Intelligence vs Active Parameters.png)

Based on Alibaba Cloud's per-token pricing ($0.60/$3.60 per 1M input/output tokens) and the number of tokens used to run the Intelligence Index, Qwen3.5 397B sits close to the Pareto frontier on the Intelligence vs. Cost to Run the Intelligence Index chart

![Intelligence vs Cost](Intelligence vs Cost.png)
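As a rough illustration of the cost arithmetic above - a sketch only, since input token counts are not reported here and so only the output side is estimated:

```python
# Sketch: output-side cost of running the Intelligence Index, using
# Alibaba Cloud's published output price. Input-side cost is omitted
# because the input token count is not given in this analysis.
PRICE_PER_1M_OUTPUT = 3.60       # USD per 1M output tokens
OUTPUT_TOKENS = 86_000_000       # ~86M output tokens used across the Index

output_cost = OUTPUT_TOKENS / 1_000_000 * PRICE_PER_1M_OUTPUT
print(f"Output-side cost: ${output_cost:.2f}")  # ~$309.60
```

The same arithmetic applied to Kimi K2.5 (~89M tokens) or GLM-5 (~110M tokens) at their respective prices is what places each model on the cost axis of the chart.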

Qwen3.5 397B performs significantly better than Qwen3 235B on GDPval-AA, an eval that measures model performance on real-world agentic tasks

![GDPval-AA](GDPval-AA.png)

Qwen3.5 397B improves on the AA-Omniscience Index compared to Qwen3 235B, with higher accuracy but limited improvement in hallucination rate

![Omniscience](Omniscience.png)
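The AA-Omniscience figures reported above are mutually consistent under a simple scoring scheme: +1 per correct answer, −1 per incorrect (hallucinated) answer, 0 per abstention. The formula below is an assumption inferred from the reported numbers, not a confirmed description of the official methodology:

```python
def omniscience_index(accuracy: float, hallucination_rate: float) -> int:
    """Assumed scoring: +1 correct, -1 incorrect, 0 abstained, in percentage points.

    hallucination_rate is the share of not-correctly-answered questions the
    model attempted anyway instead of refusing or admitting uncertainty.
    """
    incorrect = hallucination_rate * (1 - accuracy)  # attempted-and-wrong share
    return round(100 * (accuracy - incorrect))

print(omniscience_index(0.30, 0.88))  # Qwen3.5 397B -> -32
print(omniscience_index(0.22, 0.90))  # Qwen3 235B   -> -48
```

This makes the dynamic in the takeaway above concrete: Qwen3.5 397B's 16-point index gain comes almost entirely from the accuracy term, since the hallucination rate barely moved (88% vs 90%).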

Check out additional analysis for this model on X: https://x.com/ArtificialAnlys/status/2023794497055060262?s=20

Explore the full suite of benchmarks at https://artificialanalysis.ai/