Qwen3.6-27B Outperforms 397B Model at a Fraction of the Size, Runs Locally at 16.8GB

Apr 23, 2026
Simon Willison’s Weblog

Summary

Qwen's new 27B model outperforms a 397B model across all major coding benchmarks while running locally on consumer hardware in just 16.8GB of memory, generating complex code at roughly 25 tokens per second.

Key Points

  • Qwen releases Qwen3.6-27B, a dense 27B model claiming flagship-level agentic coding performance that surpasses the much larger Qwen3.5-397B-A17B across all major coding benchmarks.
  • The new model's weights total just 55.6GB, compared to 807GB for the previous flagship, and a quantized Q4_K_M version from Unsloth runs locally in only 16.8GB using llama.cpp.
  • Testing on a local machine shows impressive results, with the quantized model generating complex SVG images at roughly 25 tokens per second, demonstrating strong creative and coding capabilities for its size.
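For readers who want to try the local setup described above, a minimal sketch of running a Q4_K_M GGUF quant with llama.cpp's `llama-cli` follows. The Hugging Face repo name and quant tag below are assumptions based on Unsloth's usual naming conventions; check the actual release for the correct identifiers.

```shell
# Sketch: run a Q4_K_M quant locally with llama.cpp.
# ASSUMPTION: the repo "unsloth/Qwen3.6-27B-GGUF" and tag "Q4_K_M"
# are illustrative; substitute the real GGUF release.
llama-cli \
  -hf unsloth/Qwen3.6-27B-GGUF:Q4_K_M \
  -p "Generate an SVG of a pelican riding a bicycle" \
  -n 512
```

The `-hf` flag tells llama.cpp to download the model directly from Hugging Face, caching it locally; `-n` caps the number of tokens generated.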
