GLM-5.2 Open Weights Released: 753B MoE Model Nears Claude Opus 4.8 in Coding
After making GLM-5.2 available to GLM Coding Plan subscribers on June 13, Z.ai officially released the full model weights on HuggingFace and ModelScope on June 17 under the MIT license. The release also came with the first official benchmark table, which had been withheld at launch — and the numbers show a significant jump over May's GLM-5.1.
Model Specs
GLM-5.2 is a 753B-parameter mixture-of-experts (MoE) model with 44B parameters active per token. The context window grows from GLM-5.1's 200K to 1M tokens, with a 128K output limit. The key architectural innovation is IndexShare: every 4 Transformer layers share a lightweight indexer, which reduces per-token FLOPs by 2.9x at 1M context length. The MTP (multi-token prediction) layer is also improved, increasing speculative decoding acceptance length by 20%.
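To make these figures concrete, here is a minimal Python sketch of two things: the share of parameters that is active at any one time, and how layers could map to shared indexers when every 4 layers use the same one. The function names and the layer counts passed in are hypothetical, since the article does not state the model's depth or how IndexShare is implemented.

```python
# Figures quoted in the article.
TOTAL_PARAMS_B = 753      # total parameters, in billions
ACTIVE_PARAMS_B = 44      # active parameters, in billions
LAYERS_PER_INDEXER = 4    # IndexShare: 4 Transformer layers share one indexer


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of the model's parameters that are active."""
    return active_b / total_b


def indexer_group(layer_idx: int, group_size: int = LAYERS_PER_INDEXER) -> int:
    """Index of the shared indexer used by a given layer (illustrative mapping only)."""
    return layer_idx // group_size


def indexer_count(num_layers: int, group_size: int = LAYERS_PER_INDEXER) -> int:
    """Number of indexers needed for num_layers layers (ceiling division)."""
    return -(-num_layers // group_size)


if __name__ == "__main__":
    pct = 100 * active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active share of parameters: {pct:.1f}%")
    # The layer count below is a made-up value, used only to show the mapping.
    hypothetical_layers = 8
    print([indexer_group(i) for i in range(hypothetical_layers)])
    print(indexer_count(hypothetical_layers))
```

Under this mapping, a stack of N layers needs roughly N/4 indexers instead of N. How that sharing turns into the reported 2.9x FLOPs reduction depends on details the article does not give.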
The model introduces three-tier effort control: Lite, Pro, and Max. Lite provides fast responses for simple tasks, while Max allocates more compute for complex coding work — a step up from the previous binary thinking/standard modes.
Benchmarks: New Open-Source Ceiling
The benchmark table covers three long-horizon coding evaluations and four standard coding benchmarks:
Long-Horizon Coding (Hours-Level)
| Benchmark | GLM-5.2 | Claude Opus 4.8 | GPT-5.5 |
|---|---|---|---|
| FrontierSWE | 74.4 | 75.1 | 73.4 |
| PostTrainBench | 34.3 | #1 | Below GLM-5.2 |
| SWE-Marathon | 13.0 | #1 | — |
FrontierSWE measures a model's ability to complete open-ended technical projects spanning hours. GLM-5.2 trails Opus 4.8 by just 0.7 points (74.4 vs 75.1), beats GPT-5.5 by 1 point, and surpasses Opus 4.7 by 11 points. On PostTrainBench, GLM-5.2 ranks second, outperforming both Opus 4.7 and GPT-5.5. SWE-Marathon shows a larger gap, about 13 points behind Opus 4.8, but GLM-5.2 remains the highest-ranked open-source model.
Standard Coding
| Benchmark | GLM-5.2 | GLM-5.1 | Claude Opus 4.8 |
|---|---|---|---|
| Terminal-Bench 2.1 | 81.0 | 63.5 | 85.0 |
| SWE-bench Pro | 62.1 | 58.4 | 69.2 |
| FrontierSWE | 74.4 | 30.5 | 75.1 |
| MCP-Atlas | 76.8 | 71.8 | 77.8 |
Against Claude Opus 4.8, GLM-5.2 trails by 4 points on Terminal-Bench 2.1 (81.0 vs 85.0), by about 7 points on SWE-bench Pro (62.1 vs 69.2), and by 1 point on MCP-Atlas (76.8 vs 77.8). For an MIT-licensed model, this is remarkably competitive.
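The gaps can be recomputed directly from the table. The short Python sketch below copies the scores from the Standard Coding table and prints, for each benchmark, GLM-5.2's gain over GLM-5.1 and its remaining gap to Claude Opus 4.8. The dictionary layout and function names are only illustrative.

```python
# Scores copied from the Standard Coding table: (GLM-5.2, GLM-5.1, Claude Opus 4.8).
STANDARD_CODING = {
    "Terminal-Bench 2.1": (81.0, 63.5, 85.0),
    "SWE-bench Pro": (62.1, 58.4, 69.2),
    "FrontierSWE": (74.4, 30.5, 75.1),
    "MCP-Atlas": (76.8, 71.8, 77.8),
}


def gain_over_previous(benchmark: str) -> float:
    """Points gained by GLM-5.2 over GLM-5.1 on a benchmark."""
    glm52, glm51, _ = STANDARD_CODING[benchmark]
    return round(glm52 - glm51, 1)


def gap_to_opus(benchmark: str) -> float:
    """Points by which GLM-5.2 trails Claude Opus 4.8 on a benchmark."""
    glm52, _, opus = STANDARD_CODING[benchmark]
    return round(opus - glm52, 1)


if __name__ == "__main__":
    for name in STANDARD_CODING:
        print(f"{name}: +{gain_over_previous(name)} vs GLM-5.1, "
              f"-{gap_to_opus(name)} vs Opus 4.8")
```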
The Pony Alpha Incident
Before the official release, Z.ai anonymously posted GLM-5.2 on OpenRouter under the codename "Pony Alpha." In blind community testing, 25% of respondents guessed it was Claude Sonnet 5, 20% thought it was a new Grok version, and only a few correctly identified GLM-5. The result suggests that, without brand labels, users rate GLM-5.2's capabilities on par with closed-source frontier models.
API Pricing
GLM-5 is priced at $0.60 (approx. ¥4.08) per million input tokens and $1.92 (approx. ¥13.06) per million output tokens. GLM-5.2's API pricing is expected to be similar. For comparison, Claude Opus 4.8 costs roughly $15 (approx. ¥102) per million input tokens, about 25x the GLM-5 input price.
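The arithmetic behind these prices is easy to check. The Python sketch below uses the GLM-5 prices quoted above to estimate the cost of a workload and to recompute the input-price ratio against Claude Opus 4.8. The workload size is a made-up example, and GLM-5.2's actual pricing may differ from GLM-5's.

```python
# Prices in USD per million tokens, as quoted in the article.
GLM5_INPUT_USD_PER_M = 0.60
GLM5_OUTPUT_USD_PER_M = 1.92
OPUS_INPUT_USD_PER_M = 15.0  # approximate figure from the article


def cost_usd(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    """Total cost in USD for a given number of input and output tokens."""
    return (input_tokens / 1e6) * input_price + (output_tokens / 1e6) * output_price


def input_price_ratio() -> float:
    """How many times more Opus 4.8 costs per input token than GLM-5."""
    return OPUS_INPUT_USD_PER_M / GLM5_INPUT_USD_PER_M


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    total = cost_usd(2_000_000, 500_000,
                     GLM5_INPUT_USD_PER_M, GLM5_OUTPUT_USD_PER_M)
    print(f"GLM-5 cost for the example workload: ${total:.2f}")
    print(f"Input price ratio (Opus 4.8 / GLM-5): {input_price_ratio():.0f}x")
```

The comparison covers input tokens only, because the article does not quote an output price for Opus 4.8.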
Z.ai also launched ZCode 3.0, providing GLM Coding Plan users with 3 million free tokens per day.
Deployment
Weights are available on HuggingFace (zai-org/GLM-5.2) and ModelScope. The full-precision model requires approximately 1.5TB of GPU memory; quantization lowers the barrier significantly. Supported inference frameworks include SGLang, vLLM, and xLLM.
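The 1.5TB figure matches a back-of-the-envelope estimate if "full precision" means 16-bit weights (2 bytes per parameter). That reading is an assumption, not something the article states. The Python sketch below estimates weight memory at several precisions and the minimum number of accelerators needed to hold the weights. The 80 GB per-device capacity is a hypothetical value, and the estimate leaves out KV cache, activations, and framework overhead.

```python
import math

TOTAL_PARAMS = 753e9  # total parameter count from the article

# Bytes per parameter for common weight formats (the 16-bit reading is assumed).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_tb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in terabytes (1 TB = 1e12 bytes)."""
    return params * bytes_per_param / 1e12


def min_devices(params: float, bytes_per_param: float, device_gb: float = 80.0) -> int:
    """Minimum device count to hold the weights; device_gb is a hypothetical capacity."""
    return math.ceil(params * bytes_per_param / (device_gb * 1e9))


if __name__ == "__main__":
    for fmt, nbytes in BYTES_PER_PARAM.items():
        tb = weight_memory_tb(TOTAL_PARAMS, nbytes)
        n = min_devices(TOTAL_PARAMS, nbytes)
        print(f"{fmt}: ~{tb:.2f} TB of weights, at least {n} x 80 GB devices")
```

Real deployments need more headroom than this lower bound, especially at long context lengths where the KV cache grows large.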
Competitive Landscape
| Model | Parameters | Context | FrontierSWE | Terminal-Bench | Open Source |
|---|---|---|---|---|---|
| GLM-5.2 | 753B MoE | 1M | 74.4 | 81.0 | MIT |
| Claude Opus 4.8 | Undisclosed | 1M | 75.1 | 85.0 | No |
| GPT-5.5 | Undisclosed | Undisclosed | 73.4 | — | No |
| Kimi K2.7 Code | Undisclosed | 256K | — | — | Yes |
GLM-5.2 is now the ceiling for open-source coding models. It doesn't surpass Opus 4.8, but it beats GPT-5.5 on FrontierSWE, and it is MIT-licensed. For enterprises that need local deployment, this is the strongest open-source option available today.




