Z.ai (Zhipu AI): GLM-5.2, MIT Open Weights & $6.55B IPO
Z.ai (formerly Zhipu AI) builds the MIT-licensed GLM model family. GLM-5.2: 744B MoE, 1M-token context, $1.40/1M input. China's first listed frontier AI lab, IPO'd Jan 2026 at $6.55B.
Z.ai, formerly Zhipu AI, is a Beijing-based AI company founded in 2019 from Tsinghua University that builds the open-source GLM large language model family. It became China's first publicly listed frontier AI company on the Hong Kong Stock Exchange in January 2026 at a $6.55 billion valuation. Its flagship model GLM-5.2 features a 1-million-token context window, 744B MoE architecture, and MIT license at $1.40 per million input tokens.
Founded: 2019 · HQ: Beijing, China · Team: 1000+ · CEO: Tang Jie · Funding: $1.996B total (IPO Jan 2026: $558M at $6.55B valuation) · Valuation: $6.55B (Hong Kong Stock Exchange IPO, January 2026)
About Z.ai
Z.ai is an artificial intelligence company founded in 2019 by researchers from Tsinghua University's Knowledge Engineering Group in Beijing, China. The company initially operated under the name Zhipu AI and rebranded internationally to Z.ai following its IPO. On January 8, 2026, Z.ai debuted on the Hong Kong Stock Exchange, raising approximately $558 million at a valuation of $6.55 billion, becoming China's first publicly listed frontier AI company. The company's flagship product line is the GLM (General Language Model) family of large language models, which trace their lineage back to the GLM-130B model released in 2022. Z.ai has steadily scaled the architecture through GLM-4, GLM-4.5, and the GLM-5 generation, culminating in GLM-5.2 released June 13, 2026. The GLM-5 family is built on a 744-billion-parameter Mixture-of-Experts architecture with 40 billion active parameters per forward pass, trained on 28.5 trillion tokens. Since July 2025, Z.ai has released every GLM flagship model under the MIT license, allowing unrestricted commercial use, modification, and redistribution. Z.ai offers model access through three channels: the Z.ai consumer platform at z.ai, the Z.ai developer API at docs.z.ai, and the GLM Coding Plan subscription service offering dedicated coding-first access across Lite, Pro, Max, and Team tiers. The developer API is OpenAI-compatible, which simplifies integration for teams already using the OpenAI SDK. GLM-5.2 is also available on Fireworks AI, OpenRouter, AWS Bedrock (via bedrock/us-east-1/zai.glm-5), and Google Vertex AI (via vertex_ai/zai-org). The company has raised approximately $1.996 billion in total funding across multiple rounds prior to its IPO. This capital has supported the development of large-scale training infrastructure, a research team of approximately 2,000 employees primarily based in Beijing, and international expansion efforts. Key investors include state-backed funds and strategic technology partners in China. Z.ai's research philosophy centers on the intersection of academic research and product deployment. The company maintains close ties to Tsinghua University and has co-authored multiple peer-reviewed papers on large language model architectures, training methodologies, and evaluation frameworks. The GLM-5 technical report, titled "GLM-5: From Vibe Coding to Agentic Engineering," was published on arXiv in February 2026 and documented the architecture, training process, and benchmark evaluations in detail. On safety, Z.ai publishes model cards for its flagship releases and includes alignment training through supervised fine-tuning and reinforcement learning from human feedback. The company has not published a formal responsible scaling policy analogous to Anthropic's RSP, but its MIT licensing strategy is a meaningful openness commitment that allows external researchers to audit and evaluate model behavior. The GLM-5.2 release marked a significant technical milestone: a 1-million-token context window in an open-weights model, combined with dual thinking modes (High and Max effort) and 131,072 max output tokens. This makes GLM-5.2 the first MIT-licensed model to match or exceed proprietary frontier models on long-horizon coding tasks. On Terminal-Bench 2.1, GLM-5.2 scored 81.0, and on SWE-bench Pro it reached 62.1, both improvements of more than 3 points over the predecessor GLM-5.1. Z.ai competes directly with Qwen (Alibaba), DeepSeek (DeepSeek Ltd), MiniMax, and Baidu's ERNIE in the Chinese AI market, while internationally targeting use cases dominated by Anthropic's Claude, OpenAI's GPT series, and Google's Gemini. The company's MIT licensing strategy and OpenAI-compatible API are deliberate moves to appeal to developers who want frontier-class Chinese model performance without vendor lock-in or export risk. Looking ahead, Z.ai has indicated continued focus on coding and agentic capabilities, with the GLM roadmap targeting improvements in multi-file repository-level code editing, multi-step browser agent tasks, and expanded multimodal capabilities through the GLM-5V series. The company's position as the first Chinese frontier lab to complete a public listing gives it both capital access and the corporate governance transparency demanded by institutional investors.
Mission
To build and openly distribute frontier AI models that serve developers and enterprises worldwide, starting with the MIT-licensed GLM family.
Products
- GLM-5.2
- Z.ai Developer API
- GLM Coding Plan
- GLM-5V-Turbo
Links
Website · GitHub · Twitter · LinkedIn · Blog · Docs
Frequently Asked Questions
What is Z.ai and what does it build?
Z.ai, formerly known as Zhipu AI, is an artificial intelligence company founded in 2019 by researchers from Tsinghua University's Knowledge Engineering Group in Beijing, China. The company builds and open-sources the GLM (General Language Model) family of large language models. Its flagship model as of June 2026 is GLM-5.2, a 744-billion-parameter Mixture-of-Experts model with a 1-million-token context window released under the MIT license. Z.ai offers access through its developer API (OpenAI-compatible at $1.40/1M input tokens), the GLM Coding Plan subscription, and third-party providers including Fireworks AI, AWS Bedrock, and Google Vertex AI. The company became China's first publicly listed frontier AI company when it IPO'd on the Hong Kong Stock Exchange in January 2026 at a $6.55 billion valuation.
How does Z.ai compare to DeepSeek and Qwen?
Z.ai (GLM-5.2), DeepSeek, and Qwen (Alibaba) represent the three dominant Chinese open-weights frontier model families as of mid-2026. GLM-5.2 distinguishes itself with the largest context window of the three at 1 million tokens, compared to DeepSeek V3's 128K and Qwen3-Max's 128K contexts. On coding benchmarks, GLM-5.2 scores 62.1 on SWE-bench Pro and 81.0 on Terminal-Bench 2.1. DeepSeek R2 leads on pure reasoning benchmarks, while Qwen3-Max is stronger on multilingual non-English tasks. All three families use MIT or Apache 2.0 licenses, making them equally permissive for commercial use. Z.ai's key advantage is the 1M-context specialization and public-company transparency from its Hong Kong IPO. DeepSeek's key advantage is its extreme cost efficiency. Qwen's key advantage is breadth of model sizes from 0.5B to 235B.
Is Z.ai's GLM family truly open source?
Yes, Z.ai has released all GLM-5-generation models, including GLM-5.2, under the MIT license since July 2025. The MIT license is one of the most permissive open-source licenses: it allows free commercial use, modification, redistribution, and sublicensing without restriction. Model weights for GLM-5.2 are available for download from HuggingFace at the zai-org/GLM-5.2 repository in FP16, FP8, NVFP4, and GGUF quantization formats. VRAM requirements range from approximately 241 GB for 2-bit quantized inference to 459 GB+ for NVFP4. This makes GLM-5.2 the first MIT-licensed model with a 1-million-token context window at frontier coding performance. The openness is a deliberate competitive strategy to attract global developer adoption and differentiate from proprietary API-only competitors like Anthropic and OpenAI.
What are Z.ai's hardware requirements to self-host GLM-5.2?
Self-hosting GLM-5.2 requires significant GPU infrastructure due to its 744-billion total parameter count. For the lightest 2-bit dynamic quantization (UD-IQ2_XXS GGUF format), approximately 241 GB of VRAM or unified memory is needed, which fits on an M4 Ultra Mac Studio with 256 GB or equivalent. For 4-bit quantization (Q4_K_M), approximately 476 GB VRAM is required, necessitating multi-GPU setups such as 2x A100 80 GB or 4x RTX 6000 Ada. Full-precision NVFP4 deployment requires at minimum 6 GPUs with 96 GB each just to hold weights, with 8 GPUs recommended for KV cache headroom during 1M-context inference. vLLM is the recommended inference backend and supports tensor parallelism across multi-GPU setups. For most teams, the managed API at $1.40/1M input tokens is far more cost-effective than on-premise infrastructure unless data sovereignty requires local deployment.