Chinese model knowledge library

Chinese AI model explainers, evaluations and hot-topic briefings

APItopic is an English-first editorial section about domestic Chinese large models. It covers model knowledge, technical explainers, hands-on evaluations and fast-moving discussion around DeepSeek, Kimi, Qwen, GLM, Doubao, MiniMax, ERNIE, Hunyuan and the rest of the China model ecosystem.

Evaluation | 8 min read | Updated 2026-05-25

DeepSeek vs Kimi vs Qwen: a daily-use comparison

The practical answer is: choose DeepSeek for code and creative writing, Kimi for papers, reports and long documents, and Qwen when office-file handling, meeting transcripts and Alibaba ecosystem convenience matter most.

Coverage

DeepSeek Chat, Kimi K2

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

Chinese model capability evaluation: GLM, DeepSeek, MiniMax, Kimi, Qwen and MiMo

The evaluation argues that leading Chinese models have entered the global first tier: GLM-5.1, DeepSeek V4 Pro, MiMo-V2.5-Pro, Kimi K2.6 and Qwen3.6 Max are compared through agentic ability, coding-agent performance, price and practical OpenClaw-style usage.

Coverage

DeepSeek Chat, Qwen Plus, Kimi K2, GLM-4 Flash

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

Chinese LLM logic benchmark: April 2026 monthly ranking

The analysis uses a personal rolling benchmark built around private Chinese tasks. It tracks logic, math, coding, instruction following and human-intuition problems, then warns readers not to worship any leaderboard without testing models against their own needs.

Coverage

DeepSeek Chat, Qwen Plus, Kimi K2, Doubao Seed

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

Well-known Chinese and global AI models and applications by category

This SmarToken catalog tracks well-known global and Chinese AI models by where the teams are based, whether the release is proprietary or open, and application categories such as general LLMs, reasoning, image generation, video, music, audio and world-generation models. It was last updated on June 1, 2026.

Coverage

DeepSeek Chat, Qwen Plus, Kimi K2, Doubao Seed

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

2025 AI model annual review: from chat assistants to productivity agents

The central point is that 2025 was the year large models moved from text assistants toward productivity agents. Reasoning became normal, long context became a basic expectation, native multimodal models replaced stitched-together toolchains, and real tests shifted from benchmark trivia to work-like tasks such as fact checking, logic, visual understanding, creative planning and code generation.

Coverage

DeepSeek Chat, Qwen Plus, Kimi K2, GLM-4 Flash

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

DeepSeek V4 Pro evaluation: scenario fit over parameter racing

DeepSeek V4 Pro is best read as a production-fit release, not only a parameter race. The headline claims are 1M context, lower long-context cost, sparse attention, a Pro and Flash split, and stronger agent/coding behavior. The hands-on results are more balanced: V4 Pro looks useful and more polished, but the right conclusion is scenario fit, not automatic victory.

Coverage

DeepSeek Chat, Qwen Plus, Kimi K2, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

DeepSeek V4 technical report: 484 days of architecture work

This page reads DeepSeek V4 through two main stories: 1M context made open and efficient, and an architecture stack built to make that possible. The key technical pieces are mHC for stable residual flow, hybrid compressed attention for long context, Muon as a main optimizer, and a training pipeline write-up that openly describes both elegant methods and messy engineering compromises.

Coverage

DeepSeek Chat, Kimi K2, Qwen Plus

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Kimi K2.6: open-source code model and agent swarm upgrade

Kimi K2.6 is positioned as Moonshot AI's strongest code and agent model so far. Its main claims are long-horizon coding, stronger web and design generation, a larger agent-swarm architecture, better autonomous operation with OpenClaw and Hermes-style frameworks, and new office-skill workflows. This page reads the release as a workflow story: code first, agent orchestration second, office productivity third.

Coverage

Kimi K2, DeepSeek Chat, Qwen Plus, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

IQuest-Coder-V1: 40B code model, Loop architecture and SWE-Bench hype

IQuest-Coder-V1 is presented as a surprising open code-model release from Ubiquant, a Beijing quantitative-investment firm. The headline is a 40B model that reports strong SWE-Bench Verified performance, supports 128K context, offers Instruct and Thinking variants, and explores a Loop architecture for better parameter use. The practical reading is cautious: the model looks important, but benchmark claims and demo cases need independent workflow testing.

Coverage

Qwen Plus, DeepSeek Chat, Kimi K2

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Qwen3.5: native multimodal agent architecture for developers

This page presents Qwen3.5 as a native multimodal agent model, starting with the Qwen3.5-397B-A17B open weights. Its main story is a hybrid architecture that combines Gated Delta Networks with sparse MoE, activates 17B parameters per forward pass out of 397B total, expands language coverage to 201 languages and scales RL environments for agent ability. This page reads it as an infrastructure release for multimodal developers, not only a chat-model update.
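
To make the sparse-activation figures concrete, the short Python sketch below computes the share of parameters used per forward pass from the 17B-active and 397B-total numbers quoted above. The function name `active_fraction` is illustrative and not part of any Qwen tooling.

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Return the fraction of parameters used per forward pass."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_params_b / total_params_b


# Figures as reported for Qwen3.5-397B-A17B, in billions of parameters.
share = active_fraction(17, 397)
print(f"{share:.1%} of parameters active per forward pass")  # prints 4.3%
```

Roughly one parameter in twenty-three is active for any given token, which is why per-token compute can sit far below what the total parameter count suggests, even though serving generally still needs access to the full set of weights.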

Coverage

Qwen Plus, Kimi K2, DeepSeek Chat, Doubao Seed

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

Best free AI writing tools in 2026: release-focused comparison

This page compares six free or free-tier AI writing tools by writing quality, long-document handling, freshness, workflow fit and document output. Its useful conclusion is scenario-based: Doubao is frictionless for daily copy, Kimi is strongest when you upload reference material, Qwen feels natural for Chinese workplace writing, ERNIE helps when freshness matters, Tencent Yuanbao is useful for deeper reasoning drafts, and EasyClaw adds document formatting after writing.

Coverage

Doubao Seed, Kimi K2, Qwen Plus, ERNIE 4 Turbo

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

MiniMax M2.7 evaluation: self-evolution and engineering delivery

The central point is that MiniMax M2.7 is no longer just a generation model. Its main claim is self-evolution: analyze failed paths, plan changes, execute, verify and iterate. The hands-on tests show better engineering completeness than M2.5 in logic, SVG generation, Three.js simulation and system-style UI tasks. This page reads it as a low-cost first-tier candidate that still needs task-specific verification.

Coverage

MiniMax M2.7, DeepSeek Chat, Qwen Plus, Kimi K2

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Doubao token usage: 120 trillion daily tokens and the AI cloud war

Doubao's 120 trillion daily tokens are treated as a signal that AI has moved from chat demos to real cloud consumption. The source attributes the surge mainly to AI video generation and agent workflows, where tool calls, multimodal inputs and long-running tasks burn far more tokens than simple chat. This page reads the release as a token-economy analysis: token volume is becoming a cloud usage metric, not only a model-side billing unit.
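
For a sense of scale, the arithmetic below converts the reported daily figure into an average per-second rate. It assumes load is spread evenly across the day, which real traffic is not, so treat the result as an average rather than a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

daily_tokens = 120e12  # 120 trillion tokens per day, as reported
tokens_per_second = daily_tokens / SECONDS_PER_DAY

print(f"{tokens_per_second:.2e} tokens per second on average")  # prints 1.39e+09
```

That is on the order of a billion tokens every second, which helps explain why the figure is read as a cloud-capacity signal rather than a chat-usage statistic.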

Coverage

Doubao Seed, Hunyuan TurboS, Qwen Plus, DeepSeek Chat

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Zhipu Qingyan GLM-4 review: free AI assistant and agent workflow

This page reviews Zhipu Qingyan as a free AI assistant built on GLM-4. Its useful structure is practical: text writing, logic, math, coding, fresh search, long-document reading, image generation and custom agents. The review is treated as a workflow guide rather than a universal 'best tool' claim: Qingyan is most interesting where Chinese-language work, document analysis and low-barrier agent creation meet.

Coverage

GLM-4 Flash, Qwen Plus, Kimi K2, Doubao Seed

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

StepFun Step 3.5 Flash: speed, funding and AI-terminal strategy

The central point is that StepFun has entered China's first-tier AI model race through three signals: Step 3.5 Flash is fast enough for agent workloads, the company has major new financing and leadership depth, and its commercial strategy focuses on native multimodal AI for terminals such as phones and cars. This page reads the release as a strategy brief, not only a model benchmark report.

Coverage

Qwen Plus, Doubao Seed, GLM-4 Flash, Kimi K2

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Kimi K2.6 architecture: native multimodal agent and open deployment

This page reads Kimi K2.6 from the architecture and open-deployment side. It highlights a trillion-parameter MoE design with 32B active parameters per pass, 256K context, MoonViT visual encoding, native multimodal fusion, INT4 quantization, thinking and instant modes, API compatibility and deployment through vLLM or SGLang. It differs from the Kimi release page by focusing on how K2.6 is built and deployed.

Coverage

Kimi K2, Qwen Plus, DeepSeek Chat, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

MiniMax M2.7: agent harnesses, SRE tasks and self-evolution

This page frames MiniMax M2.7 as a cowork-agent release rather than a normal chat-model update. Its strongest themes are instruction following across many skills, native multi-agent teams, SRE-style debugging, Office workflow execution, role-play memory and the ability to build or improve its own agent harness. This page reads the release as a shift from using tools to shaping the tool environment itself.

Coverage

MiniMax M2.7, Kimi K2, DeepSeek Chat, Qwen Plus

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Elephant Alpha: a 100B token-efficient work model from Inclusion AI

This page presents the mysterious Elephant Alpha model as coming from Ant Group's Inclusion AI team. It describes a 100B model with a 256K context window and 32K output that is optimized for fast, concise work. The hands-on tests emphasize bug fixing, meeting-summary extraction and lightweight agent loops. This page reads Elephant as a useful reminder that token efficiency can be a product feature, not only a cost metric.

Coverage

Qwen Plus, DeepSeek Chat, Kimi K2, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Tencent Hunyuan Hy3 preview: agent rebuild, fast-slow thinking and real-world gaps

This page presents Hunyuan Hy3 preview as Tencent's first model answer after Shunyu Yao rebuilt the Hunyuan research system. It is a 295B-total-parameter MoE model with 21B active parameters, a 256K context window and a fast-slow thinking design aimed at agents. The hands-on tests are balanced: Hy3 preview shows clear ReAct-style planning and tool routing, but still struggles with data reliability and complete final deliverables.

Coverage

Hunyuan TurboS, Qwen Plus, DeepSeek Chat, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

DigitalOcean DeepSeek V3.2 inference speed: what the engineering claims mean

DigitalOcean Serverless Inference reached very high output speed for DeepSeek V3.2 on Artificial Analysis, with 230 tokens per second at 10K input tokens and sub-second TTFT. The useful reading is engineering, not just leaderboard heat: hardware, NVFP4 quantization, vLLM tuning, kernel fusion, speculative decoding and customer workload economics all have to work together.
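
To translate the headline speed into user-facing latency, the sketch below uses a deliberately simple model: response time is time-to-first-token plus output length divided by decoding speed. Only the 230 tokens per second comes from the summary above. The 1.0 s TTFT (an upper bound for "sub-second") and the 1,000-token answer are assumptions, and the helper name is illustrative.

```python
def estimated_response_seconds(
    output_tokens: int,
    tokens_per_second: float,
    ttft_seconds: float,
) -> float:
    """Estimate wall-clock time as time-to-first-token plus steady decoding time."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


# 230 tokens/s is the reported output speed.
# The 1.0 s TTFT and the 1,000-token answer length are assumed for illustration.
seconds = estimated_response_seconds(1_000, 230.0, 1.0)
print(f"about {seconds:.1f} s for a 1,000-token answer")  # prints about 5.3 s
```

The model ignores queueing, network time and any variation in decoding speed, so it is a lower-bound style estimate rather than a service-level prediction.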

Coverage

DeepSeek Chat, Qwen Plus, MiniMax M2.7, Kimi K2

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

DeepSeek V4: Flash, Pro, 1M context and open infrastructure

DeepSeek V4 is positioned as an infrastructure-model release. Both V4-Flash and V4-Pro are described as supporting 1M context, while Flash targets low-latency high-frequency use and Pro targets stronger reasoning, coding and agent tasks. The practical takeaway is route design: use Flash for cheap fast calls, Pro for high-value work and verify long-context grounding before replacing RAG.
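
The route-design takeaway can be sketched as a small routing rule. Everything below is a hypothetical illustration: the `Request` fields, the `HEAVY_TASKS` set and the `"flash"` / `"pro"` labels are placeholders for whatever tiers a deployment exposes, not DeepSeek API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    task: str                # e.g. "chat", "coding", "agent", "reasoning"
    latency_sensitive: bool  # caller needs a fast answer
    high_value: bool         # a wrong answer would be costly


# Task types the summary associates with the Pro tier.
HEAVY_TASKS = {"reasoning", "coding", "agent"}


def pick_tier(req: Request) -> str:
    """Send high-value or heavy, non-urgent work to Pro; everything else to Flash."""
    if req.high_value or (req.task in HEAVY_TASKS and not req.latency_sensitive):
        return "pro"
    return "flash"


print(pick_tier(Request("chat", latency_sensitive=True, high_value=False)))
print(pick_tier(Request("coding", latency_sensitive=False, high_value=False)))
```

The first call routes to `flash` and the second to `pro`. A real router would also weigh price and measured long-context grounding, which is why the summary recommends verification before replacing RAG.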

Coverage

DeepSeek Chat, Qwen Plus, Kimi K2, Hunyuan TurboS

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Kimi K2.5: vision, code, Office skills and agent clusters

Kimi K2.5 is presented as Moonshot's most versatile open model at that point: native vision and text input, thinking and non-thinking modes, code generation, Office skills and an experimental Agent cluster mode. This page reads K2.5 as a bridge release between single-agent Kimi workflows and later larger agent-swarm releases.

Coverage

Kimi K2, Qwen Plus, DeepSeek Chat, GLM-4 Flash

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

DeepSeek and Kimi: how China's open models are compounding

The central point is that DeepSeek and Kimi are no longer isolated success stories. Their open model releases, architecture choices and citations are starting to compound: Kimi uses DeepSeek-style MLA, DeepSeek V4 uses Muon ideas validated at scale by Kimi, and both are pushing long context, KV-cache engineering and domestic hardware paths. This page reads the piece as an open-source ecosystem story, not only a rivalry.

Coverage

DeepSeek Chat, Kimi K2, Qwen Plus, MiniMax M2.7

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

Zhipu Qingyan vs KimiChat: a workplace assistant reading

This page is enthusiastic about Zhipu Qingyan as a workplace-friendly Chinese AI assistant. It compares the product landscape loosely, then focuses on Qingyan's agent builder, image generation, long-document reading, data analysis and web search. The page keeps the practical workflow view while adding a freshness caution because the material was first published in 2024 and product features have likely changed.

Coverage

GLM-4 Flash, Kimi K2, Qwen Plus, Doubao Seed

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Zhipu GLM-5 Scaling Pain: KV cache, speculative decoding and agent serving

This page summarizes Zhipu's unusually candid technical post about GLM-5 serving failures under high-load coding-agent traffic. The problem was not simple model quality. It involved inference-state management: KV-cache races in PD-disaggregated serving, read-before-ready timing in HiCache and monitoring signals from speculative decoding. This page reads it as a reminder that scaling intelligence also means scaling the serving system.

Coverage

GLM-4 Flash, DeepSeek Chat, Qwen Plus, MiniMax M2.7

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Doubao Seed 2.0: multimodal understanding, agent work and coding

This page frames Doubao Seed 2.0 as ByteDance's major 2.0 model step after strong visual releases such as Seedance 2.0 and Seedream 5.0 Lite. The main claims are stronger multimodal understanding, enterprise-grade agent skills, coding, math and more efficient reasoning. This page reads the release as a workflow upgrade story: from consumer visual fun to production-shaped coding, data and agent tasks.

Coverage

Doubao Seed, Qwen Plus, Kimi K2, DeepSeek Chat

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Alibaba Cloud Bailian model catalog: what the platform covers

This is a catalog-style overview of what Alibaba Cloud Bailian supports. It lists Qwen, Wanxiang, DeepSeek, Kimi, GLM, Llama, Baichuan and MiniMax-style access, then groups capabilities by text generation, multimodal, image, speech, video, embeddings and industry models. This page reads it as a platform taxonomy rather than a model ranking.

Coverage

Qwen Plus, DeepSeek Chat, Kimi K2, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

ZStack AIOS and DeepSeek V4: private deployment for enterprise AI

ZStack AIOS supports DeepSeek V4-Pro and V4-Flash for private deployment, including domestic AI-chip support and enterprise controls. This page reads the release as a private-AI deployment checklist: compute scheduling, model serving, long-context optimization, RAG, operations, multi-tenancy and compliance matter as much as model capability.

Coverage

DeepSeek Chat, Hunyuan TurboS, Qwen Plus, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Qwen3.6-35B-A3B: a sparse MoE coding agent model

Qwen3.6-35B-A3B is presented as a small-active-parameter MoE model aimed at agentic coding and multimodal tasks. It has 35B total parameters and about 3B active parameters, supports thinking and non-thinking modes, and is released through Qwen Studio, Hugging Face, ModelScope and Bailian API as qwen3.6-flash. This page reads it as an efficiency story: sparse activation can make strong coding agents cheaper to run.

Coverage

Qwen Plus, DeepSeek Chat, Kimi K2, GLM-4 Flash

Read article ->

Evaluation | 8 min read | Updated 2026-05-25

DeepSeek V4 as a strategic threat: open models, cost and control

The central point is that DeepSeek V4 is strategically important because it is open, close to frontier capability and much cheaper than leading closed models in many enterprise scenarios. This page reads it as a cost-and-control problem: if companies can run or fine-tune a strong Chinese open model, closed-model vendors must compete on price, capability, trust and deployment control at the same time.

Coverage

DeepSeek Chat, Kimi K2, Qwen Plus, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

Qwen3.5-Omni: all-modal audio, video and vibe coding

Qwen3.5-Omni is presented as an all-modal model for text, image, audio, video, speech and real-time interaction. It highlights 215 reported SOTA tasks, 113-language speech recognition, 36-language speech generation, long audio/video understanding and audio-video vibe coding. This page reads it as a workflow-expansion release: voice, camera and video become direct inputs for code, content operations and enterprise assistants.

Coverage

Qwen Plus, Kimi K2, DeepSeek Chat, Doubao Seed

Read article ->

Technical tutorial | 7 min read | Updated 2026-05-25

Tencent Yuanqi: a zero-code asset inventory agent workflow

A practical no-code agent workflow: a user uploads an asset barcode image, Tencent Yuanqi extracts the image URL, an image-understanding plugin reads the barcode, Hunyuan Turbo extracts the asset number, and a knowledge base returns asset information. It is treated as a field-operations example: agent builders become useful when they connect model perception, structured extraction, internal data and a familiar mobile entry point.
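
Although the tutorial itself is zero-code, the data flow it describes can be sketched in a few lines of Python. Every name below is a hypothetical stand-in: `read_barcode` stubs the image-understanding plugin, a regular expression replaces the Hunyuan Turbo extraction step, an in-memory dictionary plays the knowledge base, and the asset-number format `A-10023` is invented for the example.

```python
import re
from typing import Optional

# Hypothetical knowledge base keyed by asset number.
ASSET_DB = {"A-10023": {"name": "Laptop", "location": "Room 301"}}


def read_barcode(image_url: str) -> str:
    """Stub for the image-understanding plugin: returns text read from the image."""
    return "Asset No: A-10023 (scanned)"


def extract_asset_number(raw_text: str) -> Optional[str]:
    """Stand-in for the model extraction step, using an assumed number format."""
    match = re.search(r"\b[A-Z]-\d{5}\b", raw_text)
    return match.group(0) if match else None


def lookup_asset(asset_number: str) -> Optional[dict]:
    """Stand-in for the knowledge-base query."""
    return ASSET_DB.get(asset_number)


def run_workflow(image_url: str) -> Optional[dict]:
    """Chain the steps: image URL -> barcode text -> asset number -> asset record."""
    raw_text = read_barcode(image_url)
    asset_number = extract_asset_number(raw_text)
    if asset_number is None:
        return None  # nothing recognizable in the image
    return lookup_asset(asset_number)


print(run_workflow("https://example.com/barcode.png"))
```

The point of the sketch is the shape of the chain: each step narrows unstructured input into a structured key, and the final lookup only succeeds if every earlier step produced something usable.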

Coverage

Hunyuan TurboS, DeepSeek Chat, Qwen Plus, GLM-4 Flash

Read article ->

Model explainer | 7 min read | Updated 2026-05-25

GLM-4.7-Flash: MLA, 3B active parameters and local agent use

GLM-4.7-Flash is presented as a lightweight open model for local coding and agent assistants: 30B total parameters, about 3B active parameters, 200K context and first-time GLM use of DeepSeek-style MLA. This page reads it as an efficiency and deployment story: small-active MoE plus MLA can make local and low-cost agent workflows more realistic, but throughput, latency and current pricing still need verification.

Coverage

GLM-4 Flash, DeepSeek Chat, Qwen Plus, Kimi K2

Read article ->
