Chinese model knowledge library
Chinese AI model explainers, evaluations and hot-topic briefings
APItopic is an English-first editorial section about domestic Chinese large models. It covers model knowledge, technical explainers, hands-on evaluations and fast-moving discussion around DeepSeek, Kimi, Qwen, GLM, Doubao, MiniMax, ERNIE, Hunyuan and the rest of the China model ecosystem.
DeepSeek vs Kimi vs Qwen: a daily-use comparison
The practical answer is: choose DeepSeek for code and creative writing, Kimi for papers, reports and long documents, and Qwen when office-file handling, meeting transcripts and Alibaba ecosystem convenience matter most.
Coverage
DeepSeek Chat, Kimi K2
Read article ->
Chinese model capability evaluation: GLM, DeepSeek, MiniMax, Kimi, Qwen and MiMo
The evaluation argues that leading Chinese models have entered the global first tier: GLM-5.1, DeepSeek V4 Pro, MiMo-V2.5-Pro, Kimi K2.6 and Qwen3.6 Max are compared through agentic ability, coding-agent performance, price and practical OpenClaw-style usage.
Coverage
DeepSeek Chat, Qwen Plus, Kimi K2, GLM-4 Flash
Read article ->
Chinese LLM logic benchmark: April 2026 monthly ranking
The analysis uses a personal rolling benchmark built around private Chinese tasks. It tracks logic, math, coding, instruction following and human-intuition problems, then warns readers not to worship any leaderboard without testing models against their own needs.
Coverage
DeepSeek Chat, Qwen Plus, Kimi K2, Doubao Seed
Read article ->
Well-known Chinese and global AI models and applications by category
This SmarToken catalog tracks well-known global and Chinese AI models by where the teams are based, whether the release is proprietary or open, and application categories such as general LLMs, reasoning, image generation, video, music, audio and world-generation models. It was last updated on June 1, 2026.
Coverage
DeepSeek Chat, Qwen Plus, Kimi K2, Doubao Seed
Read article ->
2025 AI model annual review: from chat assistants to productivity agents
The central point is that 2025 was the year large models moved from text assistants toward productivity agents. Reasoning became normal, long context became a basic expectation, native multimodal models replaced stitched-together toolchains, and real tests shifted from benchmark trivia to work-like tasks such as fact checking, logic, visual understanding, creative planning and code generation.
Coverage
DeepSeek Chat, Qwen Plus, Kimi K2, GLM-4 Flash
Read article ->
DeepSeek V4 Pro evaluation: scenario fit over parameter racing
DeepSeek V4 Pro is best read as a production-fit release, not only a parameter race. The headline claims are 1M context, lower long-context cost, sparse attention, a Pro and Flash split, and stronger agent/coding behavior. The hands-on results are more balanced: V4 Pro looks useful and more polished, but the right conclusion is scenario fit, not automatic victory.
Coverage
DeepSeek Chat, Qwen Plus, Kimi K2, GLM-4 Flash
Read article ->
DeepSeek V4 technical report: 484 days of architecture work
This page reads DeepSeek V4 through two main stories: 1M context made open and efficient, and an architecture stack built to make that possible. The key technical pieces are mHC for stable residual flow, hybrid compressed attention for long context, Muon as a main optimizer, and a training pipeline that openly describes both elegant methods and messy engineering compromises.
Coverage
DeepSeek Chat, Kimi K2, Qwen Plus
Read article ->
Kimi K2.6: open-source code model and agent swarm upgrade
Kimi K2.6 is positioned as Moonshot AI's strongest code and agent model so far. Its main claims are long-horizon coding, stronger web and design generation, a larger agent-swarm architecture, better autonomous operation with OpenClaw and Hermes-style frameworks, and new office-skill workflows. This page reads the release as a workflow story: code first, agent orchestration second, office productivity third.
Coverage
Kimi K2, DeepSeek Chat, Qwen Plus, GLM-4 Flash
Read article ->
IQuest-Coder-V1: 40B code model, Loop architecture and SWE-Bench hype
IQuest-Coder-V1 is presented as a surprising open code-model release from Ubiquant, a Beijing quantitative-investment firm. The headline is a 40B model that reports strong SWE-Bench Verified performance, supports 128K context, offers Instruct and Thinking variants, and explores a Loop architecture for better parameter use. The practical reading is cautious: the model looks important, but benchmark claims and demo cases need independent workflow testing.
Coverage
Qwen Plus, DeepSeek Chat, Kimi K2
Read article ->
Qwen3.5: native multimodal agent architecture for developers
This page presents Qwen3.5 as a native multimodal agent model, starting with the Qwen3.5-397B-A17B open weights. Its main story is a hybrid architecture that combines Gated Delta Networks with sparse MoE, activates 17B parameters per forward pass out of 397B total, expands language coverage to 201 languages and scales RL environments for agent ability. This page reads it as an infrastructure release for multimodal developers, not only a chat-model update.
Coverage
Qwen Plus, Kimi K2, DeepSeek Chat, Doubao Seed
Read article ->
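As a quick illustration of the sparse-MoE figures in the Qwen3.5 entry above, the following minimal sketch computes the share of parameters active per forward pass. The function name `active_fraction` is a hypothetical helper introduced here; only the 17B and 397B figures come from the summary.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters used per forward pass.

    Both arguments are parameter counts in billions.
    """
    if total_params_b <= 0 or not 0 < active_params_b <= total_params_b:
        raise ValueError("expected 0 < active_params_b <= total_params_b")
    return active_params_b / total_params_b


# Figures quoted in the summary: 17B active out of 397B total.
qwen35_share = active_fraction(397, 17)
print(f"Active share per forward pass: {qwen35_share:.1%}")
```

The ratio says nothing about quality or speed on its own; it only shows how small the per-token compute footprint is relative to the stored weights.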
Best free AI writing tools in 2026: release-focused comparison
This page compares six free or free-tier AI writing tools by writing quality, long-document handling, freshness, workflow fit and document output. Its useful conclusion is scenario-based: Doubao is frictionless for daily copy, Kimi is strongest when you upload reference material, Qwen feels natural for Chinese workplace writing, ERNIE helps when freshness matters, Tencent Yuanbao is useful for deeper reasoning drafts, and EasyClaw adds document formatting after writing.
Coverage
Doubao Seed, Kimi K2, Qwen Plus, ERNIE 4 Turbo
Read article ->
MiniMax M2.7 evaluation: self-evolution and engineering delivery
The central point is that MiniMax M2.7 is no longer just a generation model. Its main claim is self-evolution: analyze failed paths, plan changes, execute, verify and iterate. The hands-on tests show better engineering completeness than M2.5 in logic, SVG generation, Three.js simulation and system-style UI tasks. This page reads it as a low-cost first-tier candidate that still needs task-specific verification.
Coverage
MiniMax M2.7, DeepSeek Chat, Qwen Plus, Kimi K2
Read article ->
Doubao token usage: 120 trillion daily tokens and the AI cloud war
The figure of 120 trillion daily tokens is treated as a signal that AI has moved from chat demos to real cloud consumption. The analysis says the surge comes mainly from AI video generation and agent workflows, where tool calls, multimodal inputs and long-running tasks burn far more tokens than simple chat. This page reads the release as a token-economy analysis: token volume is becoming a cloud usage metric, not only a model-side billing unit.
Coverage
Doubao Seed, Hunyuan TurboS, Qwen Plus, DeepSeek Chat
Read article ->
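To put the Doubao figure above in more familiar units, this small sketch converts 120 trillion tokens per day into an average per-second rate. It assumes the load is spread evenly over 24 hours, which real traffic will not be; `tokens_per_second` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def tokens_per_second(daily_tokens: float) -> float:
    """Average token rate, assuming perfectly even traffic over the day."""
    return daily_tokens / SECONDS_PER_DAY


# 120 trillion daily tokens, as quoted in the summary.
rate = tokens_per_second(120e12)
print(f"{rate / 1e9:.2f} billion tokens per second on average")
```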
Zhipu Qingyan GLM-4 review: free AI assistant and agent workflow
The article reviews Zhipu Qingyan as a free AI assistant built on GLM-4. Its useful structure is practical: text writing, logic, math, coding, fresh search, long-document reading, image generation and custom agents. The review is treated as a workflow guide rather than a universal 'best tool' claim: Qingyan is most interesting where Chinese-language work, document analysis and low-barrier agent creation meet.
Coverage
GLM-4 Flash, Qwen Plus, Kimi K2, Doubao Seed
Read article ->
StepFun Step 3.5 Flash: speed, funding and AI-terminal strategy
The central point is that StepFun has entered China's first-tier AI model race through three signals: Step 3.5 Flash is fast enough for agent workloads, the company has major new financing and leadership depth, and its commercial strategy focuses on native multimodal AI for terminals such as phones and cars. This page reads the release as a strategy brief, not only a model benchmark report.
Coverage
Qwen Plus, Doubao Seed, GLM-4 Flash, Kimi K2
Read article ->
Kimi K2.6 architecture: native multimodal agent and open deployment
This page reads Kimi K2.6 from the architecture and open-deployment side. It highlights a trillion-parameter MoE design with 32B active parameters per pass, 256K context, MoonViT visual encoding, native multimodal fusion, INT4 quantization, thinking and instant modes, API compatibility and deployment through vLLM or SGLang. It differs from the Kimi release page by focusing on how K2.6 is built and deployed.
Coverage
Kimi K2, Qwen Plus, DeepSeek Chat, GLM-4 Flash
Read article ->
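The Kimi K2.6 architecture entry above mentions a trillion-parameter design and INT4 quantization. A rough back-of-the-envelope sketch of what that means for weight storage follows. It assumes exactly 1e12 parameters and counts only raw weight bits, ignoring quantization scales, KV cache and activations; `weight_memory_gb` is a hypothetical helper, not part of any deployment tool.

```python
def weight_memory_gb(param_count: float, bits_per_param: int) -> float:
    """Raw weight storage in gigabytes (1 GB = 1e9 bytes).

    Ignores quantization metadata, KV cache and activation memory.
    """
    return param_count * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 1e12  # assumption: "trillion-parameter" taken as exactly 1e12

int4_gb = weight_memory_gb(TOTAL_PARAMS, 4)   # 4-bit weights
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)  # 16-bit weights, for comparison

print(f"INT4 weights: about {int4_gb:.0f} GB")
print(f"16-bit weights: about {bf16_gb:.0f} GB")
```

Real deployments need headroom beyond these numbers, so treat them as a lower bound for sizing rather than a hardware requirement.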
MiniMax M2.7: agent harnesses, SRE tasks and self-evolution
This page frames MiniMax M2.7 as a cowork-agent release rather than a normal chat-model update. Its strongest themes are instruction following across many skills, native multi-agent teams, SRE-style debugging, Office workflow execution, role-play memory and the ability to build or improve its own agent harness. This page reads the release as a shift from using tools to shaping the tool environment itself.
Coverage
MiniMax M2.7, Kimi K2, DeepSeek Chat, Qwen Plus
Read article ->
Elephant Alpha: a 100B token-efficient work model from Inclusion AI
This page describes the mysterious Elephant Alpha model as coming from Ant Group's Inclusion AI team. It is presented as a 100B model with a 256K context window and 32K output that is optimized for fast, concise work. In hands-on tests, the emphasis is on bug fixing, meeting-summary extraction and lightweight agent loops. This page reads Elephant as a useful reminder that token efficiency can be a product feature, not only a cost metric.
Coverage
Qwen Plus, DeepSeek Chat, Kimi K2, GLM-4 Flash
Read article ->
Tencent Hunyuan Hy3 preview: agent rebuild, fast-slow thinking and real-world gaps
This page presents Hunyuan Hy3 preview as Tencent's first model answer after Shunyu Yao rebuilt the Hunyuan research system. It is a 295B-total-parameter MoE model with 21B active parameters, a 256K context window and a fast-slow thinking design aimed at agents. The hands-on tests are balanced: Hy3 preview shows clear ReAct-style planning and tool routing, but still struggles with data reliability and complete final deliverables.
Coverage
Hunyuan TurboS, Qwen Plus, DeepSeek Chat, GLM-4 Flash
Read article ->
DigitalOcean DeepSeek V3.2 inference speed: what the engineering claims mean
DigitalOcean Serverless Inference reached very high output speed for DeepSeek V3.2 on Artificial Analysis, with 230 tokens per second at 10K input tokens and sub-second TTFT. The useful reading is engineering, not just leaderboard heat: hardware, NVFP4 quantization, vLLM tuning, kernel fusion, speculative decoding and customer workload economics all have to work together.
Coverage
DeepSeek Chat, Qwen Plus, MiniMax M2.7, Kimi K2
Read article ->
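The throughput and TTFT figures in the DigitalOcean entry above can be combined into a rough end-to-end latency estimate. This sketch assumes a constant decode speed of 230 tokens per second and uses 1.0 second as a conservative stand-in for "sub-second" TTFT; both the function name and the default TTFT value are assumptions for illustration.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 230.0,  # output speed quoted in the summary
    ttft_seconds: float = 1.0,         # assumed upper bound for "sub-second" TTFT
) -> float:
    """Time to first token plus steady-state decode time."""
    if output_tokens < 0 or tokens_per_second <= 0:
        raise ValueError("output_tokens must be >= 0 and speed must be > 0")
    return ttft_seconds + output_tokens / tokens_per_second


for n in (230, 1000, 4000):
    print(f"{n} output tokens: about {estimate_response_seconds(n):.1f} s")
```

Measured latency will vary with input length, batching and load, so the estimate is only a planning aid.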
DeepSeek V4: Flash, Pro, 1M context and open infrastructure
DeepSeek V4 is positioned as an infrastructure-model release. Both V4-Flash and V4-Pro are described as supporting 1M context, while Flash targets low-latency high-frequency use and Pro targets stronger reasoning, coding and agent tasks. The practical takeaway is route design: use Flash for cheap fast calls, Pro for high-value work and verify long-context grounding before replacing RAG.
Coverage
DeepSeek Chat, Qwen Plus, Kimi K2, Hunyuan TurboS
Read article ->
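The route-design takeaway in the DeepSeek V4 entry above (Flash for cheap fast calls, Pro for high-value work) could be sketched as a small routing rule. Everything here is hypothetical: the route labels `v4-flash` and `v4-pro` are placeholders rather than confirmed API model names, and the `Request` fields are one possible way to define "high-value".

```python
from dataclasses import dataclass

# Hypothetical route labels, not confirmed API model identifiers.
FLASH = "v4-flash"
PRO = "v4-pro"


@dataclass
class Request:
    """Illustrative request traits; the fields are assumptions."""
    needs_deep_reasoning: bool = False
    is_coding_task: bool = False
    is_agent_task: bool = False


def pick_route(req: Request) -> str:
    """Send reasoning, coding and agent work to Pro, everything else to Flash."""
    high_value = req.needs_deep_reasoning or req.is_coding_task or req.is_agent_task
    return PRO if high_value else FLASH


print(pick_route(Request()))                     # routine call
print(pick_route(Request(is_coding_task=True)))  # high-value call
```

A rule like this does not cover the summary's other caution: long-context grounding still needs to be verified separately before replacing RAG.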
Kimi K2.5: vision, code, Office skills and agent clusters
Kimi K2.5 is presented as Moonshot's most versatile open model at that point: native vision and text input, thinking and non-thinking modes, code generation, Office skills and an experimental Agent cluster mode. This page reads K2.5 as a bridge release between single-agent Kimi workflows and later larger agent-swarm releases.
Coverage
Kimi K2, Qwen Plus, DeepSeek Chat, GLM-4 Flash
Read article ->
DeepSeek and Kimi: how China's open models are compounding
The central point is that DeepSeek and Kimi are no longer isolated success stories. Their open model releases, architecture choices and citations are starting to compound: Kimi uses DeepSeek-style MLA, DeepSeek V4 uses Muon ideas validated at scale by Kimi, and both are pushing long context, KV-cache engineering and domestic hardware paths. This page reads the piece as an open-source ecosystem story, not only a rivalry.
Coverage
DeepSeek Chat, Kimi K2, Qwen Plus, MiniMax M2.7
Read article ->
Zhipu Qingyan vs KimiChat: a workplace assistant reading
This page is enthusiastic about Zhipu Qingyan as a workplace-friendly Chinese AI assistant. It compares the product landscape loosely, then focuses on Qingyan's agent builder, image generation, long-document reading, data analysis and web search. The page keeps the practical workflow view while adding a freshness caution because the material was first published in 2024 and product features have likely changed.
Coverage
GLM-4 Flash, Kimi K2, Qwen Plus, Doubao Seed
Read article ->
Zhipu GLM-5 Scaling Pain: KV cache, speculative decoding and agent serving
This page summarizes Zhipu's unusually candid technical post about GLM-5 serving failures under high-load coding-agent traffic. The problem was not simple model quality. It involved inference-state management: KV-cache races in PD-disaggregated serving, read-before-ready timing in HiCache and monitoring signals from speculative decoding. This page reads it as a reminder that scaling intelligence also means scaling the serving system.
Coverage
GLM-4 Flash, DeepSeek Chat, Qwen Plus, MiniMax M2.7
Read article ->
Doubao Seed 2.0: multimodal understanding, agent work and coding
This page frames Doubao Seed 2.0 as ByteDance's major 2.0 model step after strong visual releases such as Seedance 2.0 and Seedream 5.0 Lite. The main claims are stronger multimodal understanding, enterprise-grade agent skills, coding, math and more efficient reasoning. This page reads the release as a workflow upgrade story: from consumer visual fun to production-shaped coding, data and agent tasks.
Coverage
Doubao Seed, Qwen Plus, Kimi K2, DeepSeek Chat
Read article ->
Alibaba Cloud Bailian model catalog: what the platform covers
This is a catalog-style overview of what Alibaba Cloud Bailian supports. It lists Qwen, Wanxiang, DeepSeek, Kimi, GLM, Llama, Baichuan and MiniMax-style access, then groups capabilities by text generation, multimodal, image, speech, video, embeddings and industry models. This page reads it as a platform taxonomy rather than a model ranking.
Coverage
Qwen Plus, DeepSeek Chat, Kimi K2, GLM-4 Flash
Read article ->
ZStack AIOS and DeepSeek V4: private deployment for enterprise AI
ZStack AIOS supports DeepSeek V4-Pro and V4-Flash for private deployment, including domestic AI-chip support and enterprise controls. This page reads the release as a private-AI deployment checklist: compute scheduling, model serving, long-context optimization, RAG, operations, multi-tenancy and compliance matter as much as model capability.
Coverage
DeepSeek Chat, Hunyuan TurboS, Qwen Plus, GLM-4 Flash
Read article ->
Qwen3.6-35B-A3B: a sparse MoE coding agent model
Qwen3.6-35B-A3B is presented as a small-active-parameter MoE model aimed at agentic coding and multimodal tasks. It has 35B total parameters and about 3B active parameters, supports thinking and non-thinking modes, and is released through Qwen Studio, Hugging Face, ModelScope and Bailian API as qwen3.6-flash. This page reads it as an efficiency story: sparse activation can make strong coding agents cheaper to run.
Coverage
Qwen Plus, DeepSeek Chat, Kimi K2, GLM-4 Flash
Read article ->
DeepSeek V4 as a strategic threat: open models, cost and control
The central point is that DeepSeek V4 is strategically important because it is open, close to frontier capability and much cheaper than leading closed models in many enterprise scenarios. This page reads it as a cost-and-control problem: if companies can run or fine-tune a strong Chinese open model, closed-model vendors must compete on price, capability, trust and deployment control at the same time.
Coverage
DeepSeek Chat, Kimi K2, Qwen Plus, GLM-4 Flash
Read article ->
Qwen3.5-Omni: all-modal audio, video and vibe coding
Qwen3.5-Omni is presented as an all-modal model for text, image, audio, video, speech and real-time interaction. It highlights 215 reported SOTA tasks, 113-language speech recognition, 36-language speech generation, long audio/video understanding and audio-video vibe coding. This page reads it as a workflow-expansion release: voice, camera and video become direct inputs for code, content operations and enterprise assistants.
Coverage
Qwen Plus, Kimi K2, DeepSeek Chat, Doubao Seed
Read article ->
Tencent Yuanqi: a zero-code asset inventory agent workflow
This page walks through a practical no-code agent workflow: a user uploads an asset barcode image, Tencent Yuanqi extracts the image URL, an image-understanding plugin reads the barcode, Hunyuan Turbo extracts the asset number, and a knowledge base returns asset information. It is treated as a field-operations example: agent builders become useful when they connect model perception, structured extraction, internal data and a familiar mobile entry point.
Coverage
Hunyuan TurboS, DeepSeek Chat, Qwen Plus, GLM-4 Flash
Read article ->
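The Tencent Yuanqi workflow above is built without code, but its data flow can be sketched in a few lines to show how the stages connect. This is an illustration only: `read_barcode` is a canned stub standing in for the image-understanding plugin, a regular expression stands in for the Hunyuan Turbo extraction step, and `ASSET_KB`, the asset-number format and the example URL are all made up.

```python
import re
from typing import Optional

# Made-up stand-in for the knowledge base of asset records.
ASSET_KB = {
    "A-1001": {"name": "Laptop", "location": "Room 3"},
}


def read_barcode(image_url: str) -> str:
    """Stub for the image-understanding plugin; returns canned text."""
    return "Asset tag: A-1001 (scanned)"


def extract_asset_number(text: str) -> Optional[str]:
    """Regex stand-in for the model-based extraction step."""
    match = re.search(r"[A-Z]-\d{4}", text)
    return match.group(0) if match else None


def lookup_asset(image_url: str) -> Optional[dict]:
    """Chain the stages: image URL -> barcode text -> asset number -> record."""
    text = read_barcode(image_url)
    asset_number = extract_asset_number(text)
    if asset_number is None:
        return None
    return ASSET_KB.get(asset_number)


print(lookup_asset("https://example.com/barcode.png"))
```

The point of the sketch is the chaining: each stage produces a narrower, more structured value than the one before it, which is what makes the final knowledge-base lookup reliable.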
GLM-4.7-Flash: MLA, 3B active parameters and local agent use
GLM-4.7-Flash is presented as a lightweight open model for local coding and agent assistants: 30B total parameters, about 3B active parameters, 200K context and first-time GLM use of DeepSeek-style MLA. This page reads it as an efficiency and deployment story: small-active MoE plus MLA can make local and low-cost agent workflows more realistic, but throughput, latency and current pricing still need verification.
Coverage
GLM-4 Flash, DeepSeek Chat, Qwen Plus, Kimi K2
Read article ->