DeepSeek
DeepSeek Chat
deepseek-chat
Hot
Cost-efficient Chinese frontier chat and coding model for production assistants, agents and high-volume API workloads.
- Context
- 128K
- Speed
- Balanced
- Region
- China
- Latency
- Good for interactive chat
- Billable input $/1M
- $0.6
- Billable output $/1M
- $1.8
Chinacodingreasoningpopular
reasoningcodingjsonlow cost
Moonshot AI
Kimi K2
kimi-k2
Hot
Long-context and agentic model from Moonshot AI, useful for research, codebase reading and complex multi-step workflows.
- Context
- 256K
- Speed
- Balanced
- Region
- China
- Latency
- Best for long-context work
- Billable input $/1M
- $0.72
- Billable output $/1M
- $3
Chinalong contextagents
reasoningcodingtool callinglong context
Alibaba Cloud
Qwen Plus
qwen-plus
Hot
Alibaba's broad model family is popular for multilingual apps, structured output, coding and open ecosystem coverage.
- Context
- 128K
- Speed
- Fast
- Region
- China
- Latency
- Strong default for apps
- Billable input $/1M
- $1.44
- Billable output $/1M
- $5.76
Chinamultilingualopen ecosystem
reasoningcodingjsonmultilingualtool calling
Z.ai
GLM-4 Flash
glm-4-flash
Enterprise-friendly Chinese model route with good reasoning, coding and low-latency deployment options.
- Context
- 128K
- Speed
- Fast
- Region
- China
- Latency
- Good for quick responses
- Billable input $/1M
- $0.96
- Billable output $/1M
- $3.84
Chinaenterprisecoding
reasoningcodingfastjson
ByteDance
Doubao Seed
doubao-seed
New
ByteDance model family route for conversational, multimodal and consumer-facing workloads at scale.
- Context
- 128K
- Speed
- Fast
- Region
- China
- Latency
- Good for consumer apps
- Billable input $/1M
- $0.84
- Billable output $/1M
- $3.36
Chinamultimodalconsumer
fastmultilingualvisionjson
Baidu
ERNIE 4 Turbo
ernie-4-turbo
Baidu's enterprise model line is useful for Chinese-language knowledge, search-adjacent and business workflows.
- Context
- 128K
- Speed
- Balanced
- Region
- China
- Latency
- Good for knowledge tasks
- Billable input $/1M
- $1.08
- Billable output $/1M
- $4.32
Chinaenterpriseknowledge
reasoningmultilingualjson
Tencent Cloud
Hunyuan TurboS
hunyuan-turbos
Hot
Tencent Hunyuan route for Chinese content creation, logic reasoning, code generation and multi-turn dialogue.
- Context
- 128K
- Speed
- Fast
- Region
- China
- Latency
- Good for interactive products
- Billable input $/1M
- $0.96
- Billable output $/1M
- $3.84
ChinaTencentOpenAI compatiblereasoning
reasoningcodingtool callingjsonfast
MiniMax
MiniMax M2.7
minimax-m2
Hot
MiniMax M2.7-style route for agentic coding, long-running developer workflows and multimodal platform evaluation.
- Context
- 128K
- Speed
- Balanced
- Region
- China
- Latency
- Good for agent tasks
- Billable input $/1M
- $0.36
- Billable output $/1M
- $1.44
Chinaagentic codingdeveloper planlow cost
reasoningcodingtool callingjsonlow cost
StepFun model route for developer agents, code tools and cost-controlled high-frequency usage through Step-compatible APIs.
- Context
- 128K
- Speed
- Balanced
- Region
- China
- Latency
- Good for coding tools
- Billable input $/1M
- $0.6
- Billable output $/1M
- $2.4
ChinacodingagentsStepFun
reasoningcodingtool callinglong contextjson
Baichuan AI
Baichuan4 Turbo
baichuan4-turbo
Baichuan route for Chinese enterprise assistants, healthcare-adjacent knowledge workflows and bilingual business tasks.
- Context
- 32K
- Speed
- Balanced
- Region
- China
- Latency
- Good for domain QA
- Billable input $/1M
- $1.8
- Billable output $/1M
- $1.8
ChinaenterpriseknowledgeBaichuan
reasoningjsonmultilingual
iFLYTEK Spark route for deep reasoning, Chinese-language productivity, education, speech-adjacent and enterprise scenarios.
- Context
- 64K
- Speed
- Balanced
- Region
- China
- Latency
- Good for reasoning workflows
- Billable input $/1M
- $0.84
- Billable output $/1M
- $3.36
ChinaSparkreasoningeducation
reasoningcodingmultilingualtool calling
SenseTime
SenseNova V6
sensenova-v6
SenseNova route for multimodal reasoning, enterprise visual workflows and text-generation scenarios from SenseTime.
- Context
- 128K
- Speed
- Balanced
- Region
- China
- Latency
- Good for multimodal evaluation
- Billable input $/1M
- $0.96
- Billable output $/1M
- $3.84
ChinamultimodalvisionSenseTime
reasoningvisiontool callingjsonmultilingual
Huawei Cloud
Pangu NLP
pangu-nlp
Huawei Pangu route for enterprise NLP, industry model applications and Huawei Cloud ModelArts-style deployments.
- Context
- 32K
- Speed
- Enterprise
- Region
- China
- Latency
- Best for private or enterprise routes
- Billable input $/1M
- $1.2
- Billable output $/1M
- $4.8
ChinaHuaweiindustryenterprise
reasoningjsonmultilingual
Qihoo 360
360 Zhinao
360zhinao2-o1
360 Zhinao route for Chinese reasoning, security-adjacent assistants and long-context experimentation when configured upstream.
- Context
- 32K
- Speed
- Balanced
- Region
- China
- Latency
- Good for Chinese assistants
- Billable input $/1M
- $0.48
- Billable output $/1M
- $1.2
China360reasoningsecurity
reasoningjsonlong context
01.AI Yi route for bilingual generation, structured outputs and legacy Yi-family compatibility through supported upstream providers.
- Context
- 32K
- Speed
- Balanced
- Region
- China
- Latency
- Good for bilingual text
- Billable input $/1M
- $1.08
- Billable output $/1M
- $1.08
Chinabilingualopen ecosystemYi
reasoningcodingjsonmultilingual
Shanghai AI Laboratory
InternLM3
internlm3
InternLM route for open-source Chinese model evaluation, self-hosted deployments and research-friendly application testing.
- Context
- 200K
- Speed
- Self-hosted
- Region
- China
- Latency
- Depends on deployment
- Billable input $/1M
- $0.24
- Billable output $/1M
- $0.96
Chinaopen sourceresearchlong context
reasoningcodingjsonlong contextlow cost
Meituan
LongCat Flash
longcat-flash
New
Meituan LongCat route for open multimodal and reasoning model evaluation, including Flash chat and thinking-style variants.
- Context
- 128K
- Speed
- Fast
- Region
- China
- Latency
- Good for high-throughput tests
- Billable input $/1M
- $0.48
- Billable output $/1M
- $1.92
ChinaMeituanopen sourcemultimodal
reasoningcodingvisiontool callingfast