Curated model API catalog

Compare mainstream Chinese AI models with one OpenAI-compatible gateway

Browse DeepSeek, Kimi, Qwen, GLM, Doubao, ERNIE, Hunyuan, MiniMax, StepFun, Spark, Pangu, InternLM, LongCat and other China-first routes by capability, context window and USD token pricing. Each model page includes source links, API examples, practical FAQ and a path into the console.

This catalog focuses on mainstream model families that can be routed through API workflows, not every Chinese generative AI service filing.

Not sure where to start? Read the integration docs

DeepSeek

DeepSeek Chat

deepseek-chat

Hot

Cost-efficient Chinese frontier chat and coding model for production assistants, agents and high-volume API workloads.

Context
128K
Speed
Balanced
Region
China
Latency
Good for interactive chat
Billable input $/1M
$0.6
Billable output $/1M
$1.8
Chinacodingreasoningpopular
reasoningcodingjsonlow cost

Moonshot AI

Kimi K2

kimi-k2

Hot

Long-context and agentic model from Moonshot AI, useful for research, codebase reading and complex multi-step workflows.

Context
256K
Speed
Balanced
Region
China
Latency
Best for long-context work
Billable input $/1M
$0.72
Billable output $/1M
$3
Chinalong contextagents
reasoningcodingtool callinglong context

Alibaba Cloud

Qwen Plus

qwen-plus

Hot

Alibaba's broad model family is popular for multilingual apps, structured output, coding and open ecosystem coverage.

Context
128K
Speed
Fast
Region
China
Latency
Strong default for apps
Billable input $/1M
$1.44
Billable output $/1M
$5.76
Chinamultilingualopen ecosystem
reasoningcodingjsonmultilingualtool calling

Z.ai

GLM-4 Flash

glm-4-flash

Enterprise-friendly Chinese model route with good reasoning, coding and low-latency deployment options.

Context
128K
Speed
Fast
Region
China
Latency
Good for quick responses
Billable input $/1M
$0.96
Billable output $/1M
$3.84
Chinaenterprisecoding
reasoningcodingfastjson

ByteDance

Doubao Seed

doubao-seed

New

ByteDance model family route for conversational, multimodal and consumer-facing workloads at scale.

Context
128K
Speed
Fast
Region
China
Latency
Good for consumer apps
Billable input $/1M
$0.84
Billable output $/1M
$3.36
Chinamultimodalconsumer
fastmultilingualvisionjson

Baidu

ERNIE 4 Turbo

ernie-4-turbo

Baidu's enterprise model line is useful for Chinese-language knowledge, search-adjacent and business workflows.

Context
128K
Speed
Balanced
Region
China
Latency
Good for knowledge tasks
Billable input $/1M
$1.08
Billable output $/1M
$4.32
Chinaenterpriseknowledge
reasoningmultilingualjson

Tencent Cloud

Hunyuan TurboS

hunyuan-turbos

Hot

Tencent Hunyuan route for Chinese content creation, logic reasoning, code generation and multi-turn dialogue.

Context
128K
Speed
Fast
Region
China
Latency
Good for interactive products
Billable input $/1M
$0.96
Billable output $/1M
$3.84
ChinaTencentOpenAI compatiblereasoning
reasoningcodingtool callingjsonfast

MiniMax

MiniMax M2.7

minimax-m2

Hot

MiniMax M2.7-style route for agentic coding, long-running developer workflows and multimodal platform evaluation.

Context
128K
Speed
Balanced
Region
China
Latency
Good for agent tasks
Billable input $/1M
$0.36
Billable output $/1M
$1.44
Chinaagentic codingdeveloper planlow cost
reasoningcodingtool callingjsonlow cost

StepFun

Step-3

step-3

New

StepFun model route for developer agents, code tools and cost-controlled high-frequency usage through Step-compatible APIs.

Context
128K
Speed
Balanced
Region
China
Latency
Good for coding tools
Billable input $/1M
$0.6
Billable output $/1M
$2.4
ChinacodingagentsStepFun
reasoningcodingtool callinglong contextjson

Baichuan AI

Baichuan4 Turbo

baichuan4-turbo

Baichuan route for Chinese enterprise assistants, healthcare-adjacent knowledge workflows and bilingual business tasks.

Context
32K
Speed
Balanced
Region
China
Latency
Good for domain QA
Billable input $/1M
$1.8
Billable output $/1M
$1.8
ChinaenterpriseknowledgeBaichuan
reasoningjsonmultilingual

iFLYTEK

Spark X1

spark-x1

iFLYTEK Spark route for deep reasoning, Chinese-language productivity, education, speech-adjacent and enterprise scenarios.

Context
64K
Speed
Balanced
Region
China
Latency
Good for reasoning workflows
Billable input $/1M
$0.84
Billable output $/1M
$3.36
ChinaSparkreasoningeducation
reasoningcodingmultilingualtool calling

SenseTime

SenseNova V6

sensenova-v6

SenseNova route for multimodal reasoning, enterprise visual workflows and text-generation scenarios from SenseTime.

Context
128K
Speed
Balanced
Region
China
Latency
Good for multimodal evaluation
Billable input $/1M
$0.96
Billable output $/1M
$3.84
ChinamultimodalvisionSenseTime
reasoningvisiontool callingjsonmultilingual

Huawei Cloud

Pangu NLP

pangu-nlp

Huawei Pangu route for enterprise NLP, industry model applications and Huawei Cloud ModelArts-style deployments.

Context
32K
Speed
Enterprise
Region
China
Latency
Best for private or enterprise routes
Billable input $/1M
$1.2
Billable output $/1M
$4.8
ChinaHuaweiindustryenterprise
reasoningjsonmultilingual

Qihoo 360

360 Zhinao

360zhinao2-o1

360 Zhinao route for Chinese reasoning, security-adjacent assistants and long-context experimentation when configured upstream.

Context
32K
Speed
Balanced
Region
China
Latency
Good for Chinese assistants
Billable input $/1M
$0.48
Billable output $/1M
$1.2
China360reasoningsecurity
reasoningjsonlong context

01.AI

Yi Large

yi-large

01.AI Yi route for bilingual generation, structured outputs and legacy Yi-family compatibility through supported upstream providers.

Context
32K
Speed
Balanced
Region
China
Latency
Good for bilingual text
Billable input $/1M
$1.08
Billable output $/1M
$1.08
Chinabilingualopen ecosystemYi
reasoningcodingjsonmultilingual

Shanghai AI Laboratory

InternLM3

internlm3

InternLM route for open-source Chinese model evaluation, self-hosted deployments and research-friendly application testing.

Context
200K
Speed
Self-hosted
Region
China
Latency
Depends on deployment
Billable input $/1M
$0.24
Billable output $/1M
$0.96
Chinaopen sourceresearchlong context
reasoningcodingjsonlong contextlow cost

Meituan

LongCat Flash

longcat-flash

New

Meituan LongCat route for open multimodal and reasoning model evaluation, including Flash chat and thinking-style variants.

Context
128K
Speed
Fast
Region
China
Latency
Good for high-throughput tests
Billable input $/1M
$0.48
Billable output $/1M
$1.92
ChinaMeituanopen sourcemultimodal
reasoningcodingvisiontool callingfast

Need another model? Submit a request and operations can evaluate the route.

Submit request
Get API Key