Kimi API

Kimi API Access

Access Kimi family models for long-context reading, agent planning, coding and multimodal-ready workflows through one compatible API.

Long-context workspaces

Agentic coding flows

OpenAI SDK migration

Budget controls per key


5 min to first call

USD token pricing

256K context window

SSE streaming

Kimi long-context route

OpenAI-compatible model gateway

256K context family

Client: Your App. SDK, backend, agent or workflow.

SmarToken: Unified API Gateway for kimi-k2. Auth, budget, route, SSE stream.

Provider: Kimi K2. Moonshot AI route via admin pool.

Your app

Send a standard Chat Completions request with your SmarToken key.

SmarToken gateway

Validate the key, model scope, daily budget and monthly budget.

Model route pool

Choose an enabled upstream route by priority, weight and fallback.

Moonshot AI

Call kimi-k2, stream the response and record usage.
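The gateway step in this flow (validate the key, model scope, daily budget and monthly budget) can be pictured as a short sequence of guards. The sketch below is illustrative only: the `ApiKey` fields, the `validate_request` function and the rejection messages are hypothetical names chosen for this example, not SmarToken's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ApiKey:
    """Hypothetical key record; every field name here is an assumption."""
    active: bool
    allowed_families: set[str]
    daily_budget_usd: float
    monthly_budget_usd: float
    spent_today_usd: float = 0.0
    spent_month_usd: float = 0.0


def validate_request(key: ApiKey, model_family: str, estimated_cost_usd: float) -> tuple[bool, str]:
    """Return (allowed, reason), checking key, model scope, then budgets in order."""
    if not key.active:
        return False, "invalid key"
    if model_family not in key.allowed_families:
        return False, "model not in key scope"
    if key.spent_today_usd + estimated_cost_usd > key.daily_budget_usd:
        return False, "daily budget exceeded"
    if key.spent_month_usd + estimated_cost_usd > key.monthly_budget_usd:
        return False, "monthly budget exceeded"
    return True, "ok"
```

Checking the cheapest conditions first (key state, then scope) means a request that is out of scope is rejected before any budget arithmetic runs.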

Response Preview

200 OK

{
  "model": "kimi-k2",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Kimi is useful when your app needs to read longer documents, repositories and research material before answering."
    }
  }],
  "usage": { "tracked": true, "currency": "USD" }
}

Fast Deployment

Generate a key and run the Playground in minutes.

Global Access

Use one endpoint for configured China-first routes.

Unified Billing

Track USD token usage and wallet debits together.

Easy Migration

Keep OpenAI SDK shape, change baseURL and model.

Developer Friendly

Copy-ready cURL, Python and TypeScript examples.

Model fit

Why use Kimi?

A focused page for evaluating this model as an API route, not just reading a catalog row.

Long context

A strong fit for document Q&A, repository analysis and multi-step planning.

Agent workflows

Good for teams prototyping research agents and coding agents that need memory over a larger prompt.

Developer migration

Works behind the same Chat Completions interface used by the rest of SmarToken.

Route fit

Kimi route-fit checks

Treat kimi-k2 as a route to test against a workload, not just a catalog name.

When Kimi is a strong fit

Kimi K2 is strongest for long context, agents and research. Validate it with real prompts before making it a default route.

When to compare alternatives

If your workload depends more on structured output or the lowest cost than on long context, compare Kimi K2 with neighboring model families.

Pre-production check

Long prompts still need careful chunking and cost monitoring.
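One way to approach chunking is to split on paragraph boundaries under a rough token budget. The sketch below is a minimal example under stated assumptions: `chunk_text` is a hypothetical helper, and the four-characters-per-token ratio is a crude heuristic, not the tokenizer Kimi actually uses.

```python
def chunk_text(text: str, max_tokens: int, chars_per_token: int = 4) -> list[str]:
    """Split text into chunks on paragraph boundaries under a rough token budget.

    chars_per_token is a crude assumption, not the model's real tokenizer.
    """
    max_chars = max_tokens * chars_per_token
    if max_chars <= 0:
        raise ValueError("max_tokens and chars_per_token must be positive")
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Hard-split any single paragraph that is larger than the budget.
        pieces = [paragraph[i:i + max_chars] for i in range(0, len(paragraph), max_chars)] or [""]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

For production use, replacing the character heuristic with a real token count for the target model would give tighter and safer budgets.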

Why SmarToken

Why not direct vendor accounts?

Direct accounts can work once region, billing, model IDs and credentials are settled. SmarToken is built for faster overseas evaluation and safer early production.

Unified key

One console key reaches mainstream Chinese model routes including DeepSeek, Kimi, Qwen, Hunyuan, MiniMax and Spark.

English docs

Model IDs, SDK samples and error semantics are written for overseas teams.

Budget control

Daily, monthly and model-family limits keep experiments predictable.

Route visibility

Usage logs connect model, API key, token estimate, latency and wallet debit.

Code sample

Copy-ready API examples


curl

curl -s "https://thesmartoken.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_SMARTOKEN_KEY" \
  -d '{
    "model": "kimi-k2",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Explain the best use case for this model." }
    ]
  }'

python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SMARTOKEN_KEY",
    base_url="https://thesmartoken.com/v1",
)

stream = client.chat.completions.create(
    model="kimi-k2",
    stream=True,
    messages=[{"role": "user", "content": "Explain the best use case for this model."}],
)

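for chunk in stream:
    # Skip chunks without choices (some routes may send a usage-only final chunk).
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)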

typescript

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SMTOKEN_API_KEY,
  baseURL: "https://thesmartoken.com/v1",
});

const stream = await client.chat.completions.create({
  model: "kimi-k2",
  stream: true,
  messages: [{ role: "user", content: "Explain the best use case for this model." }],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
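When calling the endpoint with cURL or a raw HTTP client instead of an SDK, the streamed response arrives as Server-Sent Events. The sketch below shows how those lines could be reassembled into text, assuming the gateway follows the usual OpenAI-style framing of `data: {json}` lines terminated by `data: [DONE]`. The helper name `collect_sse_text` is hypothetical, and the framing should be confirmed against a real response.

```python
import json
from typing import Iterable


def collect_sse_text(lines: Iterable[str]) -> str:
    """Join the delta.content fields from OpenAI-style SSE lines into one string."""
    parts: list[str] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

The function accepts any iterable of lines, so it can be fed from a file, a test fixture or a streaming HTTP response.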

Comparison

How Kimi compares

Position Chinese model families by context, pricing and workload fit before you lock in a default route.

Model	Context (tokens)	Input (USD / 1M tokens)	Output (USD / 1M tokens)	Reasoning	Coding	Best for
DeepSeek	128K	$0.50	$1.50	5/5	5/5	Reasoning, coding, research
Kimi	256K	$0.60	$2.50	4/5	4/5	Long context, agents, research
Qwen	128K	$1.20	$4.80	4/5	4/5	Multilingual apps, structured output
GLM	128K	$0.80	$3.20	4/5	4/5	Enterprise tools, business workflows
Doubao	128K	$0.70	$2.80	3/5	3/5	Consumer chat, content workflows
ERNIE	128K	$0.90	$3.60	4/5	3/5	Chinese knowledge, enterprise search
Hunyuan	128K	$0.80	$3.20	4/5	4/5	Enterprise assistants, Tencent Cloud routes
MiniMax	128K	$0.30	$1.20	4/5	5/5	Coding agents, developer automation
StepFun	128K	$0.50	$2.00	4/5	4/5	Coding tools, planning agents
Baichuan	32K	$1.50	$1.50	4/5	3/5	Domain QA, bilingual writing
Spark	64K	$0.70	$2.80	4/5	4/5	Chinese reasoning, education
SenseNova	128K	$0.80	$3.20	4/5	3/5	Multimodal enterprise workflows
Pangu	32K	$1.00	$4.00	4/5	3/5	Industry NLP, private routes
360 Zhinao	32K	$0.40	$1.00	4/5	3/5	Security assistants, Chinese QA
Yi	32K	$0.90	$0.90	4/5	4/5	Bilingual writing, structured output
InternLM	200K	$0.20	$0.80	4/5	4/5	Self-hosted evals, research
LongCat	128K	$0.40	$1.60	4/5	4/5	Open model evals, multimodal prototypes

How do I call Kimi with the OpenAI SDK?

Use baseURL https://thesmartoken.com/v1, authenticate with a SmarToken key, and send model: kimi-k2.

When should I choose Kimi over DeepSeek or Qwen?

Choose Kimi when long-context reading, planning and document-heavy workflows matter more than the lowest per-token price.
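To make that trade-off concrete, the sketch below takes three rows from the comparison table and picks the cheapest model whose context window fits a request. It is a rough illustration under stated assumptions: `K` is read as 1,000 tokens, input and output are assumed to share the context window, and `request_cost` and `cheapest_fit` are hypothetical helpers.

```python
from typing import Optional

# model -> (context window in tokens, input USD per 1M, output USD per 1M),
# copied from the comparison table; "K" is assumed to mean 1,000 tokens.
CATALOG = {
    "DeepSeek": (128_000, 0.50, 1.50),
    "Kimi": (256_000, 0.60, 2.50),
    "Qwen": (128_000, 1.20, 4.80),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Catalog cost in USD for one request."""
    _, input_price, output_price = CATALOG[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_fit(input_tokens: int, output_tokens: int) -> Optional[str]:
    """Cheapest model whose context window holds the whole request, or None."""
    fitting = [
        model for model, (context, _, _) in CATALOG.items()
        if input_tokens + output_tokens <= context
    ]
    if not fitting:
        return None
    return min(fitting, key=lambda m: request_cost(m, input_tokens, output_tokens))
```

With these numbers, a 100K-token prompt fits all three models and DeepSeek is cheapest, while a 200K-token prompt only fits Kimi. Price alone decides nothing once the prompt outgrows a 128K window.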

Use cases

Popular use cases

Start with a clear workload, then compare routes in Playground before moving traffic into production.

Document Q&A

Summarize and query long reports, contracts and technical references.

Codebase reading

Use longer context for repository maps, migration notes and issue triage.

Research agent

Plan multi-step answers from larger source packets.

Customer analysis

Condense transcripts, tickets and user feedback into action lists.

Migration

From your current AI gateway

The integration path stays familiar: same Chat Completions shape, one new baseURL and a China-first model ID.

Switch path

From OpenAI

Keep the SDK. Change baseURL to SmarToken and switch the model ID.

Switch path

From OpenRouter

Move from broad routing into a focused Chinese-model console with budgets.

Switch path

From Together.ai

Keep Chat Completions while adding China-first model pages and route control.

FAQ

Questions developers ask

How do I call Kimi with the OpenAI SDK?

Use the standard OpenAI SDK, set baseURL (base_url in the Python SDK) to https://thesmartoken.com/v1, authenticate with your SmarToken key, and pass model: kimi-k2.

What is Kimi best for?

Kimi is a good candidate for long-context reading, research agents and codebase analysis. Test it in the Playground before routing production traffic.

How does Kimi compare with DeepSeek and Qwen?

Use the comparison table as a starting point, then run your own prompts because coding, reasoning, latency and cost can vary by upstream route.

Is this API compatible with the OpenAI SDK?

Yes. Use your SmarToken key, set baseURL to https://thesmartoken.com/v1, and pass the model ID shown on this page.

How is billing calculated?

Input and output tokens are priced separately in USD per 1M tokens. Final usage is recorded after the provider returns a response or a stream finishes.

Where do I get an API key?

Create an account, open the console, generate an API key, then copy the cURL, Python or TypeScript example from the Playground.

Can I restrict one key to this model family?

Yes. Console API keys can be limited by allowed model family plus daily and monthly USD budgets.

Pricing

Kimi K2

USD

Input tokens: $0.72 per 1M input tokens (catalog $0.60)
Output tokens: $3.00 per 1M output tokens (catalog $2.50)
Platform fee: 20%, included in billable token prices.
Context: 256K
Speed: Balanced
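The billable prices above are consistent with the catalog prices plus the 20% platform fee applied as a simple markup ($0.60 × 1.20 = $0.72 and $2.50 × 1.20 = $3.00). The sketch below works through that arithmetic and estimates the cost of one request. It assumes the fee is multiplicative, which matches the listed numbers but is not confirmed elsewhere on this page. The function names are hypothetical.

```python
from decimal import Decimal

PLATFORM_FEE = Decimal("0.20")  # 20% fee, assumed to be a multiplicative markup
CATALOG_USD_PER_1M = {"input": Decimal("0.60"), "output": Decimal("2.50")}


def billable_price(catalog_price: Decimal) -> Decimal:
    """Catalog price per 1M tokens with the platform fee added."""
    return catalog_price * (1 + PLATFORM_FEE)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated billable USD cost of a single kimi-k2 request."""
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) / million * billable_price(CATALOG_USD_PER_1M["input"])
    output_cost = Decimal(output_tokens) / million * billable_price(CATALOG_USD_PER_1M["output"])
    return input_cost + output_cost
```

`Decimal` is used instead of `float` so that currency arithmetic stays exact. A request with 100,000 input tokens and 2,000 output tokens comes to $0.072 + $0.006 = $0.078.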

Long context · Coding · Tool calling · Streaming · Research


Model facts

API model ID: kimi-k2
Vendor: Moonshot AI
Region: China
Latency: Long-context
Last reviewed: 2026-05-15

Admin route pool

Bind a site model ID to an upstream model ID.
Choose OPENAI or CUSTOM provider keys.
Set priority, weight, enabled state and fallback notes.
Use budgeted API keys to keep vendor secrets isolated.
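Route selection by priority, weight, enabled state and fallback could look roughly like the sketch below. Everything here is an assumption for illustration: the `Route` fields, the `pick_route` function, the upstream names used in the usage note and the convention that a lower priority number is tried first are not taken from SmarToken's admin console.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    """Hypothetical binding of a site model ID to one upstream model ID."""
    upstream_model: str
    priority: int  # assumption: lower number is tried first
    weight: int
    enabled: bool = True


def pick_route(routes: list[Route], rng: Optional[random.Random] = None) -> Optional[Route]:
    """Pick a weighted-random route from the best enabled priority tier."""
    rng = rng or random.Random()
    candidates = [r for r in routes if r.enabled and r.weight > 0]
    if not candidates:
        return None  # nothing enabled, so the caller must report an error
    best = min(r.priority for r in candidates)
    tier = [r for r in candidates if r.priority == best]
    return rng.choices(tier, weights=[r.weight for r in tier], k=1)[0]
```

Fallback happens implicitly: if every route in the top tier is disabled, the next priority tier becomes the best available one. For example, with `Route("kimi-upstream-a", 1, 10, enabled=False)` and `Route("kimi-upstream-b", 2, 5)`, the second route is selected.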

Sources

Kimi API Docs (official)
OpenCompass benchmark hub (benchmark)

Limitations

- Long prompts still need careful chunking and cost monitoring.
- Feature availability depends on the upstream Kimi-compatible route configured by admins.

Benchmark notes

- Benchmark and community results should be treated as guidance, not a replacement for app-specific evals.
- Compare Kimi against Qwen on multilingual and coding tasks when selecting a default model.

Start now

Start building with Chinese AI models in minutes

One smart key for DeepSeek, Kimi, Qwen, GLM, Doubao, ERNIE and other configured routes.
