Pay as you go
Model rates: Start with per-model input and output token pricing.
- No large upfront contract
- Estimate before launch
- Use OpenAI-compatible requests
Compare model rates, estimate token cost, choose pay-as-you-go or credits, and add team governance when usage grows.
Credits
Prepaid balance: Keep spend predictable for experiments and small teams.
Team
Governed usage: Manage projects, keys, budgets, and model policy for production teams.
BYOK
Use your keys: Bring existing provider accounts into one comparison and governance layer.
Enterprise
Custom: Private commercial terms for high-volume or governed workloads.
Calculator
Use this as a pre-production estimate. Final billing should be reconciled against provider usage and platform usage records.
Cost = requests × ((input tokens × input price) + (output tokens × output price)) / 1,000,000, where token counts are per request and prices are per 1M tokens.
The default Doubao Seed 2.0 Mini estimate for 1M input and 1M output is ¥2.20.
Estimate monthly spend from model price, tokens, and request volume.
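The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's calculator: the function name `estimate_cost` and the monthly workload figures (100,000 requests, 800 input tokens, 200 output tokens) are assumptions made for the example. Only the Doubao Seed 2.0 Mini prices come from this page.

```python
from decimal import Decimal

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(requests, input_tokens, output_tokens, input_price, output_price):
    """Estimate spend in whatever currency the prices are quoted in.

    input_tokens and output_tokens are per-request counts;
    input_price and output_price are per 1M tokens.
    """
    per_request = (
        Decimal(input_tokens) * Decimal(str(input_price))
        + Decimal(output_tokens) * Decimal(str(output_price))
    ) / TOKENS_PER_PRICE_UNIT
    return requests * per_request


# Default case from this page: one request with 1M input and 1M output tokens
# at the Doubao Seed 2.0 Mini rates (0.2 in, 2 out, CNY per 1M tokens).
default_estimate = estimate_cost(1, 1_000_000, 1_000_000, "0.2", "2")
print(f"default: {default_estimate:.2f} CNY")  # default: 2.20 CNY

# Hypothetical monthly workload (assumed numbers, not from this page).
monthly_estimate = estimate_cost(100_000, 800, 200, "0.2", "2")
print(f"monthly: {monthly_estimate:.2f} CNY")  # monthly: 56.00 CNY
```

`Decimal` is used instead of floats so that small per-request amounts do not pick up rounding noise when multiplied by large request counts. As noted above, the result is a pre-production estimate and should still be reconciled against actual usage records.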
Low-cost reference
Price is only one dimension. Review context length, capabilities, source labels, and intended use cases before production use.
| Model | Provider | Input | Output | Context | Capabilities | Best for | Latency | Status | Source |
|---|---|---|---|---|---|---|---|---|---|
| Doubao Seed 2.0 Mini (`doubao-seed-2-0-mini`) | Volcengine | ¥0.2 / 1M tokens | ¥2 / 1M tokens | 128k | Tool calling, Vision, JSON mode, Long context | Chinese Q&A, low-cost general chat | 900-2600ms | Production | Platform curated |
| DeepSeek: DeepSeek V4 Flash (`deepseek/deepseek-v4-flash`) | DeepSeek | $0.112 / 1M tokens | $0.224 / 1M tokens | 1M | Tool calling, JSON mode, Long context, Reasoning | low-cost Chinese tasks, long-context summary | 800-2600ms | Catalog | OpenRouter if available |
| Mistral: Mistral Small 3.2 24B (`mistralai/mistral-small-3.2-24b-instruct`) | Mistral AI | $0.1 / 1M tokens | $0.3 / 1M tokens | 128k | Tool calling, JSON mode, Streaming, Low cost | translation, classification | 700-2300ms | Catalog | OpenRouter if available |
| OpenAI: GPT-4o-mini (`openai/gpt-4o-mini`) | OpenRouter | $0.15 / 1M tokens | $0.6 / 1M tokens | 128k | Tool calling, Vision, JSON mode, Long context | low-cost chat, image understanding | 800-2400ms | Catalog | OpenRouter if available |
| Meta: Llama 4 Maverick (`meta-llama/llama-4-maverick`) | Meta | $0.15 / 1M tokens | $0.6 / 1M tokens | 1M | JSON mode, Long context, Streaming, Low cost | open-model workflows, cost-sensitive long context | 950-2800ms | Catalog | OpenRouter if available |
| Google: Gemini 2.5 Flash (`google/gemini-2.5-flash`) | | $0.3 / 1M tokens | $2.50 / 1M tokens | 1M | Tool calling, Vision, JSON mode, Long context | long-document summarization, image Q&A | 900-2800ms | Catalog | OpenRouter if available |
| DeepSeek: R1 (`deepseek/deepseek-r1`) | DeepSeek | $0.7 / 1M tokens | $2.50 / 1M tokens | 163.8k | JSON mode, Long context, Reasoning, Streaming | Chinese reasoning, math | 1800-6000ms | Catalog | OpenRouter if available |
| Qwen: Qwen3 Coder Plus (`qwen/qwen3-coder-plus`) | Alibaba Cloud / Qwen | $0.65 / 1M tokens | $3.25 / 1M tokens | 1M | Tool calling, JSON mode, Long context, Streaming | Chinese engineering workflows, code generation | 1200-3900ms | Catalog | OpenRouter if available |
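Because input and output prices differ per model, the cheapest option depends on the shape of the workload. The sketch below ranks the USD-priced rows of the table for two workloads. The rates are copied from the table; the function name `rank_by_cost` and both workloads are assumptions made for illustration. Doubao Seed 2.0 Mini is left out because it is priced in ¥ and no exchange rate is given on this page.

```python
# (model id, input USD per 1M tokens, output USD per 1M tokens), from the table above.
USD_RATES = [
    ("deepseek/deepseek-v4-flash", 0.112, 0.224),
    ("mistralai/mistral-small-3.2-24b-instruct", 0.1, 0.3),
    ("openai/gpt-4o-mini", 0.15, 0.6),
    ("meta-llama/llama-4-maverick", 0.15, 0.6),
    ("google/gemini-2.5-flash", 0.3, 2.50),
    ("deepseek/deepseek-r1", 0.7, 2.50),
    ("qwen/qwen3-coder-plus", 0.65, 3.25),
]


def rank_by_cost(rates, requests, input_tokens, output_tokens):
    """Return (model id, estimated USD cost) pairs, cheapest first.

    Token counts are per request; prices are per 1M tokens.
    """
    costs = []
    for model_id, input_price, output_price in rates:
        per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
        costs.append((model_id, requests * per_request))
    return sorted(costs, key=lambda pair: pair[1])


# Hypothetical balanced chat workload: 10,000 requests, 2,000 in / 500 out.
balanced = rank_by_cost(USD_RATES, 10_000, 2_000, 500)

# Hypothetical input-heavy summarization workload: 10,000 requests, 10,000 in / 100 out.
input_heavy = rank_by_cost(USD_RATES, 10_000, 10_000, 100)

for label, ranking in (("balanced", balanced), ("input-heavy", input_heavy)):
    model_id, cost = ranking[0]
    print(f"{label}: cheapest is {model_id} at about ${cost:.2f}")
```

Under these assumed workloads the cheapest model changes between the two cases, so a single "lowest price" ranking is not enough. Cost is also only one filter: context length, capabilities, and catalog status from the table still need to be checked before production use.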
FAQ
The calculator multiplies the input and output token counts by the selected model's input and output prices per 1M tokens, adds the two amounts, and then multiplies by the request count.
Yes. ¥0.20 input plus ¥2.00 output equals ¥2.20 for that single 1M + 1M request estimate.
Yes. The BYOK plan is designed for teams that already have provider accounts and want consistent policy and usage reporting.
Enterprise pricing can be negotiated around volume, provider mix, region, support requirements, and governance needs.