Pay as you go
Model ratesStart with per-model input and output token pricing.
- No large upfront contract
- Estimate before launch
- Use OpenAI-compatible requests
If you're comparing models for a live product in Australia, use the calculator first, then choose the plan that matches your spend pattern.
Pay as you go
Model ratesStart with per-model input and output token pricing.
Credits
Prepaid balanceKeep spend predictable for experiments and small teams.
Team
Governed usageManage projects, keys, budgets, and model policy for production teams.
BYOK
Use your keysBring existing provider accounts into one comparison and governance layer.
Enterprise
CustomPrivate commercial terms for high-volume or governed workloads.
Calculator
Use this as a pre-production estimate. Final billing should be reconciled against provider usage and platform usage records.
Cost = requests x ((input tokens x input price) + (output tokens x output price)) / 1,000,000.
The default Doubao Seed 2.0 Mini estimate for 1M input and 1M output is ¥2.20.
Estimate monthly spend from model price, tokens, and request volume before Australian production rollout.
AI API cost is estimated by multiplying request count by input tokens and output tokens, then applying each model's posted price per 1M tokens. Teams should calculate a low-cost model, a quality fallback, and an expected monthly request volume before routing production traffic.
Run CacheSafety Bench before enabling a cache policy in production. Bad Hit Rate matters more than raw hit rate.
Run CacheSafety BenchLow-cost reference
Price is only one dimension. Review context length, capabilities, source labels, and intended use cases before production use.
| Model | Provider | Input | Output | Context | Capabilities | Best for | Latency | Status | Source |
|---|---|---|---|---|---|---|---|---|---|
| Doubao Seed 2.0 Minidoubao-seed-2-0-mini | Volcengine | ¥0.2 / 1M tokens | ¥2 / 1M tokens | 128k | StreamingJSON mode | Coding | 900-2600ms | Catalog | Platform curated |
| DeepSeek: DeepSeek V4 Flashdeepseek/deepseek-v4-flash | DeepSeek | $0.112 / 1M tokens | $0.224 / 1M tokens | 1M | Tool callingJSON modeLong contextReasoning | low-cost Chinese tasks, long-context summary | 800-2600ms | Catalog | OpenRouter if available |
| Mistral: Mistral Small 3.2 24Bmistralai/mistral-small-3.2-24b-instruct | Mistral AI | $0.1 / 1M tokens | $0.3 / 1M tokens | 128k | Tool callingJSON modeStreamingLow cost | translation, classification | 700-2300ms | Catalog | OpenRouter if available |
| OpenAI: GPT-4o-miniopenai/gpt-4o-mini | OpenRouter | $0.15 / 1M tokens | $0.6 / 1M tokens | 128k | Tool callingVisionJSON modeLong context | low-cost chat, image understanding | 800-2400ms | Catalog | OpenRouter if available |
| Meta: Llama 4 Maverickmeta-llama/llama-4-maverick | Meta | $0.15 / 1M tokens | $0.6 / 1M tokens | 1M | JSON modeLong contextStreamingLow cost | open-model workflows, cost-sensitive long context | 950-2800ms | Catalog | OpenRouter if available |
| Google: Gemini 2.5 Flashgoogle/gemini-2.5-flash | $0.3 / 1M tokens | $2.50 / 1M tokens | 1M | Tool callingVisionJSON modeLong context | long-document summarization, image Q&A | 900-2800ms | Catalog | OpenRouter if available | |
| DeepSeek: R1deepseek/deepseek-r1 | DeepSeek | $0.7 / 1M tokens | $2.50 / 1M tokens | 163.8k | JSON modeLong contextReasoningStreaming | Chinese reasoning, math | 1800-6000ms | Catalog | OpenRouter if available |
| Qwen: Qwen3 Coder Plusqwen/qwen3-coder-plus | Alibaba Cloud / Qwen | $0.65 / 1M tokens | $3.25 / 1M tokens | 1M | Tool callingJSON modeLong contextStreaming | Chinese engineering workflows, code generation | 1200-3900ms | Catalog | OpenRouter if available |
FAQ
The calculator multiplies input tokens and output tokens by the selected model price per 1M tokens, then multiplies by request count.
Yes. ¥0.20 input plus ¥2.00 output equals ¥2.20 for that single 1M + 1M request estimate.
Yes. The BYOK plan is designed for teams that already have provider accounts and want consistent policy and usage reporting.
Enterprise pricing can be negotiated around volume, provider mix, region, support requirements, and governance needs.