Doubao Seed 2.0 Mini is the lowest-cost production model currently exposed through the NextModel public gateway. It is a practical default for Chinese Q&A, classification, summarization, and lightweight multimodal tasks.
Cheap LLM API selection should start with workload shape, not only the lowest posted rate. For classification, summarization, routing, support drafts, and batch transformations, a lower-cost model can reduce monthly spend without changing your application interface. For final answers, complex reasoning, or coding agents, teams should benchmark a low-cost model against a stronger fallback. NextModel keeps price, context, capability, provider source, and code examples in one place so developers can make that tradeoff before deployment.
来源基础:NextModel curated catalog, provider public pricing, and OpenRouter metadata when available.
Blended price
推荐的 cheap llm api 候选
先从短名单开始,再用真实提示词和月度成本估算做生产前验证。
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...
Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...
对比表
按价格、提供方、上下文、能力和来源比较候选列表。
这张表是为搜索访客和开发团队准备的实用决策视图,而不是泛泛的模型名称罗列。
| Model | Provider | Input | Output | Context | Capabilities | Best for | Latency | Status | Source |
|---|---|---|---|---|---|---|---|---|---|
| Doubao Seed 2.0 Minidoubao-seed-2-0-mini | Volcengine | ¥0.2 / 1M tokens | ¥2 / 1M tokens | 128k | Tool callingVisionJSON modeLong context | Chinese Q&A, low-cost general chat | 900-2600ms | Production | Platform curated |
| DeepSeek: DeepSeek V4 Flashdeepseek/deepseek-v4-flash | DeepSeek | $0.112 / 1M tokens | $0.224 / 1M tokens | 1M | Tool callingJSON modeLong contextReasoning | low-cost Chinese tasks, long-context summary | 800-2600ms | Catalog | OpenRouter if available |
| Mistral: Mistral Small 3.2 24Bmistralai/mistral-small-3.2-24b-instruct | Mistral AI | $0.1 / 1M tokens | $0.3 / 1M tokens | 128k | Tool callingJSON modeStreamingLow cost | translation, classification | 700-2300ms | Catalog | OpenRouter if available |
| OpenAI: GPT-4o-miniopenai/gpt-4o-mini | OpenRouter | $0.15 / 1M tokens | $0.6 / 1M tokens | 128k | Tool callingVisionJSON modeLong context | low-cost chat, image understanding | 800-2400ms | Catalog | OpenRouter if available |
| Meta: Llama 4 Maverickmeta-llama/llama-4-maverick | Meta | $0.15 / 1M tokens | $0.6 / 1M tokens | 1M | JSON modeLong contextStreamingLow cost | open-model workflows, cost-sensitive long context | 950-2800ms | Catalog | OpenRouter if available |
| Google: Gemini 2.5 Flashgoogle/gemini-2.5-flash | $0.3 / 1M tokens | $2.50 / 1M tokens | 1M | Tool callingVisionJSON modeLong context | long-document summarization, image Q&A | 900-2800ms | Catalog | OpenRouter if available | |
| MoonshotAI: Kimi K2.6moonshotai/kimi-k2.6 | Moonshot AI | $0.73 / 1M tokens | $3.49 / 1M tokens | 262.1k | JSON modeLong contextStreamingTool calling | long Chinese documents, contract review | 1400-4400ms | Catalog | OpenRouter if available |
FAQ
Cheap LLM API 常见问题
What is the cheapest model in this catalog?
The cheapest option depends on currency conversion and output length. Doubao Seed 2.0 Mini is the lowest-cost CNY production entry in this catalog.
Should teams always pick the cheapest LLM API?
No. Use cheap models for repeatable low-risk work, then compare quality against stronger models for final answers, complex reasoning, and coding agents.