SEO model rankings

Best cheap LLM API models for cost-sensitive products

Compare low-cost LLM API models by input price, output price, context length, capability, source, and production fit.

Cheap LLM API selection should start with workload shape, not only the lowest posted rate. For classification, summarization, routing, support drafts, and batch transformations, a lower-cost model can reduce monthly spend without changing your application interface. For final answers, complex reasoning, or coding agents, teams should benchmark a low-cost model against a stronger fallback. NextModel keeps price, context, capability, provider source, and code examples in one place so developers can make that tradeoff before deployment.

Source basis: NextModel curated catalog, provider public pricing, and OpenRouter metadata when available.


Recommended cheap LLM API candidates

Start with a shortlist, then validate it before production using real prompts and a monthly cost estimate.
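A monthly cost estimate can be sketched in a few lines. This is a minimal illustration rather than NextModel tooling: the `monthly_cost` helper, the request volume, and the per-request token counts are assumptions chosen for the example, while the prices are the Mistral Small 3.2 24B rates listed on this page.

```python
def monthly_cost(requests_per_month, input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """Estimate monthly spend from per-request token counts and per-1M-token prices."""
    input_total = requests_per_month * input_tokens
    output_total = requests_per_month * output_tokens
    return (input_total * input_price_per_m + output_total * output_price_per_m) / 1_000_000


# Hypothetical workload: 2M requests per month, 800 input and 150 output tokens each.
estimate = monthly_cost(
    requests_per_month=2_000_000,
    input_tokens=800,
    output_tokens=150,
    input_price_per_m=0.10,   # USD per 1M input tokens (Mistral Small 3.2 24B on this page)
    output_price_per_m=0.30,  # USD per 1M output tokens
)
print(f"Estimated monthly spend: ${estimate:,.2f}")
```

Replacing the assumed token counts with averages measured from real prompts makes the estimate comparable across the shortlisted models.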

Volcengine · Production

Doubao Seed 2.0 Mini is the lowest-cost production model currently exposed through the NextModel public gateway. It is a practical default for Chinese Q&A, classification, summarization, and lightweight multimodal tasks.

Input: ¥0.2 / 1M tokens · Output: ¥2 / 1M tokens · Context: 128k
Best for: Chinese Q&A, low-cost general chat, multimodal understanding
Routing: configured
Capabilities: Tool calling, Vision, JSON mode, Long context, Streaming, Low cost
Source: Platform curated (NextModel production gateway and Volcengine pricing config)

DeepSeek · Catalog

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Input: $0.112 / 1M tokens · Output: $0.224 / 1M tokens · Context: 1M
Best for: low-cost Chinese tasks, long-context summary, batch code assistance
Routing: configured
Capabilities: Tool calling, JSON mode, Long context, Reasoning, Low cost
Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

Mistral AI · Catalog

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

Input: $0.1 / 1M tokens · Output: $0.3 / 1M tokens · Context: 128k
Best for: translation, classification, short-form summarization
Routing: configured
Capabilities: Tool calling, JSON mode, Streaming, Low cost, Vision, Long context
Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

OpenRouter · Catalog

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

Input: $0.15 / 1M tokens · Output: $0.6 / 1M tokens · Context: 128k
Best for: low-cost chat, image understanding, classification
Routing: configured
Capabilities: Tool calling, Vision, JSON mode, Long context, Streaming, Low cost
Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

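Comparison table

Compare the candidate list by price, provider, context, capabilities, and source.

This table is a practical decision view for search visitors and development teams, not a generic list of model names.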

| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context | Capabilities | Best for | Latency | Status | Source |
|---|---|---|---|---|---|---|---|---|---|
| Doubao Seed 2.0 Mini (doubao-seed-2-0-mini) | Volcengine | ¥0.2 | ¥2 | 128k | Tool calling, Vision, JSON mode, Long context | Chinese Q&A, low-cost general chat | 900-2600 ms | Production | Platform curated |
| DeepSeek: DeepSeek V4 Flash (deepseek/deepseek-v4-flash) | DeepSeek | $0.112 | $0.224 | 1M | Tool calling, JSON mode, Long context, Reasoning | low-cost Chinese tasks, long-context summary | 800-2600 ms | Catalog | OpenRouter if available |
| Mistral: Mistral Small 3.2 24B (mistralai/mistral-small-3.2-24b-instruct) | Mistral AI | $0.1 | $0.3 | 128k | Tool calling, JSON mode, Streaming, Low cost | translation, classification | 700-2300 ms | Catalog | OpenRouter if available |
| OpenAI: GPT-4o-mini (openai/gpt-4o-mini) | OpenRouter | $0.15 | $0.6 | 128k | Tool calling, Vision, JSON mode, Long context | low-cost chat, image understanding | 800-2400 ms | Catalog | OpenRouter if available |
| Meta: Llama 4 Maverick (meta-llama/llama-4-maverick) | Meta | $0.15 | $0.6 | 1M | JSON mode, Long context, Streaming, Low cost | open-model workflows, cost-sensitive long context | 950-2800 ms | Catalog | OpenRouter if available |
| Google: Gemini 2.5 Flash (google/gemini-2.5-flash) | Google | $0.3 | $2.50 | 1M | Tool calling, Vision, JSON mode, Long context | long-document summarization, image Q&A | 900-2800 ms | Catalog | OpenRouter if available |
| MoonshotAI: Kimi K2.6 (moonshotai/kimi-k2.6) | Moonshot AI | $0.73 | $3.49 | 262.1k | JSON mode, Long context, Streaming, Tool calling | long Chinese documents, contract review | 1400-4400 ms | Catalog | OpenRouter if available |

FAQ

Cheap LLM API frequently asked questions

What is the cheapest model in this catalog?

The cheapest option depends on currency conversion and output length. Doubao Seed 2.0 Mini is the lowest-cost CNY production entry in this catalog.
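The dependence on currency conversion and output length can be illustrated by computing a blended price per 1M tokens at different output shares. The sketch below uses three rows from the comparison table; the `blended_price` and `cheapest` helpers are hypothetical, the output shares are arbitrary, and the exchange rate of 7.2 CNY per USD is a placeholder assumption, not a quoted rate.

```python
CNY_PER_USD = 7.2  # assumed exchange rate for illustration only; substitute a current one

# (input, output) price per 1M tokens in USD, taken from the comparison table.
PRICES_USD = {
    "doubao-seed-2-0-mini": (0.2 / CNY_PER_USD, 2 / CNY_PER_USD),  # listed in CNY
    "deepseek/deepseek-v4-flash": (0.112, 0.224),
    "mistralai/mistral-small-3.2-24b-instruct": (0.1, 0.3),
}


def blended_price(input_price, output_price, output_share):
    """Price per 1M tokens when `output_share` of all tokens are output tokens."""
    return input_price * (1 - output_share) + output_price * output_share


def cheapest(output_share):
    """Return the model id with the lowest blended price at the given output share."""
    return min(PRICES_USD, key=lambda m: blended_price(*PRICES_USD[m], output_share))


for share in (0.05, 0.5, 0.9):
    print(share, cheapest(share))
```

Under the assumed rate, Doubao Seed 2.0 Mini comes out cheapest when traffic is input-heavy, while DeepSeek V4 Flash becomes cheaper once output tokens dominate, because its output price is lower. A different exchange rate shifts the crossover point.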

Should teams always pick the cheapest LLM API?

No. Use cheap models for repeatable low-risk work, then compare quality against stronger models for final answers, complex reasoning, and coding agents.
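One way to apply this in practice is cheap-first routing with escalation: send the request to the low-cost model, then retry on a stronger model only when a quality check fails. The following is a minimal sketch under stated assumptions. `call_model` and `is_acceptable` are hypothetical callables supplied by the application, not part of any NextModel or provider API, and the model ids are placeholders.

```python
from typing import Callable


def answer_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],   # hypothetical: (model_id, prompt) -> text
    is_acceptable: Callable[[str], bool],    # hypothetical quality check
    cheap_model: str,
    strong_model: str,
) -> tuple[str, str]:
    """Try the cheap model first; escalate to the stronger model if the check fails."""
    draft = call_model(cheap_model, prompt)
    if is_acceptable(draft):
        return cheap_model, draft
    return strong_model, call_model(strong_model, prompt)


# Stub client and check so the sketch runs without network access.
def fake_call(model_id: str, prompt: str) -> str:
    return "" if model_id == "cheap-model" and "hard" in prompt else f"{model_id}: ok"


def non_empty(text: str) -> bool:
    return bool(text.strip())


print(answer_with_fallback("easy question", fake_call, non_empty, "cheap-model", "strong-model"))
print(answer_with_fallback("hard question", fake_call, non_empty, "cheap-model", "strong-model"))
```

Logging which model produced each final answer gives the escalation rate, which helps show whether the cheap model is actually saving money on a given workload.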