Loading...Working on your request
模型候選名單

適合台灣團隊的低成本 LLM API 模型

依輸入價格、輸出價格、上下文長度、能力、來源與正式環境適配度,為台灣團隊比較低成本 LLM API 模型。

這份候選名單適合什麼用途?

Cheap LLM API selection should start with workload shape, not only the lowest posted rate. For classification, summarization, routing, support drafts, and batch transformations, a lower-cost model can reduce monthly spend without changing your application interface. For final answers, complex reasoning, or coding agents, teams should benchmark a low-cost model against a stronger fallback. NextModel keeps price, context, capability, provider source, and code examples in one place so developers can make that tradeoff before deployment.

來源依據: NextModel curated catalog, provider public pricing, and OpenRouter metadata when available.

综合价格

推薦候選 cheap llm api

先從候選名單開始,再以真實提示詞測試,並在接入正式環境路由前比較月度成本。

DeepSeekCatalog

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

$0.112 / 1M tokensInput$0.224 / 1M tokensOutput1MContext
Best forlow-cost Chinese tasks, long-context summary, batch code assistance
RoutingConfigured
Tool callingJSON modeLong contextReasoningLow cost
OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule
View details
Mistral AICatalog

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

$0.1 / 1M tokensInput$0.3 / 1M tokensOutput128kContext
Best fortranslation, classification, short-form summarization
RoutingConfigured
Tool callingJSON modeStreamingLow costVisionLong context
OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule
View details
OpenRouterCatalog

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

$0.15 / 1M tokensInput$0.6 / 1M tokensOutput128kContext
Best forlow-cost chat, image understanding, classification
RoutingConfigured
Tool callingVisionJSON modeLong contextStreamingLow cost
OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule
View details
MetaCatalog

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

$0.15 / 1M tokensInput$0.6 / 1M tokensOutput1MContext
Best foropen-model workflows, cost-sensitive long context, classification
RoutingConfigured
JSON modeLong contextStreamingLow costTool callingVision
OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule
View details

比較表

按價格、提供者、上下文、能力與來源比較這份候選名單。

當你在縮小正式環境候選名單、建立備援策略或比較模型經濟性時,可使用此視圖。

ModelProviderInputOutputContextCapabilitiesBest forLatencyStatusSource
DeepSeek: DeepSeek V4 Flashdeepseek/deepseek-v4-flashDeepSeek$0.112 / 1M tokens$0.224 / 1M tokens1M
Tool callingJSON modeLong contextReasoning
low-cost Chinese tasks, long-context summary800-2600msCatalogOpenRouter if available
Mistral: Mistral Small 3.2 24Bmistralai/mistral-small-3.2-24b-instructMistral AI$0.1 / 1M tokens$0.3 / 1M tokens128k
Tool callingJSON modeStreamingLow cost
translation, classification700-2300msCatalogOpenRouter if available
OpenAI: GPT-4o-miniopenai/gpt-4o-miniOpenRouter$0.15 / 1M tokens$0.6 / 1M tokens128k
Tool callingVisionJSON modeLong context
low-cost chat, image understanding800-2400msCatalogOpenRouter if available
Meta: Llama 4 Maverickmeta-llama/llama-4-maverickMeta$0.15 / 1M tokens$0.6 / 1M tokens1M
JSON modeLong contextStreamingLow cost
open-model workflows, cost-sensitive long context950-2800msCatalogOpenRouter if available
Google: Gemini 2.5 Flashgoogle/gemini-2.5-flashGoogle$0.3 / 1M tokens$2.50 / 1M tokens1M
Tool callingVisionJSON modeLong context
long-document summarization, image Q&A900-2800msCatalogOpenRouter if available
MoonshotAI: Kimi K2.6moonshotai/kimi-k2.6Moonshot AI$0.73 / 1M tokens$3.49 / 1M tokens262.1k
JSON modeLong contextStreamingTool calling
long Chinese documents, contract review1400-4400msCatalogOpenRouter if available

常見問題

Cheap LLM API 常見問題

What is the cheapest model in this catalog?

The cheapest option depends on currency conversion and output length. Doubao Seed 2.0 Mini is the lowest-cost CNY production entry in this catalog.

Should teams always pick the cheapest LLM API?

No. Use cheap models for repeatable low-risk work, then compare quality against stronger models for final answers, complex reasoning, and coding agents.