Model shortlist

Best low-cost LLM API models for cost-sensitive products

Compare low-cost LLM API models by input price, output price, context, capabilities, source, and production fit.

What is this shortlist for?

Choosing a low-cost LLM API should start from the shape of the workload, not just the lowest listed price. For classification, summarization, routing, support drafts, and batch transformations, a cheaper model can cut monthly spend without changing the application interface. For final answers, complex reasoning, or coding agents, teams should compare a cheap model against a stronger fallback. NextModel keeps price, context, capabilities, provider source, and code examples in one place so they can be reviewed before production.
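To make the point about workload shape concrete, here is a minimal sketch that estimates monthly spend from the per-token prices listed on this page. The two workloads (request volume and tokens per request) are hypothetical and only illustrate the arithmetic; the prices are copied from the listing and may change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, copied from the listing on this page."""
    input_per_m: float
    output_per_m: float


# Listed prices at the time of writing; treat them as a snapshot.
PRICES = {
    "DeepSeek V4 Flash": Price(0.112, 0.224),
    "Mistral Small 3.2 24B": Price(0.10, 0.30),
    "GPT-4o-mini": Price(0.15, 0.60),
}


def monthly_cost(price, requests_per_month, input_tokens, output_tokens):
    """Estimated monthly spend in USD for a fixed per-request token shape."""
    total_in = requests_per_month * input_tokens
    total_out = requests_per_month * output_tokens
    return (total_in * price.input_per_m + total_out * price.output_per_m) / 1_000_000


# Hypothetical workloads: (requests per month, input tokens, output tokens).
WORKLOADS = {
    "classification": (1_000_000, 800, 10),
    "support_draft": (200_000, 1_500, 600),
}

for name, shape in WORKLOADS.items():
    print(name)
    for model, price in PRICES.items():
        print(f"  {model}: ${monthly_cost(price, *shape):,.2f}")
```

Under these assumed shapes, the input-heavy classification workload comes out cheapest on Mistral Small 3.2 24B (about $83 versus about $91.84 for DeepSeek V4 Flash), while the output-heavier support drafts come out cheapest on DeepSeek V4 Flash (about $60.48 versus $66). The ranking changes with the token mix, which is why the lowest listed input price is not enough on its own.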

Source basis: the curated NextModel catalog, public provider pricing, and OpenRouter metadata when available.

Recommended low-cost LLM API candidates

Start from the shortlist, test real prompts, and compare monthly cost before routing production traffic.

DeepSeek (Catalog)

DeepSeek: DeepSeek V4 Flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Input: $0.112 / 1M tokens | Output: $0.224 / 1M tokens | Context: 1M

Best for: low-cost Chinese tasks, long-context summary, batch code assistance

Routing: Configured

Capabilities: Tool calling, JSON mode, Long context, Reasoning, Low cost

Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

Mistral AI (Catalog)

Mistral: Mistral Small 3.2 24B

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

Input: $0.1 / 1M tokens | Output: $0.3 / 1M tokens | Context: 128k

Best for: translation, classification, short-form summarization

Routing: Configured

Capabilities: Tool calling, JSON mode, Streaming, Low cost, Vision, Long context

Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

OpenRouter (Catalog)

OpenAI: GPT-4o-mini

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

Input: $0.15 / 1M tokens | Output: $0.6 / 1M tokens | Context: 128k

Best for: low-cost chat, image understanding, classification

Routing: Configured

Capabilities: Tool calling, Vision, JSON mode, Long context, Streaming, Low cost

Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

Meta (Catalog)

Meta: Llama 4 Maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Input: $0.15 / 1M tokens | Output: $0.6 / 1M tokens | Context: 1M

Best for: open-model workflows, cost-sensitive long context, classification

Routing: Configured

Capabilities: JSON mode, Long context, Streaming, Low cost, Tool calling, Vision

Source: OpenRouter if available (OpenRouter public Models API live metadata; public price comes from the registry pricing rule)

Comparison table

Compare the shortlist by price, provider, context, capabilities, and source.

Use this view when narrowing a production shortlist, building a fallback policy, or comparing model economics.

Model	Model ID	Provider	Input	Output	Context	Capabilities	Best for	Latency	Status	Source
DeepSeek: DeepSeek V4 Flash	deepseek/deepseek-v4-flash	DeepSeek	$0.112 / 1M tokens	$0.224 / 1M tokens	1M	Tool calling, JSON mode, Long context, Reasoning	low-cost Chinese tasks, long-context summary	800-2600 ms	Catalog	OpenRouter if available
Mistral: Mistral Small 3.2 24B	mistralai/mistral-small-3.2-24b-instruct	Mistral AI	$0.1 / 1M tokens	$0.3 / 1M tokens	128k	Tool calling, JSON mode, Streaming, Low cost	translation, classification	700-2300 ms	Catalog	OpenRouter if available
OpenAI: GPT-4o-mini	openai/gpt-4o-mini	OpenRouter	$0.15 / 1M tokens	$0.6 / 1M tokens	128k	Tool calling, Vision, JSON mode, Long context	low-cost chat, image understanding	800-2400 ms	Catalog	OpenRouter if available
Meta: Llama 4 Maverick	meta-llama/llama-4-maverick	Meta	$0.15 / 1M tokens	$0.6 / 1M tokens	1M	JSON mode, Long context, Streaming, Low cost	open-model workflows, cost-sensitive long context	950-2800 ms	Catalog	OpenRouter if available
Google: Gemini 2.5 Flash	google/gemini-2.5-flash	Google	$0.3 / 1M tokens	$2.50 / 1M tokens	1M	Tool calling, Vision, JSON mode, Long context	long-document summarization, image Q&A	900-2800 ms	Catalog	OpenRouter if available
MoonshotAI: Kimi K2.6	moonshotai/kimi-k2.6	Moonshot AI	$0.73 / 1M tokens	$3.49 / 1M tokens	262.1k	JSON mode, Long context, Streaming, Tool calling	long Chinese documents, contract review	1400-4400 ms	Catalog	OpenRouter if available
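One way to narrow the table programmatically is to filter on hard requirements first and rank the survivors by price. The sketch below is illustrative only: the `Model` record, the `shortlist` helper, and the blended-price weighting (25% output tokens by default) are assumptions, not part of NextModel. Context sizes are rounded from the table (1M as 1,000,000; 128k as 128,000; 262.1k as 262,100), and capabilities come from the cards where a model has one, otherwise from the table row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    model_id: str
    input_usd: float       # USD per 1M input tokens
    output_usd: float      # USD per 1M output tokens
    context_tokens: int    # rounded from the listing
    capabilities: frozenset


ROWS = [
    Model("deepseek/deepseek-v4-flash", 0.112, 0.224, 1_000_000,
          frozenset({"tool calling", "json mode", "long context", "reasoning"})),
    Model("mistralai/mistral-small-3.2-24b-instruct", 0.10, 0.30, 128_000,
          frozenset({"tool calling", "json mode", "streaming", "vision", "long context"})),
    Model("openai/gpt-4o-mini", 0.15, 0.60, 128_000,
          frozenset({"tool calling", "vision", "json mode", "long context", "streaming"})),
    Model("meta-llama/llama-4-maverick", 0.15, 0.60, 1_000_000,
          frozenset({"json mode", "long context", "streaming", "tool calling", "vision"})),
    Model("google/gemini-2.5-flash", 0.30, 2.50, 1_000_000,
          frozenset({"tool calling", "vision", "json mode", "long context"})),
    Model("moonshotai/kimi-k2.6", 0.73, 3.49, 262_100,
          frozenset({"json mode", "long context", "streaming", "tool calling"})),
]


def blended_price(m, output_share=0.25):
    """Weighted USD price per 1M tokens for an assumed share of output tokens."""
    return (1 - output_share) * m.input_usd + output_share * m.output_usd


def shortlist(rows, min_context, required, output_share=0.25):
    """Keep models that meet the hard requirements, cheapest blended price first."""
    needed = frozenset(required)
    fits = [m for m in rows
            if m.context_tokens >= min_context and needed <= m.capabilities]
    return sorted(fits, key=lambda m: blended_price(m, output_share))


for m in shortlist(ROWS, min_context=200_000, required={"tool calling", "vision"}):
    print(f"{m.model_id}: ${blended_price(m):.4f} blended / 1M tokens")
```

Filtering before ranking keeps a cheap model from winning a comparison it cannot actually serve, for example a 128k-context model on a workload that needs more than 200k tokens.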

FAQ

Low-cost LLM API FAQ

Which model is the cheapest in this catalog?

It depends on the exchange rate and the output length. Doubao Seed 2.0 Mini remains the cheapest production option priced in CNY in this catalog.
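The dependence on output length can be made explicit by solving for the output-to-input token ratio at which two models cost the same. The helper below is a hypothetical sketch, not a NextModel function; it assumes both models are priced in the same currency, so CNY prices would first need converting at a rate the caller supplies.

```python
def break_even_ratio(a_in, a_out, b_in, b_out):
    """Output/input token ratio at which models A and B cost the same.

    Solves a_in + a_out * r == b_in + b_out * r for r.
    Returns None when no positive ratio exists, i.e. one model is
    at least as cheap as the other on both prices.
    """
    denom = a_out - b_out
    if denom == 0:
        return None
    ratio = (b_in - a_in) / denom
    return ratio if ratio > 0 else None


# Listed USD prices per 1M tokens:
# A = Mistral Small 3.2 24B, B = DeepSeek V4 Flash.
r = break_even_ratio(0.10, 0.30, 0.112, 0.224)
print(f"break-even output/input ratio: {r:.3f}")
```

With the listed prices the ratio is about 0.158: when output is below roughly 16% of input tokens, Mistral Small 3.2 24B is the cheaper of the two, and above that DeepSeek V4 Flash is.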

Should teams always pick the cheapest LLM API?

No. Cheap models work well for repetitive, low-risk work; for final answers, complex reasoning, and coding agents they should be compared against stronger models.
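A common way to act on this is a cheap-first policy with a stronger fallback. The sketch below is a minimal illustration under stated assumptions: `cheap` and `strong` are hypothetical callables standing in for real API clients, and the acceptance check (the answer must parse as a JSON object) is just one example of a validation rule.

```python
import json
from typing import Callable, Tuple


def is_valid_json_object(text: str) -> bool:
    """Example acceptance check: the answer must parse as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def route_with_fallback(
    prompt: str,
    cheap: Callable[[str], str],
    strong: Callable[[str], str],
    accept: Callable[[str], bool],
) -> Tuple[str, str]:
    """Return (tier, answer); the strong model is called only if the cheap answer is rejected."""
    answer = cheap(prompt)
    if accept(answer):
        return "cheap", answer
    return "strong", strong(prompt)


# Stand-in models for illustration; replace them with real client calls.
def fake_cheap(prompt: str) -> str:
    return "label: spam"          # not valid JSON, so it is rejected


def fake_strong(prompt: str) -> str:
    return '{"label": "spam"}'


print(route_with_fallback("Classify this email", fake_cheap, fake_strong, is_valid_json_object))
```

Logging which tier answered each request gives the escalation rate, which is the number needed to compare the real monthly cost of this policy against using the stronger model for everything.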
