Meta model

Llama 4 Maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Meta, OpenRouter if available, Catalog
JSON mode, Long context, Streaming, Low cost, Vision
Input price: $0.15 / 1M tokens
Output price: $0.60 / 1M tokens
Context length: 1M tokens
Availability: Catalog
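
To make the per-token prices concrete, here is a minimal sketch that estimates the cost of one request from the listed rates. The function name and the token counts are hypothetical, chosen only for illustration, and the estimate covers nothing beyond the two listed prices.

```python
# Listed prices for Llama 4 Maverick on this page, in USD per 1M tokens.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 0.60


def estimate_cost_usd(input_tokens, output_tokens,
                      input_price=INPUT_USD_PER_M,
                      output_price=OUTPUT_USD_PER_M):
    """Estimate request cost in USD from token counts and per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: a 200k-token document summarized into 2k tokens.
cost = estimate_cost_usd(200_000, 2_000)
print(f"${cost:.4f}")  # prints $0.0312
```

Passing different `input_price` and `output_price` values gives the same estimate for any other model in the catalog.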

Best use cases

  • open-model workflows
  • cost-sensitive long context
  • classification

OpenAI-compatible code example

Keep your existing OpenAI SDK code, point base_url at the NextModel endpoint, and pass the catalog model ID llama-4-maverick as the model.

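Python
from openai import OpenAI

# Point the standard OpenAI client at the NextModel endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.nextmodel.app/v1"
)

# Request a chat completion using the catalog model ID.
resp = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[{"role": "user", "content": "Hello from NextModel"}]
)

# Print the text of the first (and here only) choice.
print(resp.choices[0].message.content)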
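
Because the model is tagged with Streaming, a streamed variant may be useful for long replies. The sketch below uses the OpenAI SDK's standard `stream=True` flag. That the NextModel endpoint honors this flag is an assumption based on the tag, not something this page confirms, and `stream_reply` is a hypothetical helper name.

```python
def stream_reply(client, prompt, model="llama-4-maverick"):
    """Print reply fragments as they arrive and return the full text.

    `client` is expected to be an OpenAI SDK client (or a compatible object).
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # standard OpenAI SDK flag; NextModel support is assumed
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)
```

With the `client` configured as in the example above, a call would look like `stream_reply(client, "Hello from NextModel")`.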

Similar alternatives

Google, Catalog

Gemini 2.5 Flash is a lower-cost long-context and vision candidate for teams that need multimodal coverage without always using a premium model.

Input: $0.30 / 1M tokens
Output: $2.50 / 1M tokens
Context: 1M
Best for: long-document summarization, image Q&A, fast multimodal routing
Routing: configured
Tool calling, Vision, JSON mode, Long context
OpenRouter if available: OpenRouter public Models API when available; curated fallback otherwise

OpenRouter, Catalog

GPT-4o mini is a mature low-cost multimodal option for teams that already use OpenAI-compatible SDKs and need a balanced default model for product workflows.

Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens
Context: 128k
Best for: low-cost chat, image understanding, classification
Routing: configured
Tool calling, Vision, JSON mode, Long context
OpenRouter if available: OpenRouter public Models API when available; curated fallback otherwise

Moonshot AI, Catalog

Kimi K2.6 is a long-context Chinese model candidate for document-heavy teams comparing cost, context length, and domestic model coverage.

Input: $0.73 / 1M tokens
Output: $3.49 / 1M tokens
Context: 1M
Best for: long Chinese documents, contract review, knowledge-base Q&A
Routing: configured
JSON mode, Long context, Streaming
OpenRouter if available: OpenRouter public Models API when available; curated fallback otherwise

FAQ

Meta: Llama 4 Maverick API questions

Why include Llama 4 Maverick?

It gives teams an open-model candidate when comparing cost, context length, and provider optionality.