/ models

42 models,
one endpoint.

13 curated records are wired into this public preview, with source labels for platform curated, provider public pricing, and OpenRouter-backed metadata when available.

Models API Estimate cost

13 of 13 models

Model cards with source labels and copyable OpenAI-compatible calls.

AnthropicCatalog

Anthropic: Claude Opus 4.7

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...

$5 / 1M tokensInput$25 / 1M tokensOutput1MContext

Best forfrontier reasoning, large codebase review, strategy analysis

Routingconfigured

Tool callingJSON modeLong contextReasoningStreamingVision

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

AnthropicCatalog

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with...

$3 / 1M tokensInput$15 / 1M tokensOutput1MContext

Best forcoding agents, code review, complex writing

Routingconfigured

Tool callingJSON modeLong contextReasoningStreamingVision

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

GoogleCatalog

Google: Gemini 2.5 Pro

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

$1.25 / 1M tokensInput$10 / 1M tokensOutput1MContext

Best forlong-context analysis, vision workflows, scientific reasoning

Routingconfigured

Tool callingVisionJSON modeLong contextReasoningStreaming

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

DeepSeekCatalog

DeepSeek: R1

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass....

$0.7 / 1M tokensInput$2.50 / 1M tokensOutput163.8kContext

Best forChinese reasoning, math, analysis

Routingconfigured

JSON modeLong contextReasoningStreamingTool calling

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

VolcengineProduction

Doubao Seed 2.0 Mini

Doubao Seed 2.0 Mini is the lowest-cost production model currently exposed through the NextModel public gateway. It is a practical default for Chinese Q&A, classification, summarization, and lightweight multimodal tasks.

¥0.2 / 1M tokensInput¥2 / 1M tokensOutput128kContext

Best forChinese Q&A, low-cost general chat, multimodal understanding

Routingconfigured

Tool callingVisionJSON modeLong contextStreamingLow cost

Platform curatedNextModel production gateway and Volcengine pricing config

View details

DeepSeekCatalog

DeepSeek: DeepSeek V4 Flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

$0.112 / 1M tokensInput$0.224 / 1M tokensOutput1MContext

Best forlow-cost Chinese tasks, long-context summary, batch code assistance

Routingconfigured

Tool callingJSON modeLong contextReasoningLow cost

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Alibaba Cloud / QwenCatalog

Qwen: Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...

$0.65 / 1M tokensInput$3.25 / 1M tokensOutput1MContext

Best forChinese engineering workflows, code generation, codebase Q&A

Routingconfigured

Tool callingJSON modeLong contextStreaming

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

GoogleCatalog

Google: Gemini 2.5 Flash

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

$0.3 / 1M tokensInput$2.50 / 1M tokensOutput1MContext

Best forlong-document summarization, image Q&A, fast multimodal routing

Routingconfigured

Tool callingVisionJSON modeLong contextStreamingLow cost

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Alibaba Cloud / QwenCatalog

Qwen: Qwen3 Max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It...

$0.78 / 1M tokensInput$3.90 / 1M tokensOutput262.1kContext

Best forChinese agent workflows, business analysis, structured output

Routingconfigured

Tool callingJSON modeLong contextReasoningStreaming

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

OpenRouterCatalog

OpenAI: GPT-4o-mini

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

$0.15 / 1M tokensInput$0.6 / 1M tokensOutput128kContext

Best forlow-cost chat, image understanding, classification

Routingconfigured

Tool callingVisionJSON modeLong contextStreamingLow cost

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Moonshot AICatalog

MoonshotAI: Kimi K2.6

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...

$0.73 / 1M tokensInput$3.49 / 1M tokensOutput262.1kContext

Best forlong Chinese documents, contract review, knowledge-base Q&A

Routingconfigured

JSON modeLong contextStreamingTool callingVision

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

MetaCatalog

Meta: Llama 4 Maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

$0.15 / 1M tokensInput$0.6 / 1M tokensOutput1MContext

Best foropen-model workflows, cost-sensitive long context, classification

Routingconfigured

JSON modeLong contextStreamingLow costVision

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Mistral AICatalog

Mistral: Mistral Small 3.2 24B

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

$0.1 / 1M tokensInput$0.3 / 1M tokensOutput128kContext

Best fortranslation, classification, short-form summarization

Routingconfigured

Tool callingJSON modeStreamingLow costVisionLong context

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Decision table

Compare price, context, capabilities, status, and source in one scan.

Use the table when you are narrowing a shortlist for production tests, cost estimates, or provider policy decisions.

Model	Provider	Input	Output	Context	Capabilities	Best for	Latency	Status	Source
Anthropic: Claude Opus 4.7anthropic/claude-opus-4.7	Anthropic	$5 / 1M tokens	$25 / 1M tokens	1M	Tool callingJSON modeLong contextReasoning	frontier reasoning, large codebase review	2300-6800ms	Catalog	OpenRouter if available
Anthropic: Claude Sonnet 4.5anthropic/claude-sonnet-4.5	Anthropic	$3 / 1M tokens	$15 / 1M tokens	1M	Tool callingJSON modeLong contextReasoning	coding agents, code review	1600-4800ms	Catalog	OpenRouter if available
Google: Gemini 2.5 Progoogle/gemini-2.5-pro	Google	$1.25 / 1M tokens	$10 / 1M tokens	1M	Tool callingVisionJSON modeLong context	long-context analysis, vision workflows	1500-5000ms	Catalog	OpenRouter if available
DeepSeek: R1deepseek/deepseek-r1	DeepSeek	$0.7 / 1M tokens	$2.50 / 1M tokens	163.8k	JSON modeLong contextReasoningStreaming	Chinese reasoning, math	1800-6000ms	Catalog	OpenRouter if available
Doubao Seed 2.0 Minidoubao-seed-2-0-mini	Volcengine	¥0.2 / 1M tokens	¥2 / 1M tokens	128k	Tool callingVisionJSON modeLong context	Chinese Q&A, low-cost general chat	900-2600ms	Production	Platform curated
DeepSeek: DeepSeek V4 Flashdeepseek/deepseek-v4-flash	DeepSeek	$0.112 / 1M tokens	$0.224 / 1M tokens	1M	Tool callingJSON modeLong contextReasoning	low-cost Chinese tasks, long-context summary	800-2600ms	Catalog	OpenRouter if available
Qwen: Qwen3 Coder Plusqwen/qwen3-coder-plus	Alibaba Cloud / Qwen	$0.65 / 1M tokens	$3.25 / 1M tokens	1M	Tool callingJSON modeLong contextStreaming	Chinese engineering workflows, code generation	1200-3900ms	Catalog	OpenRouter if available
Google: Gemini 2.5 Flashgoogle/gemini-2.5-flash	Google	$0.3 / 1M tokens	$2.50 / 1M tokens	1M	Tool callingVisionJSON modeLong context	long-document summarization, image Q&A	900-2800ms	Catalog	OpenRouter if available
Qwen: Qwen3 Maxqwen/qwen3-max	Alibaba Cloud / Qwen	$0.78 / 1M tokens	$3.90 / 1M tokens	262.1k	Tool callingJSON modeLong contextReasoning	Chinese agent workflows, business analysis	1300-4200ms	Catalog	OpenRouter if available
OpenAI: GPT-4o-miniopenai/gpt-4o-mini	OpenRouter	$0.15 / 1M tokens	$0.6 / 1M tokens	128k	Tool callingVisionJSON modeLong context	low-cost chat, image understanding	800-2400ms	Catalog	OpenRouter if available
MoonshotAI: Kimi K2.6moonshotai/kimi-k2.6	Moonshot AI	$0.73 / 1M tokens	$3.49 / 1M tokens	262.1k	JSON modeLong contextStreamingTool calling	long Chinese documents, contract review	1400-4400ms	Catalog	OpenRouter if available
Meta: Llama 4 Maverickmeta-llama/llama-4-maverick	Meta	$0.15 / 1M tokens	$0.6 / 1M tokens	1M	JSON modeLong contextStreamingLow cost	open-model workflows, cost-sensitive long context	950-2800ms	Catalog	OpenRouter if available
Mistral: Mistral Small 3.2 24Bmistralai/mistral-small-3.2-24b-instruct	Mistral AI	$0.1 / 1M tokens	$0.3 / 1M tokens	128k	Tool callingJSON modeStreamingLow cost	translation, classification	700-2300ms	Catalog	OpenRouter if available

42 models,one endpoint.

Model cards with source labels and copyable OpenAI-compatible calls.

Compare price, context, capabilities, status, and source in one scan.

42 models,
one endpoint.