Model shortlist

Best agent model APIs for tool-calling workflows

Compare model APIs for agent workflows that need tool calling, JSON mode, long context, and budget policies.

What is this shortlist for?

Agent workflows are output-heavy and can become expensive quickly. Teams should compare tool calling, JSON support, context length, latency, and output price before routing agent tasks to a model.

Source basis: NextModel capability mapping and supported-parameter metadata when available. · Updated 2026-07-01

Fit score

Recommended candidates agent models

Start with the shortlist, then test real prompts and compare monthly cost before production routing.

AnthropicCatalog

Anthropic: Claude Opus 4.7

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the coding and agentic strengths of Opus 4.6, it delivers stronger performance on...

$5 / 1M tokensInput$25 / 1M tokensOutput1MContext

Best forfrontier reasoning, large codebase review, strategy analysis

RoutingConfigured

Tool callingJSON modeLong contextReasoningStreamingVision

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

AnthropicCatalog

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with...

$3 / 1M tokensInput$15 / 1M tokensOutput1MContext

Best forcoding agents, code review, complex writing

RoutingConfigured

Tool callingJSON modeLong contextReasoningStreamingVision

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

GoogleCatalog

Google: Gemini 2.5 Pro

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

$1.25 / 1M tokensInput$10 / 1M tokensOutput1MContext

Best forlong-context analysis, vision workflows, scientific reasoning

RoutingConfigured

Tool callingVisionJSON modeLong contextReasoningStreaming

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Alibaba Cloud / QwenCatalog

Qwen: Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...

$0.65 / 1M tokensInput$3.25 / 1M tokensOutput1MContext

Best forChinese engineering workflows, code generation, codebase Q&A

RoutingConfigured

Tool callingJSON modeLong contextStreaming

OpenRouter if availableOpenRouter public Models API live metadata; public price comes from the registry pricing rule

View details

Comparison table

Compare the shortlist by price, provider, context, capability, and source.

Use this view when you're narrowing a production shortlist, building a fallback policy, or comparing model economics.

Model	Provider	Input	Output	Context	Capabilities	Best for	Latency	Status	Source
Anthropic: Claude Opus 4.7anthropic/claude-opus-4.7	Anthropic	$5 / 1M tokens	$25 / 1M tokens	1M	Tool callingJSON modeLong contextReasoning	frontier reasoning, large codebase review	2300-6800ms	Catalog	OpenRouter if available
Anthropic: Claude Sonnet 4.5anthropic/claude-sonnet-4.5	Anthropic	$3 / 1M tokens	$15 / 1M tokens	1M	Tool callingJSON modeLong contextReasoning	coding agents, code review	1600-4800ms	Catalog	OpenRouter if available
Google: Gemini 2.5 Progoogle/gemini-2.5-pro	Google	$1.25 / 1M tokens	$10 / 1M tokens	1M	Tool callingVisionJSON modeLong context	long-context analysis, vision workflows	1500-5000ms	Catalog	OpenRouter if available
Qwen: Qwen3 Coder Plusqwen/qwen3-coder-plus	Alibaba Cloud / Qwen	$0.65 / 1M tokens	$3.25 / 1M tokens	1M	Tool callingJSON modeLong contextStreaming	Chinese engineering workflows, code generation	1200-3900ms	Catalog	OpenRouter if available
DeepSeek V4 Prodeepseek-v4-pro	DeepSeek	$1.74 / 1M tokens	$3.47 / 1M tokens	128k	Tool callingJSON modeStreamingReasoning	complex reasoning, agentic coding	1100-3400ms	Production	Platform curated
Qwen: Qwen3 Maxqwen/qwen3-max	Alibaba Cloud / Qwen	$0.78 / 1M tokens	$3.90 / 1M tokens	262.1k	Tool callingJSON modeLong contextReasoning	Chinese agent workflows, business analysis	1300-4200ms	Catalog	OpenRouter if available
Doubao Seed 2.0 Prodoubao-seed-2-0-pro	Volcengine	$0.463 / 1M tokens	$2.31 / 1M tokens	256k	Tool callingVisionJSON modeLong context	general-purpose reasoning, multimodal analysis	1000-3200ms	Production	Platform curated
Doubao Seed 2.0 Codedoubao-seed-2-0-code	Volcengine	$0.463 / 1M tokens	$2.31 / 1M tokens	256k	Tool callingJSON modeLong contextStreaming	agentic coding, repository-scale refactors	1000-3200ms	Production	Platform curated

FAQ

Agent models FAQ

Which capabilities matter most for agent models?

Tool calling, structured JSON output, long context, and reliable instruction following matter most.

All models Pricing calculator OpenAI-compatible quickstart