
Model rankings

A short list of candidates, with source labels and cost context, for product and platform teams that need to choose a model quickly.

How should you use these AI model rankings?

This is a shortlist for decision-making, not an absolute leaderboard. Each page groups models by practical workload and adds cost and source information.

Blended price

Low-cost LLM API models for cost-sensitive products

Compare low-cost LLM API models by input price, output price, context length, features, source, and operational fit.
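The page does not say how the blended price is derived. A common convention is a weighted average of the input and output prices per 1M tokens. The sketch below assumes a 3:1 input-to-output weighting; the function name `blended_price`, the weighting, and the example prices are all hypothetical and are not taken from this page or from any provider's price list.

```python
def blended_price(
    input_usd_per_1m: float,
    output_usd_per_1m: float,
    input_share: float = 0.75,  # assumption: 3 input tokens per 1 output token
) -> float:
    """Return a weighted average of input and output price per 1M tokens."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return (
        input_share * input_usd_per_1m
        + (1.0 - input_share) * output_usd_per_1m
    )


# Hypothetical prices, for illustration only.
example = blended_price(input_usd_per_1m=0.20, output_usd_per_1m=0.80)
print(f"USD {example:.3f}/1M")
```

A workload that generates long outputs would use a lower `input_share`, which moves the blended figure toward the output price and can change the order of the candidates.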


DeepSeek: DeepSeek V4 Flash

low-cost Chinese tasks, long-context summary, batch code assistance

DeepSeek | Blended price: USD 0.336/1M

Mistral: Mistral Small 3.2 24B

translation, classification, short-form summarization

Mistral AI | Blended price: USD 0.400/1M

OpenAI: GPT-4o-mini

low-cost chat, image understanding, classification

OpenRouter | Blended price: USD 0.750/1M

Meta: Llama 4 Maverick

open-model workflows, cost-sensitive long context, classification

Meta | Blended price: USD 0.750/1M

Google: Gemini 2.5 Flash

long-document summarization, image Q&A, fast multimodal routing

Google | Blended price: USD 2.800/1M

MoonshotAI: Kimi K2.6

long Chinese documents, contract review, knowledge-base Q&A

Moonshot AI | Blended price: USD 4.220/1M

Fit score

Best Chinese LLM API models for developer teams

Compare Chinese-language LLM API candidates across domestic and global providers, including pricing, context, latency estimates, and best use cases.


DeepSeek: R1

Chinese reasoning, math, analysis

DeepSeek | Fit score: 89/100

DeepSeek: DeepSeek V4 Flash

low-cost Chinese tasks, long-context summary, batch code assistance

DeepSeek | Fit score: 88/100

Qwen: Qwen3 Coder Plus

Chinese engineering workflows, code generation, codebase Q&A

Alibaba Cloud / Qwen | Fit score: 87/100

Qwen: Qwen3 Max

Chinese agent workflows, business analysis, structured output

Alibaba Cloud / Qwen | Fit score: 86/100

MoonshotAI: Kimi K2.6

long Chinese documents, contract review, knowledge-base Q&A

Moonshot AI | Fit score: 84/100

Fit score

Coding model APIs for agents and code review

Compare coding model APIs by context length, tool support, JSON output, latency, price, and recommended role.


Anthropic: Claude Opus 4.7

frontier reasoning, large codebase review, strategy analysis

Anthropic | Fit score: 96/100

Anthropic: Claude Sonnet 4.5

coding agents, code review, complex writing

Anthropic | Fit score: 93/100

DeepSeek: R1

Chinese reasoning, math, analysis

DeepSeek | Fit score: 89/100

Doubao Seed 2.0 Mini

Coding

Volcengine | Fit score: 88/100

DeepSeek: DeepSeek V4 Flash

low-cost Chinese tasks, long-context summary, batch code assistance

DeepSeek | Fit score: 88/100

Qwen: Qwen3 Coder Plus

Chinese engineering workflows, code generation, codebase Q&A

Alibaba Cloud / Qwen | Fit score: 87/100

Fit score

Best vision model APIs for image understanding

Compare vision-capable model APIs for image understanding, document screenshots, multimodal support workflows, and cost-sensitive routing.


Anthropic: Claude Opus 4.7

frontier reasoning, large codebase review, strategy analysis

Anthropic | Fit score: 96/100

Anthropic: Claude Sonnet 4.5

coding agents, code review, complex writing

Anthropic | Fit score: 93/100

Google: Gemini 2.5 Pro

long-context analysis, vision workflows, scientific reasoning

Google | Fit score: 91/100

Google: Gemini 2.5 Flash

long-document summarization, image Q&A, fast multimodal routing

Google | Fit score: 86/100

OpenAI: GPT-4o-mini

low-cost chat, image understanding, classification

OpenRouter | Fit score: 84/100

MoonshotAI: Kimi K2.6

long Chinese documents, contract review, knowledge-base Q&A

Moonshot AI | Fit score: 84/100

Catalog activity

OpenRouter alternatives for teams that need cost control

Compare OpenRouter-style multi-model access with cost governance, domestic provider coverage, BYOK, budget controls, and team usage reporting.


OpenAI: GPT-4o-mini

low-cost chat, image understanding, classification

OpenRouter | Catalog activity: 93/100

Anthropic: Claude Sonnet 4.5

coding agents, code review, complex writing

Anthropic | Catalog activity: 92/100

Anthropic: Claude Opus 4.7

frontier reasoning, large codebase review, strategy analysis

Anthropic | Catalog activity: 90/100

DeepSeek: R1

Chinese reasoning, math, analysis

DeepSeek | Catalog activity: 89/100

Google: Gemini 2.5 Pro

long-context analysis, vision workflows, scientific reasoning

Google | Catalog activity: 88/100

Google: Gemini 2.5 Flash

long-document summarization, image Q&A, fast multimodal routing

Google | Catalog activity: 86/100

Context

Best long-context model APIs for large documents

Compare long-context model APIs by context window, price, model source, and recommended document-heavy use cases.


Google: Gemini 2.5 Pro

long-context analysis, vision workflows, scientific reasoning

Google | Context: 1049k tokens

DeepSeek: DeepSeek V4 Flash

low-cost Chinese tasks, long-context summary, batch code assistance

DeepSeek | Context: 1049k tokens

Google: Gemini 2.5 Flash

long-document summarization, image Q&A, fast multimodal routing

Google | Context: 1049k tokens

Meta: Llama 4 Maverick

open-model workflows, cost-sensitive long context, classification

Meta | Context: 1049k tokens

Anthropic: Claude Opus 4.7

frontier reasoning, large codebase review, strategy analysis

Anthropic | Context: 1000k tokens

Anthropic: Claude Sonnet 4.5

coding agents, code review, complex writing

Anthropic | Context: 1000k tokens

Fit score

Best agent model APIs for tool-calling workflows

Compare model APIs for agent workflows that need tool calling, JSON mode, long context, and budget policies.


Anthropic: Claude Opus 4.7

frontier reasoning, large codebase review, strategy analysis

Anthropic | Fit score: 96/100

Anthropic: Claude Sonnet 4.5

coding agents, code review, complex writing

Anthropic | Fit score: 93/100

Google: Gemini 2.5 Pro

long-context analysis, vision workflows, scientific reasoning

Google | Fit score: 91/100

Qwen: Qwen3 Coder Plus

Chinese engineering workflows, code generation, codebase Q&A

Alibaba Cloud / Qwen | Fit score: 87/100

Qwen: Qwen3 Max

Chinese agent workflows, business analysis, structured output

Alibaba Cloud / Qwen | Fit score: 86/100

OpenAI: GPT-4o-mini

low-cost chat, image understanding, classification

OpenRouter | Fit score: 84/100