Updated February 2026 — 40+ models

AI API Cost Comparison Calculator

Compare pricing across 40+ AI models from OpenAI, Anthropic, Google, Meta, Mistral, and more. Adjust your usage below to estimate monthly costs instantly.

Potential Savings

Switching from GPT-5.2 Pro to Mistral Nemo saves $314.88/month at the calculator's default usage (1K input / 500 output tokens per request, 100 requests/day).
Defaults: 77 models · 1K input / 500 output tokens per request · 100 requests/day

| Model | Provider | Tier | Est. Monthly | Input $/1M | Output $/1M | Context | Latency |
|---|---|---|---|---|---|---|---|
| Mistral Nemo (Best Value) | Mistral | budget | $0.120 | $0.02 | $0.04 | 128K | fast |
| Llama 3.2 3B | Meta | budget | $0.270 | $0.06 | $0.06 | 131K | fast |
| GPT-OSS 20B | OpenAI | budget | $0.300 | $0.03 | $0.14 | 128K | fast |
| Amazon Nova Micro | Amazon | budget | $0.315 | $0.04 | $0.14 | 128K | fast |
| Command R7B | Cohere | budget | $0.337 | $0.04 | $0.15 | 128K | fast |
| GPT-OSS 120B | OpenAI | mid | $0.402 | $0.04 | $0.19 | 128K | medium |
| Mistral Small 3.2 | Mistral | budget | $0.450 | $0.06 | $0.18 | 32K | fast |
| Ministral 3 3B | Mistral | budget | $0.450 | $0.10 | $0.10 | 128K | fast |
| Amazon Nova Lite | Amazon | budget | $0.540 | $0.06 | $0.24 | 300K | fast |
| Gemini 2.0 Flash-Lite | Google | budget | $0.675 | $0.07 | $0.30 | 1.0M | fast |
| Ministral 3 8B | Mistral | budget | $0.675 | $0.15 | $0.15 | 128K | fast |
| GPT-5 Nano | OpenAI | budget | $0.750 | $0.05 | $0.40 | 128K | fast |
| Llama 3.2 90B Vision | Meta | mid | $0.810 | $0.18 | $0.18 | 131K | fast |
| Llama 3.2 11B Vision | Meta | budget | $0.810 | $0.18 | $0.18 | 131K | fast |
| GPT-4.1 Nano | OpenAI | budget | $0.900 | $0.10 | $0.40 | 1.0M | fast |
| Gemini 2.5 Flash-Lite | Google | budget | $0.900 | $0.10 | $0.40 | 1.0M | fast |
| Gemini 2.0 Flash | Google | budget | $0.900 | $0.10 | $0.40 | 1.0M | fast |
| Ministral 3 14B | Mistral | budget | $0.900 | $0.20 | $0.20 | 128K | fast |
| GPT-4o Mini | OpenAI | budget | $1.35 | $0.15 | $0.60 | 128K | fast |
| Command R | Cohere | mid | $1.35 | $0.15 | $0.60 | 128K | fast |
| Grok 4 Fast | xAI | budget | $1.35 | $0.20 | $0.50 | 2.0M | fast |
| Grok 4.1 Fast | xAI | budget | $1.35 | $0.20 | $0.50 | 2.0M | fast |
| Llama 4 Scout | Meta | mid | $1.43 | $0.18 | $0.59 | 328K | fast |
| DeepSeek V3.2 | DeepSeek | mid | $1.47 | $0.28 | $0.42 | 128K | fast |
| Qwen 3 235B-A22B | Together AI | mid | $1.50 | $0.20 | $0.60 | 262K | fast |
| DeepSeek V3.1 | DeepSeek | mid | $1.57 | $0.15 | $0.75 | 128K | fast |
| Grok 3 Mini | xAI | mid | $1.65 | $0.30 | $0.50 | 131K | fast |
| Llama 4 Maverick | Meta | flagship | $2.08 | $0.27 | $0.85 | 1.0M | fast |
| Codestral | Mistral | mid | $2.25 | $0.30 | $0.90 | 256K | fast |
| DeepSeek V3 | DeepSeek | mid | $2.29 | $0.32 | $0.89 | 128K | fast |
| Claude 3 Haiku | Anthropic | budget | $2.63 | $0.25 | $1.25 | 200K | fast |
| Grok Code Fast 1 | xAI | mid | $2.85 | $0.20 | $1.50 | 256K | fast |
| Qwen 3 Coder Next | Together AI | mid | $3.30 | $0.50 | $1.20 | 262K | fast |
| GPT-4.1 Mini | OpenAI | mid | $3.60 | $0.40 | $1.60 | 1.0M | fast |
| GPT-5 Mini | OpenAI | mid | $3.75 | $0.25 | $2.00 | 128K | fast |
| GPT-5.1 Codex Mini | OpenAI | mid | $3.75 | $0.25 | $2.00 | 128K | fast |
| Mistral Large 3 (2512) | Mistral | mid | $3.75 | $0.50 | $1.50 | 128K | fast |
| Llama 3.3 70B | Meta | mid | $3.96 | $0.88 | $0.88 | 128K | fast |
| Mistral Medium 3.1 | Mistral | flagship | $4.20 | $0.40 | $2.00 | 131K | fast |
| Devstral 2 (2512) | Mistral | mid | $4.20 | $0.40 | $2.00 | 131K | fast |
| Gemini 2.5 Flash | Google | mid | $4.65 | $0.30 | $2.50 | 1.0M | fast |
| Amazon Nova 2 Lite | Amazon | mid | $4.65 | $0.30 | $2.50 | 1.0M | fast |
| DeepSeek R1 | DeepSeek | mid | $5.85 | $0.70 | $2.50 | 128K | medium |
| Gemini 3 Flash Preview | Google | mid | $6.00 | $0.50 | $3.00 | 1.0M | fast |
| Qwen 3.5 397B-A17B | Together AI | flagship | $7.20 | $0.60 | $3.60 | 262K | medium |
| Amazon Nova Pro | Amazon | mid | $7.20 | $0.80 | $3.20 | 300K | fast |
| Claude Haiku 3.5 | Anthropic | budget | $8.40 | $0.80 | $4.00 | 200K | fast |
| o3 Mini | OpenAI | mid | $9.90 | $1.10 | $4.40 | 200K | medium |
| o4 Mini | OpenAI | mid | $9.90 | $1.10 | $4.40 | 200K | medium |
| Claude Haiku 4.5 | Anthropic | budget | $10.50 | $1.00 | $5.00 | 200K | fast |
| Mistral Large 2411 | Mistral | flagship | $15.00 | $2.00 | $6.00 | 128K | fast |
| GPT-4.1 | OpenAI | flagship | $18.00 | $2.00 | $8.00 | 1.0M | fast |
| o3 | OpenAI | flagship | $18.00 | $2.00 | $8.00 | 200K | slow |
| GPT-5 | OpenAI | flagship | $18.75 | $1.25 | $10.00 | 128K | medium |
| GPT-5.1 | OpenAI | flagship | $18.75 | $1.25 | $10.00 | 128K | medium |
| Gemini 2.5 Pro | Google | flagship | $18.75 | $1.25 | $10.00 | 1.0M | medium |
| Grok 2 | xAI | mid | $21.00 | $2.00 | $10.00 | 33K | fast |
| GPT-4o | OpenAI | flagship | $22.50 | $2.50 | $10.00 | 128K | fast |
| Command R+ | Cohere | flagship | $22.50 | $2.50 | $10.00 | 128K | medium |
| Command A | Cohere | flagship | $22.50 | $2.50 | $10.00 | 256K | medium |
| Gemini 3.1 Pro Preview | Google | flagship | $24.00 | $2.00 | $12.00 | 1.0M | medium |
| GPT-5.2 | OpenAI | flagship | $26.25 | $1.75 | $14.00 | 128K | medium |
| Amazon Nova Premier | Amazon | flagship | $26.25 | $2.50 | $12.50 | 1.0M | medium |
| Claude Sonnet 4.6 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Claude Sonnet 4.5 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Claude Sonnet 4 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Claude Sonnet 3.7 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Grok 4 | xAI | flagship | $31.50 | $3.00 | $15.00 | 256K | medium |
| Grok 3 | xAI | flagship | $31.50 | $3.00 | $15.00 | 131K | medium |
| Claude Opus 4.6 | Anthropic | flagship | $52.50 | $5.00 | $25.00 | 200K | medium |
| Claude Opus 4.5 | Anthropic | flagship | $52.50 | $5.00 | $25.00 | 200K | medium |
| o1 | OpenAI | flagship | $135.00 | $15.00 | $60.00 | 200K | slow |
| Claude Opus 4.1 | Anthropic | flagship | $157.50 | $15.00 | $75.00 | 200K | slow |
| Claude Opus 4 | Anthropic | flagship | $157.50 | $15.00 | $75.00 | 200K | slow |
| o3 Pro | OpenAI | flagship | $180.00 | $20.00 | $80.00 | 200K | slow |
| GPT-5 Pro | OpenAI | flagship | $225.00 | $15.00 | $120.00 | 128K | slow |
| GPT-5.2 Pro | OpenAI | flagship | $315.00 | $21.00 | $168.00 | 128K | slow |

Providers Covered

Pricing data sourced directly from official documentation and verified monthly.

OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta, xAI, Amazon, Together AI

Understanding AI API Pricing in 2026

As large language models become integral to software products, understanding the cost of AI APIs is critical for engineering teams and product managers. Every major provider — OpenAI, Anthropic, Google, Meta, Mistral, Cohere, DeepSeek, xAI, and Amazon — prices its API per token, but rates vary dramatically depending on model capability, latency, and whether you use standard or batch endpoints.

Token-based pricing means you pay separately for the text you send to the model (input tokens) and the text it generates (output tokens). Output tokens are typically 2 to 5 times more expensive than input tokens because they require more computation. Batch APIs, offered by providers like OpenAI and Anthropic, let you queue requests for asynchronous processing at roughly half the standard rate — ideal for offline workloads such as data labeling, summarization pipelines, and evaluation runs.
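The arithmetic behind per-token billing is simple enough to sketch in a few lines. The rates below are illustrative placeholders, not any particular provider's prices:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in dollars for a single request; rates are $ per 1M tokens."""
    return (input_tokens / 1_000_000) * input_rate + \
           (output_tokens / 1_000_000) * output_rate

# Example: 1,000 input and 500 output tokens at $1.25 in / $10.00 out per 1M.
# The output half dominates the cost despite having half as many tokens.
cost = request_cost(1_000, 500, input_rate=1.25, output_rate=10.00)
```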

Choosing the right model requires balancing cost against quality, latency, and feature support. A chatbot handling millions of short messages needs a different model than a coding assistant working with long context windows. Batch processing can cut costs by 50% for non-real-time workloads, and prompt caching further reduces input token costs for providers that support it. Read our complete guide to AI API pricing in 2026 for a deep dive into how token pricing works across every provider.

For most production applications, the best strategy is to route different tasks to different models: use a flagship model like Claude Opus 4, GPT-4.1, or Gemini 2.5 Pro for complex reasoning, and a budget model like GPT-4.1 Nano, Gemini 2.0 Flash, or Amazon Nova Micro for simpler classification or extraction tasks. This tiered approach can reduce your monthly API bill by 80% or more without sacrificing quality where it matters. Explore our 5 proven strategies to reduce AI API costs to learn more.
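In code, such a router can be as simple as a lookup keyed on task type. A minimal sketch — the task labels and model names here are illustrative, not a prescribed mapping:

```python
# Route tasks to models by tier; unknown task types fall back to the
# cheapest tier. Model identifiers are examples, not exact API names.
ROUTES = {
    "reasoning": "claude-opus-4",    # complex multi-step reasoning
    "chat": "gemini-2.0-flash",      # general conversation
    "extraction": "gpt-4.1-nano",    # classification / structured output
}

def pick_model(task_type: str) -> str:
    return ROUTES.get(task_type, ROUTES["extraction"])
```

A real router would classify the incoming request first (often with a cheap model), then dispatch to the chosen provider's SDK.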

Frequently Asked Questions

How is the monthly cost calculated?

Monthly cost is calculated as: (input tokens per request × requests per day × 30 days ÷ 1,000,000 × input rate) + (output tokens per request × requests per day × 30 days ÷ 1,000,000 × output rate). When batch pricing is enabled, the batch input and output rates replace the standard rates for models that support them. This gives you a realistic estimate of what your monthly bill will look like at steady-state usage.
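The same formula as a small Python helper, using the calculator's defaults of 1K input / 500 output tokens and 100 requests per day:

```python
def monthly_cost(input_tokens: int, output_tokens: int, requests_per_day: int,
                 input_rate: float, output_rate: float, days: int = 30) -> float:
    """Estimated monthly cost in dollars; rates are $ per 1M tokens."""
    monthly_input = input_tokens * requests_per_day * days    # tokens per month
    monthly_output = output_tokens * requests_per_day * days
    return (monthly_input / 1_000_000) * input_rate + \
           (monthly_output / 1_000_000) * output_rate

# GPT-5.2 Pro at $21.00 in / $168.00 out per 1M tokens:
monthly_cost(1_000, 500, 100, 21.00, 168.00)  # 315.0, matching the table above
```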

What is the difference between input and output tokens?

Input tokens are the tokens in the text you send to the model -- this includes your system prompt, user message, and any context you provide. Output tokens are the tokens the model generates in its response. Output tokens are more expensive because they require sequential computation (each token depends on the previous one), while input tokens can be processed in parallel. A typical English word is roughly 1.3 tokens.
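That ratio gives a quick back-of-the-envelope token estimate. It is a rough heuristic only; for exact counts use the provider's tokenizer (e.g. tiktoken for OpenAI models):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~1.3 tokens per English word."""
    return round(len(text.split()) * 1.3)

estimate_tokens("Compare pricing across major AI model providers")  # 7 words -> 9
```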

What is batch pricing and when should I use it?

Batch pricing lets you submit a collection of API requests that are processed asynchronously, typically within a 24-hour window. In return, you get a significant discount -- usually 50% off standard pricing. This is ideal for workloads that do not require real-time responses, such as bulk document processing, dataset annotation, evaluation benchmarks, and nightly content generation pipelines. Not all providers offer batch endpoints; OpenAI and Anthropic are the most prominent supporters.
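The savings are easy to quantify. Assuming a flat 50% batch discount (the typical figure; actual discounts vary by provider), the blended bill depends on how much traffic can tolerate asynchronous processing:

```python
def batch_monthly_cost(standard_monthly: float, batch_share: float,
                       discount: float = 0.5) -> float:
    """Blended monthly cost when `batch_share` (0..1) of traffic
    moves to a discounted batch endpoint."""
    batched = standard_monthly * batch_share * (1 - discount)
    realtime = standard_monthly * (1 - batch_share)
    return batched + realtime

# An $18.00/month workload with 80% of traffic batched: roughly $10.80/month.
batch_monthly_cost(18.00, 0.8)
```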

How often is the pricing data updated?

We verify pricing data directly from each provider's official pricing page on a regular basis. AI API pricing changes frequently -- providers often lower prices as they optimize infrastructure, and new models launch with different price points. Each model entry includes the date it was last verified. If you notice a discrepancy, please let us know so we can update it promptly.

Which model should I choose for my project?

It depends on your priorities. If you need the highest quality reasoning and can afford it, flagship models like Claude Opus 4, o3, or Gemini 2.5 Pro deliver the best results. For latency-sensitive applications (chatbots, autocomplete), look at fast-tier models like GPT-4.1 Mini, Claude Sonnet 4, or Gemini 2.5 Flash. For high-volume, cost-sensitive workloads where quality can be slightly lower, budget models like GPT-4.1 Nano, Amazon Nova Micro, or Mistral Small can cut costs by 90% or more. Many teams use a routing strategy that sends easy tasks to cheap models and hard tasks to premium ones.

Last updated: February 2026