Updated February 2026 — 40+ models

AI API Cost Comparison Calculator

Compare pricing across 40+ AI models from OpenAI, Anthropic, Google, Meta, Mistral, and more. Adjust your usage below to estimate monthly costs instantly.

Potential Savings

Switching from GPT-5.2 Pro to Mistral Nemo saves $314.88/month at the calculator's default usage (1K input / 500 output tokens per request, 100 requests/day).
Defaults: 77 models · 1K input / 500 output tokens per request · 100 requests/day

| Model | Provider | Tier | Est. Monthly | Input $/1M | Output $/1M | Context | Latency |
|---|---|---|---|---|---|---|---|
| Mistral Nemo (Best Value) | Mistral | budget | $0.120 | $0.02 | $0.04 | 128K | fast |
| Llama 3.2 3B | Meta | budget | $0.270 | $0.06 | $0.06 | 131K | fast |
| GPT-OSS 20B | OpenAI | budget | $0.300 | $0.03 | $0.14 | 128K | fast |
| Amazon Nova Micro | Amazon | budget | $0.315 | $0.04 | $0.14 | 128K | fast |
| Command R7B | Cohere | budget | $0.337 | $0.04 | $0.15 | 128K | fast |
| GPT-OSS 120B | OpenAI | mid | $0.402 | $0.04 | $0.19 | 128K | medium |
| Mistral Small 3.2 | Mistral | budget | $0.450 | $0.06 | $0.18 | 32K | fast |
| Ministral 3 3B | Mistral | budget | $0.450 | $0.10 | $0.10 | 128K | fast |
| Amazon Nova Lite | Amazon | budget | $0.540 | $0.06 | $0.24 | 300K | fast |
| Gemini 2.0 Flash-Lite | Google | budget | $0.675 | $0.07 | $0.30 | 1.0M | fast |
| Ministral 3 8B | Mistral | budget | $0.675 | $0.15 | $0.15 | 128K | fast |
| GPT-5 Nano | OpenAI | budget | $0.750 | $0.05 | $0.40 | 128K | fast |
| Llama 3.2 90B Vision | Meta | mid | $0.810 | $0.18 | $0.18 | 131K | fast |
| Llama 3.2 11B Vision | Meta | budget | $0.810 | $0.18 | $0.18 | 131K | fast |
| GPT-4.1 Nano | OpenAI | budget | $0.900 | $0.10 | $0.40 | 1.0M | fast |
| Gemini 2.5 Flash-Lite | Google | budget | $0.900 | $0.10 | $0.40 | 1.0M | fast |
| Gemini 2.0 Flash | Google | budget | $0.900 | $0.10 | $0.40 | 1.0M | fast |
| Ministral 3 14B | Mistral | budget | $0.900 | $0.20 | $0.20 | 128K | fast |
| GPT-4o Mini | OpenAI | budget | $1.35 | $0.15 | $0.60 | 128K | fast |
| Command R | Cohere | mid | $1.35 | $0.15 | $0.60 | 128K | fast |
| Grok 4 Fast | xAI | budget | $1.35 | $0.20 | $0.50 | 2.0M | fast |
| Grok 4.1 Fast | xAI | budget | $1.35 | $0.20 | $0.50 | 2.0M | fast |
| Llama 4 Scout | Meta | mid | $1.43 | $0.18 | $0.59 | 328K | fast |
| DeepSeek V3.2 | DeepSeek | mid | $1.47 | $0.28 | $0.42 | 128K | fast |
| Qwen 3 235B-A22B | Together AI | mid | $1.50 | $0.20 | $0.60 | 262K | fast |
| DeepSeek V3.1 | DeepSeek | mid | $1.57 | $0.15 | $0.75 | 128K | fast |
| Grok 3 Mini | xAI | mid | $1.65 | $0.30 | $0.50 | 131K | fast |
| Llama 4 Maverick | Meta | flagship | $2.08 | $0.27 | $0.85 | 1.0M | fast |
| Codestral | Mistral | mid | $2.25 | $0.30 | $0.90 | 256K | fast |
| DeepSeek V3 | DeepSeek | mid | $2.29 | $0.32 | $0.89 | 128K | fast |
| Claude 3 Haiku | Anthropic | budget | $2.63 | $0.25 | $1.25 | 200K | fast |
| Grok Code Fast 1 | xAI | mid | $2.85 | $0.20 | $1.50 | 256K | fast |
| Qwen 3 Coder Next | Together AI | mid | $3.30 | $0.50 | $1.20 | 262K | fast |
| GPT-4.1 Mini | OpenAI | mid | $3.60 | $0.40 | $1.60 | 1.0M | fast |
| GPT-5 Mini | OpenAI | mid | $3.75 | $0.25 | $2.00 | 128K | fast |
| GPT-5.1 Codex Mini | OpenAI | mid | $3.75 | $0.25 | $2.00 | 128K | fast |
| Mistral Large 3 (2512) | Mistral | mid | $3.75 | $0.50 | $1.50 | 128K | fast |
| Llama 3.3 70B | Meta | mid | $3.96 | $0.88 | $0.88 | 128K | fast |
| Mistral Medium 3.1 | Mistral | flagship | $4.20 | $0.40 | $2.00 | 131K | fast |
| Devstral 2 (2512) | Mistral | mid | $4.20 | $0.40 | $2.00 | 131K | fast |
| Gemini 2.5 Flash | Google | mid | $4.65 | $0.30 | $2.50 | 1.0M | fast |
| Amazon Nova 2 Lite | Amazon | mid | $4.65 | $0.30 | $2.50 | 1.0M | fast |
| DeepSeek R1 | DeepSeek | mid | $5.85 | $0.70 | $2.50 | 128K | medium |
| Gemini 3 Flash Preview | Google | mid | $6.00 | $0.50 | $3.00 | 1.0M | fast |
| Qwen 3.5 397B-A17B | Together AI | flagship | $7.20 | $0.60 | $3.60 | 262K | medium |
| Amazon Nova Pro | Amazon | mid | $7.20 | $0.80 | $3.20 | 300K | fast |
| Claude Haiku 3.5 | Anthropic | budget | $8.40 | $0.80 | $4.00 | 200K | fast |
| o3 Mini | OpenAI | mid | $9.90 | $1.10 | $4.40 | 200K | medium |
| o4 Mini | OpenAI | mid | $9.90 | $1.10 | $4.40 | 200K | medium |
| Claude Haiku 4.5 | Anthropic | budget | $10.50 | $1.00 | $5.00 | 200K | fast |
| Mistral Large 2411 | Mistral | flagship | $15.00 | $2.00 | $6.00 | 128K | fast |
| GPT-4.1 | OpenAI | flagship | $18.00 | $2.00 | $8.00 | 1.0M | fast |
| o3 | OpenAI | flagship | $18.00 | $2.00 | $8.00 | 200K | slow |
| GPT-5 | OpenAI | flagship | $18.75 | $1.25 | $10.00 | 128K | medium |
| GPT-5.1 | OpenAI | flagship | $18.75 | $1.25 | $10.00 | 128K | medium |
| Gemini 2.5 Pro | Google | flagship | $18.75 | $1.25 | $10.00 | 1.0M | medium |
| Grok 2 | xAI | mid | $21.00 | $2.00 | $10.00 | 33K | fast |
| GPT-4o | OpenAI | flagship | $22.50 | $2.50 | $10.00 | 128K | fast |
| Command R+ | Cohere | flagship | $22.50 | $2.50 | $10.00 | 128K | medium |
| Command A | Cohere | flagship | $22.50 | $2.50 | $10.00 | 256K | medium |
| Gemini 3.1 Pro Preview | Google | flagship | $24.00 | $2.00 | $12.00 | 1.0M | medium |
| GPT-5.2 | OpenAI | flagship | $26.25 | $1.75 | $14.00 | 128K | medium |
| Amazon Nova Premier | Amazon | flagship | $26.25 | $2.50 | $12.50 | 1.0M | medium |
| Claude Sonnet 4.6 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Claude Sonnet 4.5 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Claude Sonnet 4 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Claude Sonnet 3.7 | Anthropic | mid | $31.50 | $3.00 | $15.00 | 200K | fast |
| Grok 4 | xAI | flagship | $31.50 | $3.00 | $15.00 | 256K | medium |
| Grok 3 | xAI | flagship | $31.50 | $3.00 | $15.00 | 131K | medium |
| Claude Opus 4.6 | Anthropic | flagship | $52.50 | $5.00 | $25.00 | 200K | medium |
| Claude Opus 4.5 | Anthropic | flagship | $52.50 | $5.00 | $25.00 | 200K | medium |
| o1 | OpenAI | flagship | $135.00 | $15.00 | $60.00 | 200K | slow |
| Claude Opus 4.1 | Anthropic | flagship | $157.50 | $15.00 | $75.00 | 200K | slow |
| Claude Opus 4 | Anthropic | flagship | $157.50 | $15.00 | $75.00 | 200K | slow |
| o3 Pro | OpenAI | flagship | $180.00 | $20.00 | $80.00 | 200K | slow |
| GPT-5 Pro | OpenAI | flagship | $225.00 | $15.00 | $120.00 | 128K | slow |
| GPT-5.2 Pro | OpenAI | flagship | $315.00 | $21.00 | $168.00 | 128K | slow |

Providers Covered

Pricing data sourced directly from official documentation and verified monthly.

OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta, xAI, Amazon, Together AI

Understanding AI API Pricing in 2026

As large language models become integral to software products, understanding the cost of AI APIs is critical for engineering teams and product managers. Every major provider — OpenAI, Anthropic, Google, Meta, Mistral, Cohere, DeepSeek, xAI, and Amazon — prices its API per token, but rates vary dramatically depending on model capability, latency, and whether you use standard or batch endpoints.

Token-based pricing means you pay separately for the text you send to the model (input tokens) and the text it generates (output tokens). Output tokens are typically 2 to 5 times more expensive than input tokens because they require more computation. Batch APIs, offered by providers like OpenAI and Anthropic, let you queue requests for asynchronous processing at roughly half the standard rate — ideal for offline workloads such as data labeling, summarization pipelines, and evaluation runs.
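The arithmetic behind per-token billing is simple enough to sketch in a few lines. The rates below are illustrative placeholders, not any particular provider's prices:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in dollars for a single request; rates are $ per 1M tokens."""
    return (input_tokens / 1_000_000) * input_rate + \
           (output_tokens / 1_000_000) * output_rate

# Example: 1,000 input and 500 output tokens at $1.25 in / $10.00 out per 1M.
# The output half dominates the cost despite having half as many tokens.
cost = request_cost(1_000, 500, input_rate=1.25, output_rate=10.00)
```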

Choosing the right model requires balancing cost against quality, latency, and feature support. A chatbot handling millions of short messages needs a different model than a coding assistant working with long context windows. Batch processing can cut costs by 50% for non-real-time workloads, and prompt caching further reduces input token costs for providers that support it. Read our complete guide to AI API pricing in 2026 for a deep dive into how token pricing works across every provider.

For most production applications, the best strategy is to route different tasks to different models: use a flagship model like Claude Opus 4, GPT-4.1, or Gemini 2.5 Pro for complex reasoning, and a budget model like GPT-4.1 Nano, Gemini 2.0 Flash, or Amazon Nova Micro for simpler classification or extraction tasks. This tiered approach can reduce your monthly API bill by 80% or more without sacrificing quality where it matters. Explore our 5 proven strategies to reduce AI API costs to learn more.
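In code, such a router can be as simple as a lookup keyed on task type. A minimal sketch — the task labels and model names here are illustrative, not a prescribed mapping:

```python
# Route tasks to models by tier; unknown task types fall back to the
# cheapest tier. Model identifiers are examples, not exact API names.
ROUTES = {
    "reasoning": "claude-opus-4",    # complex multi-step reasoning
    "chat": "gemini-2.0-flash",      # general conversation
    "extraction": "gpt-4.1-nano",    # classification / structured output
}

def pick_model(task_type: str) -> str:
    return ROUTES.get(task_type, ROUTES["extraction"])
```

A real router would classify the incoming request first (often with a cheap model), then dispatch to the chosen provider's SDK.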

Frequently Asked Questions

How is the monthly cost calculated?

Monthly cost is calculated as: (input tokens per request × requests per day × 30 days ÷ 1,000,000 × input rate) + (output tokens per request × requests per day × 30 days ÷ 1,000,000 × output rate). When batch pricing is enabled, the batch input and output rates replace the standard rates for models that support them. This gives you a realistic estimate of what your monthly bill will look like at steady-state usage.
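The same formula as a small Python helper, using the calculator's defaults of 1K input / 500 output tokens and 100 requests per day:

```python
def monthly_cost(input_tokens: int, output_tokens: int, requests_per_day: int,
                 input_rate: float, output_rate: float, days: int = 30) -> float:
    """Estimated monthly cost in dollars; rates are $ per 1M tokens."""
    monthly_input = input_tokens * requests_per_day * days    # tokens per month
    monthly_output = output_tokens * requests_per_day * days
    return (monthly_input / 1_000_000) * input_rate + \
           (monthly_output / 1_000_000) * output_rate

# GPT-5.2 Pro at $21.00 in / $168.00 out per 1M tokens:
monthly_cost(1_000, 500, 100, 21.00, 168.00)  # 315.0, matching the table above
```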

What is the difference between input and output tokens?

Input tokens are the tokens in the text you send to the model -- this includes your system prompt, user message, and any context you provide. Output tokens are the tokens the model generates in its response. Output tokens are more expensive because they require sequential computation (each token depends on the previous one), while input tokens can be processed in parallel. A typical English word is roughly 1.3 tokens.
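That ratio gives a quick back-of-the-envelope token estimate. It is a rough heuristic only; for exact counts use the provider's tokenizer (e.g. tiktoken for OpenAI models):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~1.3 tokens per English word."""
    return round(len(text.split()) * 1.3)

estimate_tokens("Compare pricing across major AI model providers")  # 7 words -> 9
```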

What is batch pricing and when should I use it?

Batch pricing lets you submit a collection of API requests that are processed asynchronously, typically within a 24-hour window. In return, you get a significant discount -- usually 50% off standard pricing. This is ideal for workloads that do not require real-time responses, such as bulk document processing, dataset annotation, evaluation benchmarks, and nightly content generation pipelines. Not all providers offer batch endpoints; OpenAI and Anthropic are the most prominent supporters.
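The savings are easy to quantify. Assuming a flat 50% batch discount (the typical figure; actual discounts vary by provider), the blended bill depends on how much traffic can tolerate asynchronous processing:

```python
def batch_monthly_cost(standard_monthly: float, batch_share: float,
                       discount: float = 0.5) -> float:
    """Blended monthly cost when `batch_share` (0..1) of traffic
    moves to a discounted batch endpoint."""
    batched = standard_monthly * batch_share * (1 - discount)
    realtime = standard_monthly * (1 - batch_share)
    return batched + realtime

# An $18.00/month workload with 80% of traffic batched: roughly $10.80/month.
batch_monthly_cost(18.00, 0.8)
```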

How often is the pricing data updated?

We verify pricing data directly from each provider's official pricing page on a regular basis. AI API pricing changes frequently -- providers often lower prices as they optimize infrastructure, and new models launch with different price points. Each model entry includes the date it was last verified. If you notice a discrepancy, please let us know so we can update it promptly.

Which model should I choose for my project?

It depends on your priorities. If you need the highest quality reasoning and can afford it, flagship models like Claude Opus 4, o3, or Gemini 2.5 Pro deliver the best results. For latency-sensitive applications (chatbots, autocomplete), look at fast-tier models like GPT-4.1 Mini, Claude Sonnet 4, or Gemini 2.5 Flash. For high-volume, cost-sensitive workloads where quality can be slightly lower, budget models like GPT-4.1 Nano, Amazon Nova Micro, or Mistral Small can cut costs by 90% or more. Many teams use a routing strategy that sends easy tasks to cheap models and hard tasks to premium ones.

Last updated: February 2026