Claude Opus 4.6 $11.000/M · Claude Opus 4.5 $33.000/M · Claude Sonnet 3.7 $6.600/M · Claude Opus 3 $33.000/M · Claude 2.1 $12.800/M · Claude 2 $12.800/M · GPT-5 $3.875/M · GPT-4.5 $97.500/M · GPT-4 Turbo Preview $16.000/M · GPT-4 $39.000/M · GPT-4-32k $78.000/M · o3 $19.000/M · o3-mini $2.090/M · o4-mini $2.090/M · o1 $28.500/M · o1-mini $5.700/M · o1-preview $28.500/M · Gemini 2.5 Pro $3.875/M · Gemini 1.5 Pro $2.375/M · Gemini 1.0 Ultra $12.000/M · Gemini 1.0 Pro $0.800/M · PaLM 2 Bison $0.500/M · PaLM 2 Unicorn $5.000/M · Gemma 3 27B $0.270/M · Grok 3 $6.600/M · Grok 2 $4.400/M · Grok 1.5 $8.000/M · DeepSeek-V3 $0.519/M · DeepSeek-V3-0324 $0.519/M · DeepSeek-R1 $1.042/M
TOTAL MODELS: 291 (39 providers)
CHEAPEST BLEND: $0.0036/M (Whisper Large v3)
PRICIEST BLEND: $97.50/M (GPT-4.5)
AVG BLENDED: $2.823/M (per 1M tokens)
LOWEST INPUT: $0.0036/M (best price for input)
LARGEST CONTEXT: 2M (Gemini 1.5 Pro)

| PROVIDER | MODEL | TAG | CATEGORY | INPUT | OUTPUT | CONTEXT | BLENDED / 1M | NOTES |
|---|---|---|---|---|---|---|---|---|
| Replicate | Whisper Large v3 | Audio | MULTIMODAL | $0.00360 | | | $0.00360 | Speech transcription |
| OpenAI | text-embedding-3-small | Embedding | EMBEDDING | $0.0200 | | 8K | $0.0200 | 1536 dimensions |
| Google | Gemma 2 2B | Nano | EFFICIENT | $0.0200 | $0.0200 | 8K | $0.0200 | Smallest Gemma |
| Amazon | Titan Embeddings V2 | Embedding | EMBEDDING | $0.0200 | | 8K | $0.0200 | 1536 dimensions |
| Google | text-embedding-004 | Embedding | EMBEDDING | $0.0250 | | 2K | $0.0250 | 768 dimensions |
| Google | text-embedding-gecko-003 | Legacy Embed | EMBEDDING | $0.0250 | | 3K | $0.0250 | Legacy embedding |
| Qwen | Qwen2.5-1.5B-Instruct | Nano | EFFICIENT | $0.0300 | $0.0300 | 32K | $0.0300 | Tiny Qwen |
| Meta | Llama 3.2 1B | Nano | EFFICIENT | $0.0400 | $0.0400 | 128K | $0.0400 | Smallest Llama |
| Replicate | SDXL (Rep) | Image Gen | MULTIMODAL | $0.0400 | | | $0.0400 | Image generation |
| Nvidia | Phi-3-Mini-4K (NIM) | Nano | EFFICIENT | $0.0400 | $0.0400 | 4K | $0.0400 | Tiny NIM |
| Qwen | Qwen2.5-3B-Instruct | Micro | EFFICIENT | $0.0500 | $0.0500 | 32K | $0.0500 | 3B micro |
| MiniMax | ABAB 5.5c | Compact | EFFICIENT | $0.0500 | $0.0500 | 16K | $0.0500 | Chat-optimized |
| IBM | Granite 3.1 2B Instruct | Nano | EFFICIENT | $0.0300 | $0.1000 | 128K | $0.0510 | Tiny Granite |
| ByteDance | Doubao-Lite-128k | Fast | EFFICIENT | $0.0400 | $0.0900 | 128K | $0.0550 | Fast 128K |
| ByteDance | Doubao-Lite-32k | Budget | EFFICIENT | $0.0400 | $0.0900 | 32K | $0.0550 | Budget tier |
| Groq | Llama 3.1 8B (Groq) | Fast Nano | EFFICIENT | $0.0500 | $0.0800 | 128K | $0.0590 | Fastest 8B |
| Meta | Llama 3.2 3B | Micro | EFFICIENT | $0.0600 | $0.0600 | 128K | $0.0600 | Ultra-compact |
| Mistral | Mistral 7B v0.1 | Legacy | EFFICIENT | $0.0600 | $0.0600 | 8K | $0.0600 | Original Mistral 7B |
| Groq | Llama 3.2 3B (Groq) | Fast Micro | EFFICIENT | $0.0600 | $0.0600 | 128K | $0.0600 | Micro on LPU |
| DeepInfra | Gemma 2 9B (DI) | Compact | EFFICIENT | $0.0600 | $0.0600 | 8K | $0.0600 | Cheapest Gemma |
| Voyage AI | voyage-3-lite | Efficient | EMBEDDING | $0.0650 | | 32K | $0.0650 | Speed-optimized |
| Amazon | Nova Micro | Micro | EFFICIENT | $0.0350 | $0.1400 | 128K | $0.0665 | Text-only micro |
| DeepInfra | Mistral 7B (DI) | Nano | EFFICIENT | $0.0700 | $0.0700 | 32K | $0.0700 | Cheapest Mistral |
| Lepton AI | Mistral 7B (Lepton) | Nano | EFFICIENT | $0.0700 | $0.0700 | 32K | $0.0700 | Cheapest at Lepton |
| Microsoft | Phi-2 | Legacy | EFFICIENT | $0.0700 | $0.0700 | 2K | $0.0700 | Original Phi 2.7B |
| Google | Gemini 1.5 Flash-8B | Nano | EFFICIENT | $0.0375 | $0.1500 | 1M | $0.0712 | Smallest Gemini |
| Cohere | Command R7B | Compact | EFFICIENT | $0.0375 | $0.1500 | 128K | $0.0712 | 7B model |
| Microsoft | Phi-4 Mini | Tiny | REASONING | $0.0400 | $0.1600 | 128K | $0.0760 | 3.8B mini reasoning |
| Mistral | Mistral 7B v0.3 | Open | EFFICIENT | $0.0800 | $0.0800 | 32K | $0.0800 | Popular open model |
| Amazon | Nova Canvas | Image Gen | MULTIMODAL | $0.0800 | | | $0.0800 | AI image generation |
| Google | Gemma 3 9B | Open | EFFICIENT | $0.0900 | $0.0900 | 128K | $0.0900 | Gemma 3 compact |
| Google | Gemma 2 9B | Open | EFFICIENT | $0.0900 | $0.0900 | 8K | $0.0900 | Compact Gemma 2 |
| OpenAI | text-embedding-ada-002 | Legacy Embed | EMBEDDING | $0.1000 | | 8K | $0.1000 | Classic embedding |
| Meta | Llama 3.1 8B | Compact | EFFICIENT | $0.1000 | $0.1000 | 128K | $0.1000 | Compact open |
| Mistral | mistral-embed | Embedding | EMBEDDING | $0.1000 | | 8K | $0.1000 | 1024 dimensions |
| Cohere | Embed v3 English | Embedding | EMBEDDING | $0.1000 | | 512 | $0.1000 | 1024 dimensions |
| Cohere | Embed v3 Multilingual | Embedding | EMBEDDING | $0.1000 | | 512 | $0.1000 | 100+ languages |
| Together AI | Llama 3.1 8B (Together) | Compact | EFFICIENT | $0.1000 | $0.1000 | 128K | $0.1000 | Budget inference |
| Fireworks AI | Llama 3.1 8B (fw) | Nano | EFFICIENT | $0.1000 | $0.1000 | 131K | $0.1000 | Cheapest 8B |
| Cerebras | Llama-3.1-8B (Cerebras) | Ultra-Fast Nano | EFFICIENT | $0.1000 | $0.1000 | 128K | $0.1000 | Fastest 8B ever |
| Upstage | Solar Embedding | Embedding | EMBEDDING | $0.1000 | | 4K | $0.1000 | Solar embeddings |
| Qwen | Qwen2.5-7B-Instruct | Compact | EFFICIENT | $0.1000 | $0.1000 | 128K | $0.1000 | 7B compact |
| Qwen | Qwen2.5-Coder-7B | Code Nano | EFFICIENT | $0.1000 | $0.1000 | 128K | $0.1000 | Small code model |
| Qwen | Qwen1.5-7B-Chat | Legacy Compact | EFFICIENT | $0.1000 | $0.1000 | 32K | $0.1000 | Qwen 1.5 7B |
| Stability AI | StableLM 2 1.6B | Nano | EFFICIENT | $0.1000 | $0.1000 | 4K | $0.1000 | Tiny language model |
| Stability AI | StableCode 3B | Code Nano | EFFICIENT | $0.1000 | $0.1000 | 16K | $0.1000 | Tiny code model |
| DeepSeek | DeepSeek-R1-Distill-8B | Distilled | REASONING | $0.0700 | $0.2000 | 128K | $0.1090 | 8B distilled |
| Replicate | Llama 3 8B (Rep) | Compact | EFFICIENT | $0.0500 | $0.2500 | 8K | $0.1100 | Budget option |
| IBM | Granite 3.1 8B Instruct | Compact | EFFICIENT | $0.0500 | $0.2500 | 128K | $0.1100 | Enterprise LLM |
| IBM | Granite 3.0 8B Dense | Previous Gen | EFFICIENT | $0.0500 | $0.2500 | 4K | $0.1100 | Dense architecture |
| Amazon | Nova Lite | Efficient | EFFICIENT | $0.0600 | $0.2400 | 300K | $0.1140 | Ultra low-cost |
| Voyage AI | voyage-3 | Balanced | EMBEDDING | $0.1200 | | 32K | $0.1200 | General purpose |
| Voyage AI | voyage-finance-2 | Finance | EMBEDDING | $0.1200 | | 32K | $0.1200 | Finance domain |
| Voyage AI | voyage-law-2 | Legal | EMBEDDING | $0.1200 | | 32K | $0.1200 | Legal domain |
| Voyage AI | voyage-multilingual-2 | Multilingual | EMBEDDING | $0.1200 | | 32K | $0.1200 | 100+ languages |
| Moonshot AI | Moonshot v1 8k | Compact | EFFICIENT | $0.1200 | $0.1200 | 8K | $0.1200 | Fast response |
| OpenAI | text-embedding-3-large | Embedding | EMBEDDING | $0.1300 | | 8K | $0.1300 | 3072 dimensions |
| Microsoft | Phi-4 | Frontier SLM | REASONING | $0.0700 | $0.2800 | 16K | $0.1330 | 14B reasoning SLM |
| 01.AI | Yi-Lightning | Fast | EFFICIENT | $0.1400 | $0.1400 | 16K | $0.1400 | Ultra-fast Yi |
| Zhipu AI | GLM-4-Air | Efficient | EFFICIENT | $0.1400 | $0.1400 | 128K | $0.1400 | Affordable GLM |
| Google | Gemini 2.0 Flash Lite | Micro | EFFICIENT | $0.0750 | $0.3000 | 1M | $0.1425 | Lowest cost Gemini |
| Google | Gemini 1.5 Flash | Previous Gen | EFFICIENT | $0.0750 | $0.3000 | 1M | $0.1425 | Previous Flash |
| Mistral | Pixtral 12B | Compact Vision | MULTIMODAL | $0.1500 | $0.1500 | 128K | $0.1500 | Vision-language 12B |
| Mistral | Mistral Nemo | Open | EFFICIENT | $0.1500 | $0.1500 | 128K | $0.1500 | 12B open model |
| Anyscale | Mistral 7B (Anyscale) | Nano | EFFICIENT | $0.1500 | $0.1500 | 32K | $0.1500 | Anyscale serverless |
| Nvidia | Mistral-NeMo-12B (NIM) | Efficient | EFFICIENT | $0.1500 | $0.1500 | 128K | $0.1500 | NIM deployment |
| 01.AI | Yi-6B | Nano | EFFICIENT | $0.1500 | $0.1500 | 4K | $0.1500 | Original Yi 6B |
| Meta | Llama 3.2 11B Vision | Compact Vision | MULTIMODAL | $0.1600 | $0.1600 | 128K | $0.1600 | Compact vision |
| Mistral | Mistral Small 3.2 | Efficient | EFFICIENT | $0.1000 | $0.3000 | 128K | $0.1600 | Cost leader |
| ByteDance | Doubao-Pro-32k | Balanced | EFFICIENT | $0.1100 | $0.2800 | 32K | $0.1610 | Doubao standard |
| Amazon | Titan Text Lite | Lightweight | EFFICIENT | $0.1500 | $0.2000 | 4K | $0.1650 | Low-latency Titan |
| DeepSeek | DeepSeek-R1-Distill-14B | Distilled | REASONING | $0.1000 | $0.3500 | 128K | $0.1750 | 14B distilled |
| Meta | Llama 4 Scout | Efficient | EFFICIENT | $0.1000 | $0.3500 | 512K | $0.1750 | Long-context MoE |
| Meta | Llama 2 7B Chat | Legacy | EFFICIENT | $0.1800 | $0.1800 | 4K | $0.1800 | Compact Llama 2 |
| Meta | CodeLlama 7B Instruct | Code | EFFICIENT | $0.1800 | $0.1800 | 16K | $0.1800 | Compact code |
| Groq | Llama 3.2 11B Vision (Groq) | Fast Vision | MULTIMODAL | $0.1800 | $0.1800 | 128K | $0.1800 | Vision on LPU |
| Voyage AI | voyage-3-large | Flagship | EMBEDDING | $0.1800 | | 32K | $0.1800 | Best retrieval |
| Voyage AI | voyage-code-3 | Code | EMBEDDING | $0.1800 | | 32K | $0.1800 | Code retrieval |
| TII / Falcon | Falcon 7B Instruct | Nano | EFFICIENT | $0.1800 | $0.1800 | 2K | $0.1800 | Original Falcon |
| DeepSeek | DeepSeek-Coder-V2 | Code | EFFICIENT | $0.1400 | $0.2800 | 128K | $0.1820 | Code specialist |
| OpenAI | GPT-4.1 nano | Efficient | EFFICIENT | $0.1000 | $0.4000 | 1M | $0.1900 | Cheapest OpenAI model |
| Google | Gemini 2.0 Flash | Efficient | EFFICIENT | $0.1000 | $0.4000 | 1M | $0.1900 | Next-gen Flash |
| Meta | Llama 3 8B | Previous Gen | EFFICIENT | $0.2000 | $0.2000 | 8K | $0.2000 | Llama 3 compact |
| Groq | Gemma 2 9B (Groq) | Fast Compact | EFFICIENT | $0.2000 | $0.2000 | 8K | $0.2000 | Google Gemma fast |
| Fireworks AI | Mixtral 8x7B (fw) | Serverless | EFFICIENT | $0.2000 | $0.2000 | 32K | $0.2000 | MoE serverless |
| TII / Falcon | Falcon 11B | Compact | EFFICIENT | $0.2000 | $0.2000 | 8K | $0.2000 | New Falcon 11B |
| Meta | Llama 2 13B Chat | Legacy | EFFICIENT | $0.2200 | $0.2200 | 4K | $0.2200 | Mid Llama 2 |
| Meta | CodeLlama 13B Instruct | Code | EFFICIENT | $0.2200 | $0.2200 | 16K | $0.2200 | Mid code model |
| Mistral | Mixtral 8x7B v0.1 | Open MoE | EFFICIENT | $0.2400 | $0.2400 | 32K | $0.2400 | Classic MoE |
| Groq | Mixtral 8x7B (Groq) | Fast MoE | EFFICIENT | $0.2400 | $0.2400 | 32K | $0.2400 | MoE on LPU |
| DeepInfra | Mixtral 8x7B (DI) | MoE | EFFICIENT | $0.2400 | $0.2400 | 32K | $0.2400 | MoE managed |
| Microsoft | Phi-3.5 Mini | Compact | EFFICIENT | $0.1300 | $0.5200 | 128K | $0.2470 | 3.8B long-context |
| Microsoft | Phi-3 Mini | Tiny | EFFICIENT | $0.1300 | $0.5200 | 128K | $0.2470 | 3.8B mini |
| Mistral | Codestral Mamba | Code Mamba | EFFICIENT | $0.2500 | $0.2500 | 256K | $0.2500 | State-space code |
| AI21 Labs | Jamba 1.5 Mini | Efficient | EFFICIENT | $0.2000 | $0.4000 | 256K | $0.2600 | Compact SSM hybrid |
| Google | Gemma 3 27B | Open | FRONTIER | $0.2700 | $0.2700 | 128K | $0.2700 | Largest Gemma 3 |
| Google | Gemma 2 27B | Open | EFFICIENT | $0.2700 | $0.2700 | 8K | $0.2700 | Open weights |
| Meta | Llama 3.3 70B | Open | EFFICIENT | $0.2300 | $0.4000 | 128K | $0.2810 | Strong open weights |
| DeepInfra | Llama 3.3 70B (DI) | Serverless | EFFICIENT | $0.2300 | $0.4000 | 128K | $0.2810 | Cost-effective |
| OpenAI | GPT-4o mini | Efficient | EFFICIENT | $0.1500 | $0.6000 | 128K | $0.2850 | Cost-optimized |
| Google | Gemini 2.5 Flash | Balanced | MULTIMODAL | $0.1500 | $0.6000 | 1M | $0.2850 | Speed-optimized |
| Cohere | Command R | Balanced | EFFICIENT | $0.1500 | $0.6000 | 128K | $0.2850 | RAG-optimized |
| Upstage | Solar Mini | Efficient | EFFICIENT | $0.1500 | $0.6000 | 32K | $0.2850 | Efficient Solar |
| Microsoft | Phi-3 Small | Compact | EFFICIENT | $0.1500 | $0.6000 | 128K | $0.2850 | 7B small model |
| DeepSeek | DeepSeek-R1-Distill-32B | Distilled | REASONING | $0.2000 | $0.5000 | 128K | $0.2900 | 32B distilled |
| Qwen | Qwen2-VL-7B | Vision | MULTIMODAL | $0.3000 | $0.3000 | 32K | $0.3000 | 7B vision-language |
| 01.AI | Yi-9B | Compact | EFFICIENT | $0.3000 | $0.3000 | 4K | $0.3000 | Yi 9B model |
| 01.AI | Yi-VL-6B | Vision | MULTIMODAL | $0.3000 | $0.3000 | 4K | $0.3000 | Yi Vision 6B |
| Stability AI | StableLM 2 12B | Open | EFFICIENT | $0.3000 | $0.3000 | 4K | $0.3000 | Stability text model |
| Microsoft | Phi-3.5 MoE | MoE | EFFICIENT | $0.1600 | $0.6400 | 128K | $0.3040 | 16x3.8B mixture |
| Meta | Llama 4 Maverick | Frontier | FRONTIER | $0.2000 | $0.6000 | 128K | $0.3200 | MoE architecture |
| Mistral | Codestral | Code | EFFICIENT | $0.2000 | $0.6000 | 256K | $0.3200 | Code specialist |
| Microsoft | Phi-3 Medium | Balanced | EFFICIENT | $0.1700 | $0.6800 | 128K | $0.3230 | 14B medium |
| MiniMax | ABAB 5.5s | Efficient | EFFICIENT | $0.1500 | $0.7500 | 16K | $0.3300 | Cost-effective |
| Meta | CodeLlama 34B Instruct | Code | EFFICIENT | $0.3500 | $0.3500 | 16K | $0.3500 | Strong code model |
| Qwen | Qwen2.5-14B-Instruct | Small | EFFICIENT | $0.3500 | $0.3500 | 128K | $0.3500 | 14B model |
| Qwen | Qwen2.5-Coder-14B | Code | EFFICIENT | $0.3500 | $0.3500 | 128K | $0.3500 | 14B code model |
| Qwen | Qwen1.5-14B-Chat | Legacy | EFFICIENT | $0.3500 | $0.3500 | 32K | $0.3500 | Qwen 1.5 14B |
| xAI | Grok 3 mini | Efficient | EFFICIENT | $0.3000 | $0.5000 | 131K | $0.3600 | Cost-efficient |
| Meta | Llama 3.1 70B | Balanced | EFFICIENT | $0.3500 | $0.4000 | 128K | $0.3650 | Balanced open |
| DeepInfra | Llama 3.1 70B (DI) | Balanced | EFFICIENT | $0.3500 | $0.4000 | 128K | $0.3650 | 70B managed |
| DeepInfra | Qwen2.5-72B (DI) | Balanced | FRONTIER | $0.3500 | $0.4000 | 128K | $0.3650 | Qwen on DeepInfra |
| Nvidia | Llama-3.1-Nemotron-70B | RLHF-Aligned | FRONTIER | $0.3500 | $0.4000 | 128K | $0.3650 | RLHF-tuned 70B |
| 01.AI | Yi-Large-Turbo | Balanced | EFFICIENT | $0.3800 | $0.3800 | 16K | $0.3800 | Turbo variant |
| Cohere | Command Light | Legacy | EFFICIENT | $0.3000 | $0.6000 | 4K | $0.3900 | Older Command |
| Hyperbolic | Llama 3.1 70B (Hyp) | Fast | EFFICIENT | $0.4000 | $0.4000 | 128K | $0.4000 | 70B fast inference |
| Hyperbolic | Hermes-3-70B (Hyp) | Fine-tuned | FRONTIER | $0.4000 | $0.4000 | 128K | $0.4000 | Nous Hermes 3 |
| Hyperbolic | Qwen2.5-72B (Hyp) | Fast | FRONTIER | $0.4000 | $0.4000 | 128K | $0.4000 | Qwen on Hyperbolic |
| IBM | Granite 20B Multilingual | Multilingual | EFFICIENT | $0.4000 | $0.4000 | 8K | $0.4000 | 10+ languages |
| Moonshot AI | Moonshot v1 32k | Balanced | EFFICIENT | $0.4000 | $0.4000 | 32K | $0.4000 | Standard context |
| MiniMax | ABAB 6.5t | Turbo | EFFICIENT | $0.4500 | $0.4500 | 8K | $0.4500 | Turbo variant |
| Tencent | Hunyuan-Standard | Balanced | EFFICIENT | $0.3500 | $0.7000 | 256K | $0.4550 | Standard model |
| Google | PaLM 2 Bison | Legacy | FRONTIER | $0.5000 | $0.5000 | 8K | $0.5000 | PaLM 2 text |
| Anyscale | Mixtral 8x7B (Anyscale) | MoE | EFFICIENT | $0.5000 | $0.5000 | 32K | $0.5000 | MoE on Anyscale |
| Lepton AI | Mixtral 8x7B (Lepton) | MoE | EFFICIENT | $0.5000 | $0.5000 | 32K | $0.5000 | MoE on Lepton |
| DeepSeek | DeepSeek-R1-Distill-70B | Distilled | REASONING | $0.3500 | $0.8800 | 128K | $0.5090 | Distilled reasoning |
| Replicate | Mixtral 8x7B (Rep) | Serverless | EFFICIENT | $0.3000 | $1.00 | 32K | $0.5100 | MoE on Replicate |
| DeepSeek | DeepSeek-V3 | Frontier | FRONTIER | $0.2700 | $1.10 | 64K | $0.5190 | Top open source |
| DeepSeek | DeepSeek-V3-0324 | Frontier | FRONTIER | $0.2700 | $1.10 | 128K | $0.5190 | Extended context |
| Anthropic | Claude Haiku 3 | Classic | EFFICIENT | $0.2500 | $1.25 | 200K | $0.5500 | Fastest Claude 3 |
| AI21 Labs | Jamba Instruct | Previous | FRONTIER | $0.5000 | $0.7000 | 256K | $0.5600 | First Jamba model |
| IBM | Granite Code 34B | Code | EFFICIENT | $0.3500 | $1.05 | 8K | $0.5600 | Enterprise code |
| DeepInfra | Phind-CodeLlama-34B (DI) | Code | EFFICIENT | $0.6000 | $0.6000 | 16K | $0.6000 | Code model |
| DeepInfra | Yi-34B-Chat (DI) | Open | FRONTIER | $0.6000 | $0.6000 | 4K | $0.6000 | Yi on DeepInfra |
| DeepInfra | WizardLM-2 8x22B (DI) | Fine-tuned | FRONTIER | $0.6300 | $0.6300 | 64K | $0.6300 | Instruction tuned |
| Meta | Llama 3 70B | Previous Gen | EFFICIENT | $0.5900 | $0.7900 | 8K | $0.6500 | Llama 3 base |
| Groq | Llama 3.3 70B (Groq) | Ultra-Fast | EFFICIENT | $0.5900 | $0.7900 | 128K | $0.6500 | Fastest 70B |
| Groq | Llama 3.1 70B (Groq) | Fast | EFFICIENT | $0.5900 | $0.7900 | 128K | $0.6500 | 70B on LPU |
| Qwen | Qwen2-57B-A14B | MoE | FRONTIER | $0.6500 | $0.6500 | 64K | $0.6500 | 57B MoE model |
| Cerebras | DeepSeek-R1 (Cerebras) | Fast Reasoning | REASONING | $0.5500 | $0.9900 | 64K | $0.6820 | R1 on Cerebras |
| Qwen | Qwen2.5-32B-Instruct | Balanced | EFFICIENT | $0.7000 | $0.7000 | 128K | $0.7000 | 32B compact |
| Qwen | Qwen2.5-Coder-32B | Code | EFFICIENT | $0.7000 | $0.7000 | 128K | $0.7000 | Best open code model |
| Qwen | Qwen1.5-32B-Chat | Legacy | EFFICIENT | $0.7000 | $0.7000 | 32K | $0.7000 | Qwen 1.5 32B |
| Zhipu AI | GLM-4-Long | Long Context | FRONTIER | $0.7000 | $0.7000 | 1M | $0.7000 | 1M context GLM |
| Zhipu AI | GLM-3-Turbo | Legacy | EFFICIENT | $0.7000 | $0.7000 | 128K | $0.7000 | Previous gen |
| Cerebras | Llama-3.3-70B (Cerebras) | Ultra-Fast | EFFICIENT | $0.5900 | $0.9900 | 128K | $0.7100 | Fastest 70B inference |
| Baidu | ERNIE 3.5 8K | Balanced | EFFICIENT | $0.5500 | $1.10 | 8K | $0.7150 | Cost-efficient |
| OpenAI | GPT-4.1 mini | Efficient | EFFICIENT | $0.4000 | $1.60 | 1M | $0.7600 | Fast & affordable, 1M ctx |
| Groq | Qwen2.5-72B (Groq) | Fast | FRONTIER | $0.7900 | $0.7900 | 128K | $0.7900 | Qwen on LPU |
| Groq | Qwen2.5-Coder-32B (Groq) | Fast Code | EFFICIENT | $0.7900 | $0.7900 | 128K | $0.7900 | Code on LPU |
| OpenAI | GPT-3.5 Turbo | Legacy | EFFICIENT | $0.5000 | $1.50 | 16K | $0.8000 | Affordable classic |
| Google | Gemini 1.0 Pro | Legacy | FRONTIER | $0.5000 | $1.50 | 32K | $0.8000 | Original Pro |
| Meta | Llama 3.1 405B | Large | FRONTIER | $0.8000 | $0.8000 | 128K | $0.8000 | Largest open model |
| Amazon | Titan Text Premier | Amazon | EFFICIENT | $0.5000 | $1.50 | 32K | $0.8000 | Amazon-built |
| Cohere | Aya Expanse 32B | Multilingual | FRONTIER | $0.5000 | $1.50 | 128K | $0.8000 | 100+ languages |
| Cohere | Aya Expanse 8B | Compact ML | EFFICIENT | $0.5000 | $1.50 | 8K | $0.8000 | Compact multilingual |
| Together AI | Nous-Hermes-2-Yi-34B | Fine-tuned | FRONTIER | $0.8000 | $0.8000 | 4K | $0.8000 | Nous Hermes fine-tune |
| Fireworks AI | Phind-CodeLlama-34B (fw) | Code | EFFICIENT | $0.8000 | $0.8000 | 16K | $0.8000 | Code specialist |
| DeepInfra | Llama 3.1 405B (DI) | Serverless | FRONTIER | $0.8000 | $0.8000 | 128K | $0.8000 | 405B serverless |
| Lepton AI | Llama 3.1 70B (Lepton) | Serverless | EFFICIENT | $0.8000 | $0.8000 | 128K | $0.8000 | Lepton inference |
| 01.AI | Yi-34B-Chat | Open Large | FRONTIER | $0.8000 | $0.8000 | 4K | $0.8000 | 34B chat model |
| 01.AI | Yi-34B | Open Base | FRONTIER | $0.8000 | $0.8000 | 4K | $0.8000 | 34B base model |
| ByteDance | Doubao-Pro-256k | Long Context | FRONTIER | $0.5600 | $1.40 | 256K | $0.8120 | 256K context |
| ByteDance | Doubao-Pro-128k | Frontier | FRONTIER | $0.5600 | $1.40 | 128K | $0.8120 | 128K context |
| Groq | DeepSeek-R1 (Groq) | Fast Reasoning | REASONING | $0.7500 | $0.9900 | 128K | $0.8220 | R1 on LPU |
| Reka AI | Reka Edge | Micro | EFFICIENT | $0.4000 | $2.00 | 128K | $0.8800 | On-device capable |
| Meta | Llama 3.2 90B Vision | Vision | MULTIMODAL | $0.8800 | $0.8800 | 128K | $0.8800 | Vision + language |
| Together AI | Llama 3.3 70B (Together) | Managed | EFFICIENT | $0.8800 | $0.8800 | 128K | $0.8800 | Managed inference |
| Meta | Llama 2 70B Chat | Legacy | EFFICIENT | $0.9000 | $0.9000 | 4K | $0.9000 | Classic Llama 2 |
| Meta | CodeLlama 70B Instruct | Code | EFFICIENT | $0.9000 | $0.9000 | 16K | $0.9000 | Largest CodeLlama |
| Mistral | Mixtral 8x22B | Open MoE | EFFICIENT | $0.9000 | $0.9000 | 64K | $0.9000 | Largest open MoE |
| Together AI | Falcon 40B (Together) | Open | EFFICIENT | $0.9000 | $0.9000 | 2K | $0.9000 | Falcon 40B |
| Fireworks AI | Qwen2.5-72B (fw) | Balanced | FRONTIER | $0.9000 | $0.9000 | 32K | $0.9000 | Qwen serverless |
| Fireworks AI | Yi-34B (fw) | Open | FRONTIER | $0.9000 | $0.9000 | 4K | $0.9000 | Yi on Fireworks |
| Qwen | Qwen2-72B-Instruct | Previous Gen | FRONTIER | $0.9000 | $0.9000 | 128K | $0.9000 | Qwen 2 flagship |
| Qwen | Qwen1.5-72B-Chat | Legacy | FRONTIER | $0.9000 | $0.9000 | 32K | $0.9000 | Qwen 1.5 72B |
| TII / Falcon | Falcon 40B Instruct | Open | EFFICIENT | $0.9000 | $0.9000 | 2K | $0.9000 | Instruction tuned |
| Tencent | Hunyuan-Standard-256K | Long Context | FRONTIER | $0.7000 | $1.40 | 256K | $0.9100 | 256K context |
| Writer | Palmyra X 004 | Frontier | FRONTIER | $0.5000 | $2.00 | 128K | $0.9500 | Enterprise-grade |
| Cerebras | Llama-3.1-405B (Cerebras) | Fast Large | FRONTIER | $0.9900 | $0.9900 | 128K | $0.9900 | 405B wafer-scale |
| Perplexity | Sonar | Efficient | EFFICIENT | $1.00 | $1.00 | 127K | $1.00 | Grounded search |
| Anyscale | Llama 3.1 70B (Anyscale) | Managed | EFFICIENT | $1.00 | $1.00 | 128K | $1.00 | Ray-based inference |
| Anyscale | CodeLlama 70B (Anyscale) | Code | EFFICIENT | $1.00 | $1.00 | 16K | $1.00 | Code-tuned Llama |
| DeepSeek | DeepSeek-R1 | Reasoning | REASONING | $0.5500 | $2.19 | 64K | $1.04 | Chain-of-thought |
| DeepSeek | DeepSeek-R1-0528 | Reasoning | REASONING | $0.5500 | $2.19 | 64K | $1.04 | Latest R1 |
| DeepInfra | DeepSeek-R1 (DI) | Reasoning | REASONING | $0.5500 | $2.19 | 64K | $1.04 | R1 managed |
| Qwen | QwQ-32B | Reasoning | REASONING | $0.6000 | $2.40 | 131K | $1.14 | Chain-of-thought |
| Databricks | DBRX Instruct | Open | FRONTIER | $0.7500 | $2.25 | 32K | $1.20 | 132B open MoE |
| Databricks | DBRX Base | Base | FRONTIER | $0.7500 | $2.25 | 32K | $1.20 | DBRX base model |
| Together AI | Qwen2.5-72B (Together) | Managed | FRONTIER | $1.20 | $1.20 | 32K | $1.20 | Qwen managed |
| Together AI | DBRX Instruct (Together) | Enterprise | FRONTIER | $1.20 | $1.20 | 32K | $1.20 | 132B MoE |
| Together AI | Mixtral 8x22B (Together) | Large MoE | FRONTIER | $1.20 | $1.20 | 64K | $1.20 | Largest Mixtral |
| Together AI | WizardLM-2 8x22B | Fine-tuned | FRONTIER | $1.20 | $1.20 | 64K | $1.20 | Instruction tuned |
| Qwen | Qwen2.5-72B-Instruct | Large | FRONTIER | $1.20 | $1.20 | 128K | $1.20 | 72B open weights |
| Anthropic | Claude Instant 1.2 | Legacy | EFFICIENT | $0.8000 | $2.40 | 100K | $1.28 | Legacy fast model |
| MiniMax | MiniMax-Text-01 | Frontier | FRONTIER | $0.7000 | $2.80 | 1M | $1.33 | 1M context window |
| Amazon | Nova Pro | Frontier | MULTIMODAL | $0.8000 | $3.20 | 300K | $1.52 | Video understanding |
| Nvidia | Llama-3.1-Nemotron-Ultra-253B | Ultra | FRONTIER | $1.60 | $1.60 | 128K | $1.60 | 253B Nemotron |
| Writer | Palmyra X 32k | Balanced | FRONTIER | $1.00 | $3.00 | 32K | $1.60 | Business writing |
| Writer | Palmyra Med | Medical | FRONTIER | $1.00 | $3.00 | 32K | $1.60 | Medical domain |
| Writer | Palmyra Fin | Financial | FRONTIER | $1.00 | $3.00 | 32K | $1.60 | Finance domain |
| Databricks | Llama 3.1 70B (DB) | Managed | EFFICIENT | $1.00 | $3.00 | 128K | $1.60 | Managed on Databricks |
| OpenAI | GPT-3.5 Turbo Instruct | Legacy | EFFICIENT | $1.50 | $2.00 | 4K | $1.65 | Completion-only |
| Anthropic | Claude Haiku 4.5 | Efficient | EFFICIENT | $0.8000 | $4.00 | 200K | $1.76 | Fast & affordable |
| Anthropic | Claude Haiku 3.5 | Previous Gen | EFFICIENT | $0.8000 | $4.00 | 200K | $1.76 | Previous Haiku |
| Reka AI | Reka Flash 21B | Efficient | EFFICIENT | $0.8000 | $4.00 | 128K | $1.76 | Fast multimodal |
| Moonshot AI | Moonshot v1 128k | Long Context | FRONTIER | $1.80 | $1.80 | 128K | $1.80 | 128K window |
| Cohere | Rerank v3.5 | Reranker | EMBEDDING | $2.00 | | | $2.00 | Per search unit |
| Cohere | Rerank v3 Multilingual | Reranker | EMBEDDING | $2.00 | | | $2.00 | Multilingual rerank |
| Qwen | Qwen2-VL-72B | Vision Large | MULTIMODAL | $2.00 | $2.00 | 32K | $2.00 | 72B vision-language |
| Qwen | Qwen1.5-110B-Chat | Legacy Large | FRONTIER | $2.00 | $2.00 | 32K | $2.00 | Qwen 1.5 large |
| OpenAI | o3-mini | Reasoning | REASONING | $1.10 | $4.40 | 200K | $2.09 | Efficient reasoning |
| OpenAI | o4-mini | Reasoning | REASONING | $1.10 | $4.40 | 200K | $2.09 | Latest mini reasoner |
| Writer | Palmyra Vision | Vision | MULTIMODAL | $1.50 | $3.50 | 32K | $2.10 | Multimodal Palmyra |
| Perplexity | Sonar Reasoning | Reasoning | REASONING | $1.00 | $5.00 | 127K | $2.20 | Reasoning search |
| MiniMax | ABAB 6.5s | Balanced | FRONTIER | $1.00 | $5.00 | 8K | $2.20 | Latest ABAB |
| Google | Gemini 1.5 Pro | Previous Gen | FRONTIER | $1.25 | $5.00 | 2M | $2.38 | 2M context |
| Upstage | Solar Pro | Frontier | FRONTIER | $1.50 | $6.00 | 32K | $2.85 | Korean AI flagship |
| Groq | Llama 3.1 405B (Groq) | Fast Large | FRONTIER | $2.99 | $2.99 | 128K | $2.99 | 405B on LPU |
| Fireworks AI | Llama 3.1 405B (fw) | Serverless | FRONTIER | $3.00 | $3.00 | 131K | $3.00 | 405B serverless |
| AI21 Labs | Jurassic-2 Mid | Legacy | EFFICIENT | $3.00 | $3.00 | 8K | $3.00 | Mid-tier J2 |
| 01.AI | Yi-Large | Frontier | FRONTIER | $3.00 | $3.00 | 32K | $3.00 | Yi flagship |
| Qwen | Qwen2.5-Max | Flagship | FRONTIER | $1.60 | $6.40 | 32K | $3.04 | Alibaba flagship |
| Mistral | Mistral Large 2 | Frontier | FRONTIER | $2.00 | $6.00 | 128K | $3.20 | Top European model |
| Mistral | Pixtral Large | Vision | MULTIMODAL | $2.00 | $6.00 | 128K | $3.20 | Multimodal frontier |
| Nvidia | Mistral-Large-2 (NIM) | Enterprise | FRONTIER | $2.00 | $6.00 | 128K | $3.20 | Mistral via NIM |
| Together AI | Llama 3.1 405B (Together) | Managed Large | FRONTIER | $3.50 | $3.50 | 128K | $3.50 | 405B managed |
| 01.AI | Yi-VL-34B | Vision Large | MULTIMODAL | $3.50 | $3.50 | 4K | $3.50 | Yi Vision 34B |
| OpenAI | GPT-4.1 | Frontier | MULTIMODAL | $2.00 | $8.00 | 1M | $3.80 | 1M context, strong coding |
| Perplexity | Sonar Reasoning Pro | Reasoning | REASONING | $2.00 | $8.00 | 127K | $3.80 | R1 + web search |
| Perplexity | Sonar Deep Research | Research | REASONING | $2.00 | $8.00 | 127K | $3.80 | Agentic research |
| AI21 Labs | Jamba 1.5 Large | Frontier | FRONTIER | $2.00 | $8.00 | 256K | $3.80 | SSM + Transformer |
| OpenAI | GPT-5 | Frontier | FRONTIER | $1.25 | $10.00 | 400K | $3.88 | Flagship reasoning + vision |
| Google | Gemini 2.5 Pro | Frontier | FRONTIER | $1.25 | $10.00 | 1M | $3.88 | 1M ctx, thinking |
| Hyperbolic | Llama 3.1 405B (Hyp) | Fast | FRONTIER | $4.00 | $4.00 | 128K | $4.00 | Fast 405B |
| Together AI | DeepSeek-R1 (Together) | Reasoning | REASONING | $3.00 | $7.00 | 64K | $4.20 | R1 managed |
| Nvidia | Nemotron-4-340B | Large | FRONTIER | $4.20 | $4.20 | 4K | $4.20 | 340B NVIDIA |
| xAI | Grok 2 | Previous | FRONTIER | $2.00 | $10.00 | 131K | $4.40 | Previous Grok |
| xAI | Grok 2 Vision | Vision | MULTIMODAL | $2.00 | $10.00 | 32K | $4.40 | Vision model |
| Fireworks AI | DeepSeek-R1 (fw) | Reasoning | REASONING | $3.00 | $8.00 | 64K | $4.50 | R1 serverless |
| Tencent | Hunyuan-Turbo | Frontier | FRONTIER | $3.50 | $7.00 | 256K | $4.55 | Tencent flagship |
| OpenAI | GPT-4o | Flagship | MULTIMODAL | $2.50 | $10.00 | 128K | $4.75 | Omni multimodal (retired ChatGPT Apr 2026) |
| Cohere | Command A | Frontier | FRONTIER | $2.50 | $10.00 | 256K | $4.75 | Latest Command |
| Cohere | Command R+ | Previous | FRONTIER | $2.50 | $10.00 | 128K | $4.75 | Enterprise RAG |
| Moonshot AI | Kimi k1.5 | Reasoning | REASONING | $2.50 | $10.00 | 128K | $4.75 | Long-context reasoning |
| Inflection AI | Pi (Inflection-2.5) | Conversational | FRONTIER | $2.50 | $10.00 | 32K | $4.75 | Empathetic AI |
| Google | PaLM 2 Unicorn | Legacy | FRONTIER | $5.00 | $5.00 | 8K | $5.00 | Large PaLM 2 |
| Perplexity | Sonar Huge | Large | FRONTIER | $5.00 | $5.00 | 127K | $5.00 | 405B based |
| TII / Falcon | Falcon 180B | Large Open | FRONTIER | $3.50 | $10.00 | 4K | $5.45 | Largest Falcon |
| Amazon | Nova Premier | Top-tier | FRONTIER | $2.50 | $12.50 | 300K | $5.50 | Highest intelligence |
| Tencent | Hunyuan-Vision | Vision | MULTIMODAL | $5.50 | $5.50 | 4K | $5.50 | Visual understanding |
| Baidu | ERNIE 4.0 Turbo 8K | Frontier | FRONTIER | $3.50 | $10.50 | 8K | $5.60 | Baidu flagship |
| OpenAI | o1-mini | Reasoning | REASONING | $3.00 | $12.00 | 128K | $5.70 | Mini o-series |
| Anthropic | Claude Sonnet 4.6 | Balanced | MULTIMODAL | $3.00 | $15.00 | 1M | $6.60 | Flagship perf at 1/5 cost |
| Anthropic | Claude Sonnet 4.5 | Balanced | MULTIMODAL | $3.00 | $15.00 | 200K | $6.60 | Vision + extended thinking |
| Anthropic | Claude Sonnet 3.7 | Previous Gen | REASONING | $3.00 | $15.00 | 200K | $6.60 | Hybrid reasoning |
| Anthropic | Claude Sonnet 3.5 | Previous Gen | MULTIMODAL | $3.00 | $15.00 | 200K | $6.60 | Strong at coding |
| xAI | Grok 3 | Frontier | FRONTIER | $3.00 | $15.00 | 131K | $6.60 | Real-time X data |
| Perplexity | Sonar Pro | Frontier | FRONTIER | $3.00 | $15.00 | 200K | $6.60 | With web search |
| Reka AI | Reka Core | Frontier | FRONTIER | $3.00 | $15.00 | 128K | $6.60 | Reka flagship |
| Baidu | ERNIE 4.0 8K | Frontier | FRONTIER | $4.20 | $12.60 | 8K | $6.72 | Standard ERNIE 4.0 |
| Zhipu AI | GLM-4-Plus | Frontier | FRONTIER | $7.00 | $7.00 | 128K | $7.00 | Zhipu flagship |
| Zhipu AI | GLM-4V | Vision | MULTIMODAL | $7.00 | $7.00 | 2K | $7.00 | Vision model |
| xAI | Grok 1.5 | Legacy | FRONTIER | $5.00 | $15.00 | 128K | $8.00 | Early Grok |
| Replicate | Llama 3.1 405B (Rep) | Serverless | FRONTIER | $9.50 | $9.50 | 128K | $9.50 | Per-token serverless |
| Anthropic | Claude Opus 4.6 | Frontier | FRONTIER | $5.00 | $25.00 | 1M | $11.00 | Flagship, 1M context GA |
| xAI | Grok 3 Fast | Balanced | MULTIMODAL | $5.00 | $25.00 | 131K | $11.00 | High throughput |
| Google | Gemini 1.0 Ultra | Legacy | FRONTIER | $7.50 | $22.50 | 32K | $12.00 | Original Ultra |
| Anthropic | Claude 2.1 | Legacy | FRONTIER | $8.00 | $24.00 | 200K | $12.80 | Extended context v2 |
| Anthropic | Claude 2 | Legacy | FRONTIER | $8.00 | $24.00 | 100K | $12.80 | Classic Claude 2 |
| AI21 Labs | Jurassic-2 Ultra | Legacy | FRONTIER | $15.00 | $15.00 | 8K | $15.00 | Classic AI21 model |
| OpenAI | GPT-4 Turbo | Previous Gen | MULTIMODAL | $10.00 | $30.00 | 128K | $16.00 | Vision capable |
| OpenAI | GPT-4 Turbo Preview | Previous Gen | FRONTIER | $10.00 | $30.00 | 128K | $16.00 | Preview version |
| OpenAI | o3 | Reasoning | REASONING | $10.00 | $40.00 | 200K | $19.00 | Advanced reasoning |
| OpenAI | o1 | Reasoning | REASONING | $15.00 | $60.00 | 200K | $28.50 | First o-series |
| OpenAI | o1-preview | Reasoning | REASONING | $15.00 | $60.00 | 128K | $28.50 | o1 preview |
| Anthropic | Claude Opus 4.5 | Previous | FRONTIER | $15.00 | $75.00 | 200K | $33.00 | State-of-the-art |
| Anthropic | Claude Opus 3 | Classic | FRONTIER | $15.00 | $75.00 | 200K | $33.00 | Claude 3 flagship |
| OpenAI | GPT-4 | Classic | FRONTIER | $30.00 | $60.00 | 8K | $39.00 | Original GPT-4 |
| OpenAI | GPT-4-32k | Classic | FRONTIER | $60.00 | $120.00 | 32K | $78.00 | Extended context |
| OpenAI | GPT-4.5 | Near-Frontier | FRONTIER | $75.00 | $150.00 | 128K | $97.50 | Most capable GPT-4 |
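
The page does not state how the blended per-1M-token price is derived. The listed rows appear consistent with a 70% input / 30% output weighting, and rows that list only an input price show a blended price equal to that input price. The sketch below is a minimal check of that inferred rule against a few listed rows. The weights and the `blended_price` helper are assumptions for illustration, not the site's documented method.

```python
from typing import Optional

# Assumed weights, inferred from the listed rows (not stated by the page).
INPUT_WEIGHT = 0.7
OUTPUT_WEIGHT = 0.3


def blended_price(input_per_m: float, output_per_m: Optional[float] = None) -> float:
    """Return a blended $/1M-token price under the assumed 70/30 weighting.

    Rows with no output price (embeddings, rerankers, audio, image)
    are treated as blended == input, matching what the listing shows.
    """
    if output_per_m is None:
        return input_per_m
    return INPUT_WEIGHT * input_per_m + OUTPUT_WEIGHT * output_per_m


# (model, input $/M, output $/M, blended $/M as listed)
ROWS = [
    ("Granite 3.1 2B Instruct", 0.03, 0.10, 0.0510),
    ("Nova Micro", 0.035, 0.14, 0.0665),
    ("DeepSeek-V3", 0.27, 1.10, 0.5190),
    ("Claude Opus 4.6", 5.00, 25.00, 11.00),
    ("GPT-4.5", 75.00, 150.00, 97.50),
    ("text-embedding-3-small", 0.02, None, 0.0200),
]

if __name__ == "__main__":
    for model, inp, out, listed in ROWS:
        computed = blended_price(inp, out)
        print(f"{model}: computed {computed:.4f}, listed {listed:.4f}")
```

A different workload mix (for example, output-heavy generation) would call for different weights, so the blended column is best read as a rough ranking aid rather than a cost forecast.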