Model Pricing and Supported Providers
This page documents the model pricing chart used by NexaGauge for cost estimation.
All prices below are in USD per 1,000 tokens.
Cost Chart (Used by NexaGauge)
OpenAI
| Model | Input token / 1000 (USD) | Output token / 1000 (USD) |
|---|---|---|
| openai/gpt-4o | 0.002500 | 0.010000 |
| openai/gpt-4o-mini | 0.000150 | 0.000600 |
| openai/gpt-4-turbo | 0.010000 | 0.030000 |
| openai/gpt-4 | 0.030000 | 0.060000 |
| openai/gpt-3.5-turbo | 0.000500 | 0.001500 |
| openai/o1 | 0.015000 | 0.060000 |
| openai/o1-mini | 0.001100 | 0.004400 |
Anthropic
| Model | Input token / 1000 (USD) | Output token / 1000 (USD) |
|---|---|---|
| anthropic/claude-opus-4-6 | 0.015000 | 0.075000 |
| anthropic/claude-sonnet-4-6 | 0.003000 | 0.015000 |
| anthropic/claude-3-5-sonnet-20241022 | 0.003000 | 0.015000 |
| anthropic/claude-3-5-haiku-20241022 | 0.001000 | 0.005000 |
| anthropic/claude-3-haiku-20240307 | 0.000250 | 0.001250 |
| anthropic/claude-3-opus-20240229 | 0.015000 | 0.075000 |
Gemini
| Model | Input token / 1000 (USD) | Output token / 1000 (USD) |
|---|---|---|
| gemini/gemini-2.0-flash | 0.000100 | 0.000400 |
| gemini/gemini-2.0-flash-lite | 0.000075 | 0.000300 |
| gemini/gemini-2.5-flash | 0.000300 | 0.002500 |
| gemini/gemini-2.5-flash-lite | 0.000100 | 0.000400 |
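To illustrate how the per-1,000-token prices above turn into a dollar figure for a single request, here is a minimal sketch of the cost arithmetic. The function name and the inline price dictionary are illustrative, not NexaGauge's actual code; the rates are copied from the chart above.

```python
# Illustrative cost arithmetic using per-1K-token rates from the chart.
# This is a sketch, not the actual NexaGauge implementation.

PRICES_PER_1K = {
    # model: (input USD / 1K tokens, output USD / 1K tokens)
    "openai/gpt-4o-mini": (0.000150, 0.000600),
    "anthropic/claude-3-5-haiku-20241022": (0.001000, 0.005000),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost: tokens / 1000 * rate, input and output priced separately."""
    in_rate, out_rate = PRICES_PER_1K[model]
    return (input_tokens / 1000) * in_rate + (output_tokens / 1000) * out_rate

# Example: 2,000 input + 500 output tokens on gpt-4o-mini
# = 2.0 * 0.00015 + 0.5 * 0.0006 = 0.0006 USD
cost = estimate_cost("openai/gpt-4o-mini", 2000, 500)
```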
Notes
- Prices are maintained in code at packages/nexagauge-graph/ng_graph/llm/pricing.py.
- If a model is not in this curated chart, NexaGauge tries a tokencost lookup and then falls back to a conservative default rate.
- For llama.cpp and other OpenAI-compatible endpoint routing via CLI flags, see Self-Hosted Endpoints.
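The lookup order described in the notes (curated chart first, then a tokencost-style lookup, then a conservative default) can be sketched as follows. The names `CURATED_CHART`, `lookup_tokencost`, `DEFAULT_RATE`, and `price_for` are hypothetical stand-ins, not NexaGauge's actual identifiers, and the stub does not call the real tokencost package.

```python
# Sketch of the chart -> tokencost -> default fallback chain.
# All names here are illustrative; rates are (input, output) USD per 1K tokens.

CURATED_CHART = {
    "openai/gpt-4o": (0.002500, 0.010000),
}

# A deliberately pessimistic rate so unknown models are never under-billed.
DEFAULT_RATE = (0.015000, 0.075000)

def lookup_tokencost(model: str):
    """Stand-in for an external tokencost lookup; returns None when unknown."""
    return None  # real code would consult the tokencost package here

def price_for(model: str):
    if model in CURATED_CHART:          # 1. curated chart wins
        return CURATED_CHART[model]
    external = lookup_tokencost(model)  # 2. external lookup
    if external is not None:
        return external
    return DEFAULT_RATE                 # 3. conservative default
```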