# Providers & Models

Whisp supports five AI providers (OpenAI, Anthropic, Ollama, Gemini, and Cerebras), giving you the flexibility to choose based on your needs.
## Choosing a Model

Pick based on your primary use case:
| Use Case | Recommended | Why |
|---|---|---|
| Daily shell commands | gpt-4o-mini, gemini-1.5-flash | Fast, cheap, good quality |
| Complex debugging | claude-sonnet, gpt-4o | Better reasoning for hard problems |
| Privacy-sensitive work | llama3.2 (Ollama) | All data stays local |
| Offline/air-gapped | llama3.2 (Ollama) | No network required |
| High volume automation | gemini-1.5-flash | Lowest cost per request |
| Best quality (cost no object) | claude-opus, gpt-4o | Maximum capability |
## Context Windows

The context window determines how much information whisp can send to the model:
| Model | Context Window |
|---|---|
| gpt-4o | 128K tokens |
| gpt-4o-mini | 128K tokens |
| claude-sonnet | 200K tokens |
| claude-opus | 200K tokens |
| gemini-1.5-flash | 1M tokens |
| gemini-1.5-pro | 2M tokens |
| llama3.2 | 8K tokens |
For whisp shell commands, even 8K is plenty. Larger windows matter for:
- Long piped input (log files, code analysis)
- Extended chat conversations
- Complex multi-file context
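A common heuristic is that one token is roughly four characters of English text. Under that assumption (an approximation, not whisp's actual tokenizer), here is a quick sketch for checking whether piped input fits a given window; the function name and the reserved-token budget are illustrative choices:

```python
def fits_context(text: str, context_tokens: int, reserve: int = 1024) -> bool:
    """Rough check that `text` fits in a model's context window.

    Estimates tokens with the ~4 characters/token heuristic and
    reserves `reserve` tokens for the prompt template and the reply.
    """
    est_tokens = len(text) // 4
    return est_tokens + reserve <= context_tokens

# A 1 MB log (~250K estimated tokens) overflows llama3.2's 8K window
# but fits easily in gemini-1.5-flash's 1M-token window.
log = "x" * 1_000_000
print(fits_context(log, 8_000))      # → False
print(fits_context(log, 1_000_000))  # → True
```

For precise counts you would need the provider's own tokenizer, but this ballpark is enough to decide whether to reach for a large-context model.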
## Provider Comparison
| Provider | Type | Default Model | API Key Required | Best For |
|---|---|---|---|---|
| OpenAI | Cloud | gpt-5-nano-2025-08-07 | Yes | Fast, reliable, good balance |
| Anthropic | Cloud | claude-sonnet-4-20250514 | Yes | Complex reasoning, safety |
| Ollama | Local | llama3.2 | No | Privacy, offline use |
| Gemini | Cloud | gemini-1.5-flash | Yes | Large context, multimodal |
| Cerebras | Cloud | gpt-oss-120b | Yes | Fast inference |
## OpenAI
The default provider. Excellent balance of speed, quality, and reliability.
### Setup

```shell
export OPENAI_API_KEY="sk-..."
```

Or in `~/.config/whisp/config.toml`:

```toml
default_provider = "openai"

[providers.openai]
api_key = "sk-..."
model = "gpt-5-nano-2025-08-07"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| gpt-5-nano-2025-08-07 | Fast | Good | Daily use (default) |
| gpt-4o | Fast | Excellent | Complex commands |
| gpt-4o-mini | Fast | Good | Budget-friendly |
Get your API key at platform.openai.com
## Anthropic (Claude)
Claude models excel at complex reasoning and following nuanced instructions.
### Setup

```shell
export ANTHROPIC_API_KEY="sk-ant-..."
export WHISP_PROVIDER="anthropic"
```

Or in config:

```toml
default_provider = "anthropic"

[providers.anthropic]
api_key = "sk-ant-..."
model = "claude-sonnet-4-20250514"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| claude-sonnet-4-20250514 | Fast | Excellent | Daily use (default) |
| claude-opus-4-20250514 | Slower | Best | Complex analysis |
| claude-haiku-3-20240307 | Fastest | Good | Simple queries |
Get your API key at console.anthropic.com
## Ollama (Local)
Run AI models locally for privacy and offline use. No API key required.
### Setup

1. Install Ollama from ollama.ai
2. Pull a model:

   ```shell
   ollama pull llama3.2
   ```

3. Configure whisp:

   ```shell
   export WHISP_PROVIDER="ollama"
   export OLLAMA_URL="http://localhost:11434"  # default
   ```

Or in config:

```toml
default_provider = "ollama"

[providers.ollama]
url = "http://localhost:11434"
model = "llama3.2"
```

### Recommended Models
| Model | Size | Quality | Best For |
|---|---|---|---|
| llama3.2 | 3B | Good | Fast, low resources (default) |
| llama3.2:7b | 7B | Better | Balanced |
| codellama | 7B | Good | Code-focused |
| mistral | 7B | Good | General use |
Pull models with `ollama pull <model-name>`.
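Ollama serves a local HTTP API, and `GET /api/tags` lists the models you have pulled. Here is a small Python sketch (the helper name and error handling are illustrative, not part of whisp) to confirm the server is reachable before pointing whisp at it:

```python
import json
import urllib.error
import urllib.request

def list_ollama_models(url: str = "http://localhost:11434"):
    """Return installed Ollama model names, or None if the server
    at `url` is not reachable."""
    try:
        with urllib.request.urlopen(f"{url}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not running; start it and retry")
    else:
        print("Installed models:", ", ".join(models) or "(none)")
```

If this returns `None`, start the Ollama server before switching whisp's provider.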
## Google Gemini
Google's AI with large context windows and multimodal capabilities.
### Setup

```shell
export GOOGLE_API_KEY="..."
export WHISP_PROVIDER="gemini"
```

Or in config:

```toml
default_provider = "gemini"

[providers.gemini]
api_key = "..."
model = "gemini-1.5-flash"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| gemini-1.5-flash | Fast | Good | Daily use (default) |
| gemini-1.5-pro | Medium | Excellent | Complex queries |
| gemini-2.0-flash | Fast | Excellent | Latest features |
Get your API key at aistudio.google.com
## Cerebras
High-speed inference for quick responses.
### Setup

```shell
export CEREBRAS_API_KEY="csk-..."
export WHISP_PROVIDER="cerebras"
```

Or in config:

```toml
default_provider = "cerebras"

[providers.cerebras]
api_key = "csk-..."
model = "gpt-oss-120b"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| gpt-oss-120b | Very Fast | Good | Daily use (default) |
Get your API key at cloud.cerebras.ai
## Switching Providers

### Using the CLI
```shell
# Switch to Anthropic
whisp config set provider anthropic

# Switch to Ollama for local use
whisp config set provider ollama

# Check current provider
whisp config get provider
```

After switching, restart the daemon:

```shell
whisp restart
```

### Using `whisp providers`

List all providers and their status:

```shell
whisp providers
```

Output shows which provider is active and each API key's status:
```text
➜ OpenAI     gpt-5-nano-2025-08-07     API key: ✓
  Anthropic  claude-sonnet-4-20250514  API key: ✗
  Ollama     llama3.2                  (no key needed)
  Gemini     gemini-1.5-flash          API key: ✗
  Cerebras   gpt-oss-120b              API key: ✗
```

### Changing Models
Change the model for your current provider:
```shell
# Set model
whisp config set model gpt-4o

# Check current model
whisp config get model
```

Or use an environment variable for a temporary override:

```shell
WHISP_MODEL=gpt-4o , explain this complex codebase
```

## Cost Considerations
Cloud providers charge per token. Pricing is based on input (what you send) and output (what you receive) tokens.
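As a worked example (the token counts are hypothetical), a request with a 400-token prompt and a 100-token reply, at gpt-4o's approximate rates of $2.50 in and $10.00 out per million tokens, costs about a fifth of a cent:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """Dollar cost of one request at the given per-1M-token rates."""
    return (input_tokens * input_per_1m
            + output_tokens * output_per_1m) / 1_000_000

# 400-token prompt + 100-token reply at approximate gpt-4o rates
print(request_cost(400, 100, 2.50, 10.00))  # → 0.002
```

Note that output tokens are billed at a higher rate than input tokens on every cloud provider listed here, so verbose replies dominate the bill.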
### Per-Token Pricing
| Provider | Model | Input $/1M | Output $/1M |
|---|---|---|---|
| OpenAI | gpt-5-nano | ~$0.10 | ~$0.40 |
| OpenAI | gpt-4o-mini | ~$0.15 | ~$0.60 |
| OpenAI | gpt-4o | ~$2.50 | ~$10.00 |
| Anthropic | claude-haiku | ~$0.25 | ~$1.25 |
| Anthropic | claude-sonnet | ~$3.00 | ~$15.00 |
| Anthropic | claude-opus | ~$15.00 | ~$75.00 |
| Gemini | gemini-1.5-flash | ~$0.075 | ~$0.30 |
| Gemini | gemini-1.5-pro | ~$1.25 | ~$5.00 |
| Cerebras | gpt-oss-120b | Varies | Varies |
| Ollama | any | Free | Free |
### Estimated Cost per 1000 Requests

Based on typical whisp usage (~500 tokens per request):
| Provider | Model | Est. Cost |
|---|---|---|
| Gemini | gemini-1.5-flash | ~$0.04 |
| OpenAI | gpt-5-nano | ~$0.05 |
| OpenAI | gpt-4o-mini | ~$0.08 |
| Anthropic | claude-haiku | ~$0.13 |
| Anthropic | claude-sonnet | ~$1.50 |
| OpenAI | gpt-4o | ~$2.50 |
| Ollama | any | Free (local) |
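These estimates follow from simple arithmetic: ~500 tokens per request, billed (as a simplification) entirely at the input rate, so real costs run somewhat higher once output tokens are counted. A sketch, with prices hard-coded from the table above and certain to drift over time:

```python
# Approximate input prices in $ per 1M tokens (from the pricing table).
INPUT_PRICE_PER_1M = {
    "gemini-1.5-flash": 0.075,
    "gpt-5-nano": 0.10,
    "gpt-4o-mini": 0.15,
    "claude-haiku": 0.25,
    "claude-sonnet": 3.00,
}

def est_cost(model: str, requests: int = 1000,
             tokens_per_request: int = 500) -> float:
    """Estimated dollar cost, billing every token at the input rate."""
    total_tokens = requests * tokens_per_request
    return total_tokens * INPUT_PRICE_PER_1M[model] / 1_000_000

for model in INPUT_PRICE_PER_1M:
    print(f"{model}: ~${est_cost(model):.3f} per 1000 requests")
```

Doubling your average request size doubles these figures, which is worth keeping in mind before piping large files into a premium model.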
Track your usage with `whisp metrics` to monitor token consumption.
Prices change frequently. Check provider pricing pages for current rates. Whisp automatically fetches current pricing from LiteLLM's database.