# Providers & Models

Whisp supports five AI providers (OpenAI, Anthropic, Ollama, Gemini, and Cerebras), giving you the flexibility to choose based on your needs.
## Choosing a Model

Pick based on your primary use case:
| Use Case | Recommended | Why |
|---|---|---|
| Daily shell commands | gpt-4o-mini, gemini-1.5-flash | Fast, cheap, good quality |
| Complex debugging | claude-sonnet, gpt-4o | Better reasoning for hard problems |
| Privacy-sensitive work | llama3.2 (Ollama) | All data stays local |
| Offline/air-gapped | llama3.2 (Ollama) | No network required |
| High volume automation | gemini-1.5-flash | Lowest cost per request |
| Best quality (cost no object) | claude-opus, gpt-4o | Maximum capability |
## Context Windows

The context window determines how much information whisp can send to the model:
| Model | Context Window |
|---|---|
| gpt-4o | 128K tokens |
| gpt-4o-mini | 128K tokens |
| claude-sonnet | 200K tokens |
| claude-opus | 200K tokens |
| gemini-1.5-flash | 1M tokens |
| gemini-1.5-pro | 2M tokens |
| llama3.2 | 8K tokens |
For whisp shell commands, even 8K is plenty. Larger windows matter for:
- Long piped input (log files, code analysis)
- Extended chat conversations
- Complex multi-file context
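A common heuristic is that one token is roughly four characters of English text. Under that assumption (an approximation, not whisp's actual tokenizer), here is a quick sketch for checking whether piped input fits a given window; the function name and the reserved-token budget are illustrative choices:

```python
def fits_context(text: str, context_tokens: int, reserve: int = 1024) -> bool:
    """Rough check that `text` fits in a model's context window.

    Estimates tokens with the ~4 characters/token heuristic and
    reserves `reserve` tokens for the prompt template and the reply.
    """
    est_tokens = len(text) // 4
    return est_tokens + reserve <= context_tokens

# A 1 MB log (~250K estimated tokens) overflows llama3.2's 8K window
# but fits easily in gemini-1.5-flash's 1M-token window.
log = "x" * 1_000_000
print(fits_context(log, 8_000))      # → False
print(fits_context(log, 1_000_000))  # → True
```

For precise counts you would need the provider's own tokenizer, but this ballpark is enough to decide whether to reach for a large-context model.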
## Provider Comparison
| Provider | Type | Default Model | API Key Required | Best For |
|---|---|---|---|---|
| OpenAI | Cloud | gpt-5-nano-2025-08-07 | Yes | Fast, reliable, good balance |
| Anthropic | Cloud | claude-sonnet-4-20250514 | Yes | Complex reasoning, safety |
| Ollama | Local | llama3.2 | No | Privacy, offline use |
| Gemini | Cloud | gemini-1.5-flash | Yes | Large context, multimodal |
| Cerebras | Cloud | gpt-oss-120b | Yes | Fast inference |
## OpenAI
The default provider. Excellent balance of speed, quality, and reliability.
### Setup

```shell
export OPENAI_API_KEY="sk-..."
```

Or in `~/.config/whisp/config.toml`:

```toml
default_provider = "openai"

[providers.openai]
api_key = "sk-..."
model = "gpt-5-nano-2025-08-07"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| gpt-5-nano-2025-08-07 | Fast | Good | Daily use (default) |
| gpt-4o | Fast | Excellent | Complex commands |
| gpt-4o-mini | Fast | Good | Budget-friendly |
Get your API key at platform.openai.com
## Anthropic (Claude)
Claude models excel at complex reasoning and following nuanced instructions.
### Setup

```shell
export ANTHROPIC_API_KEY="sk-ant-..."
export WHISP_PROVIDER="anthropic"
```

Or in config:

```toml
default_provider = "anthropic"

[providers.anthropic]
api_key = "sk-ant-..."
model = "claude-sonnet-4-20250514"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| claude-sonnet-4-20250514 | Fast | Excellent | Daily use (default) |
| claude-opus-4-20250514 | Slower | Best | Complex analysis |
| claude-haiku-3-20240307 | Fastest | Good | Simple queries |
Get your API key at console.anthropic.com
## Ollama (Local)
Run AI models locally for privacy and offline use. No API key required.
### Setup

1. Install Ollama from ollama.ai
2. Pull a model:

   ```shell
   ollama pull llama3.2
   ```

3. Configure whisp:

   ```shell
   export WHISP_PROVIDER="ollama"
   export OLLAMA_URL="http://localhost:11434"  # default
   ```

Or in config:

```toml
default_provider = "ollama"

[providers.ollama]
url = "http://localhost:11434"
model = "llama3.2"
```

### Recommended Models
| Model | Size | Quality | Best For |
|---|---|---|---|
| llama3.2 | 3B | Good | Fast, low resources (default) |
| llama3.2:7b | 7B | Better | Balanced |
| codellama | 7B | Good | Code-focused |
| mistral | 7B | Good | General use |
Pull models with `ollama pull <model-name>`.
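Ollama serves a local HTTP API, and `GET /api/tags` lists the models you have pulled. Here is a small Python sketch (the helper name and error handling are illustrative, not part of whisp) to confirm the server is reachable before pointing whisp at it:

```python
import json
import urllib.error
import urllib.request

def list_ollama_models(url: str = "http://localhost:11434"):
    """Return installed Ollama model names, or None if the server
    at `url` is not reachable."""
    try:
        with urllib.request.urlopen(f"{url}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not running; start it and retry")
    else:
        print("Installed models:", ", ".join(models) or "(none)")
```

If this returns `None`, start the Ollama server before switching whisp's provider.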
## Google Gemini
Google's AI with large context windows and multimodal capabilities.
### Setup

```shell
export GOOGLE_API_KEY="..."
export WHISP_PROVIDER="gemini"
```

Or in config:

```toml
default_provider = "gemini"

[providers.gemini]
api_key = "..."
model = "gemini-1.5-flash"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| gemini-1.5-flash | Fast | Good | Daily use (default) |
| gemini-1.5-pro | Medium | Excellent | Complex queries |
| gemini-2.0-flash | Fast | Excellent | Latest features |
Get your API key at aistudio.google.com
## Cerebras
High-speed inference for quick responses.
### Setup

```shell
export CEREBRAS_API_KEY="csk-..."
export WHISP_PROVIDER="cerebras"
```

Or in config:

```toml
default_provider = "cerebras"

[providers.cerebras]
api_key = "csk-..."
model = "gpt-oss-120b"
```

### Available Models
| Model | Speed | Quality | Best For |
|---|---|---|---|
| gpt-oss-120b | Very Fast | Good | Daily use (default) |
Get your API key at cloud.cerebras.ai
## Switching Providers

### Using the CLI
```shell
# Switch to Anthropic
whisp config set provider anthropic

# Switch to Ollama for local use
whisp config set provider ollama

# Check current provider
whisp config get provider
```

After switching, restart the daemon:

```shell
whisp restart
```

### Using `whisp providers`

List all providers and their status:

```shell
whisp providers
```

Output shows which provider is active and each API key's status:
```text
➜ OpenAI     gpt-5-nano-2025-08-07     API key: ✓
  Anthropic  claude-sonnet-4-20250514  API key: ✗
  Ollama     llama3.2                  (no key needed)
  Gemini     gemini-1.5-flash          API key: ✗
  Cerebras   gpt-oss-120b              API key: ✗
```

### Changing Models
Change the model for your current provider:
```shell
# Set model
whisp config set model gpt-4o

# Check current model
whisp config get model
```

Or use an environment variable for a temporary override:

```shell
WHISP_MODEL=gpt-4o , explain this complex codebase
```

## Cost Considerations
Cloud providers charge per token. Pricing is based on input (what you send) and output (what you receive) tokens.
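As a worked example (the token counts are hypothetical), a request with a 400-token prompt and a 100-token reply, at gpt-4o's approximate rates of $2.50 in and $10.00 out per million tokens, costs about a fifth of a cent:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """Dollar cost of one request at the given per-1M-token rates."""
    return (input_tokens * input_per_1m
            + output_tokens * output_per_1m) / 1_000_000

# 400-token prompt + 100-token reply at approximate gpt-4o rates
print(request_cost(400, 100, 2.50, 10.00))  # → 0.002
```

Note that output tokens are billed at a higher rate than input tokens on every cloud provider listed here, so verbose replies dominate the bill.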
### Per-Token Pricing
| Provider | Model | Input $/1M | Output $/1M |
|---|---|---|---|
| OpenAI | gpt-5-nano | ~$0.10 | ~$0.40 |
| OpenAI | gpt-4o-mini | ~$0.15 | ~$0.60 |
| OpenAI | gpt-4o | ~$2.50 | ~$10.00 |
| Anthropic | claude-haiku | ~$0.25 | ~$1.25 |
| Anthropic | claude-sonnet | ~$3.00 | ~$15.00 |
| Anthropic | claude-opus | ~$15.00 | ~$75.00 |
| Gemini | gemini-1.5-flash | ~$0.075 | ~$0.30 |
| Gemini | gemini-1.5-pro | ~$1.25 | ~$5.00 |
| Cerebras | gpt-oss-120b | Varies | Varies |
| Ollama | any | Free | Free |
### Estimated Cost per 1000 Requests

Based on typical whisp usage (~500 tokens per request):
| Provider | Model | Est. Cost |
|---|---|---|
| Gemini | gemini-1.5-flash | ~$0.04 |
| OpenAI | gpt-5-nano | ~$0.05 |
| OpenAI | gpt-4o-mini | ~$0.08 |
| Anthropic | claude-haiku | ~$0.13 |
| Anthropic | claude-sonnet | ~$1.50 |
| OpenAI | gpt-4o | ~$2.50 |
| Ollama | any | Free (local) |
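These estimates follow from simple arithmetic: ~500 tokens per request, billed (as a simplification) entirely at the input rate, so real costs run somewhat higher once output tokens are counted. A sketch, with prices hard-coded from the table above and certain to drift over time:

```python
# Approximate input prices in $ per 1M tokens (from the pricing table).
INPUT_PRICE_PER_1M = {
    "gemini-1.5-flash": 0.075,
    "gpt-5-nano": 0.10,
    "gpt-4o-mini": 0.15,
    "claude-haiku": 0.25,
    "claude-sonnet": 3.00,
}

def est_cost(model: str, requests: int = 1000,
             tokens_per_request: int = 500) -> float:
    """Estimated dollar cost, billing every token at the input rate."""
    total_tokens = requests * tokens_per_request
    return total_tokens * INPUT_PRICE_PER_1M[model] / 1_000_000

for model in INPUT_PRICE_PER_1M:
    print(f"{model}: ~${est_cost(model):.3f} per 1000 requests")
```

Doubling your average request size doubles these figures, which is worth keeping in mind before piping large files into a premium model.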
Track your usage with `whisp metrics` to monitor token consumption.
Prices change frequently. Check provider pricing pages for current rates. Whisp automatically fetches current pricing from LiteLLM's database.