# Config File

Configure whisp using the config file at `~/.config/whisp/config.toml`.
## Location

Whisp loads configuration from `~/.config/whisp/config.toml`. Environment variables override config file values.
## Full Example

```toml
# ~/.config/whisp/config.toml

# Default provider: openai, anthropic, ollama, gemini, cerebras
default_provider = "openai"

# Provider-specific configuration
[providers.openai]
api_key = "sk-..." # Or use OPENAI_API_KEY env var
model = "gpt-5-nano-2025-08-07" # Default model
# base_url = "https://custom.api.com" # Optional: custom endpoint

[providers.anthropic]
api_key = "sk-ant-..." # Or use ANTHROPIC_API_KEY env var
model = "claude-sonnet-4-20250514" # Default model
# base_url = "https://api.anthropic.com"

[providers.ollama]
url = "http://localhost:11434" # Ollama server URL
model = "llama3.2" # Default model

[providers.gemini]
api_key = "..." # Or use GOOGLE_API_KEY env var
model = "gemini-1.5-flash" # Default model

[providers.cerebras]
api_key = "csk-..." # Or use CEREBRAS_API_KEY env var
model = "gpt-oss-120b" # Default model

# Resilience settings (rate limiting and retries)
[resilience]
enabled = true # Enable retry logic
max_retries = 3 # Maximum retry attempts
initial_delay_ms = 1000 # Initial retry delay (ms)
max_delay_ms = 30000 # Maximum delay between retries (ms)
requests_per_minute = 60 # Rate limit (omit for unlimited)

# Provider-specific resilience overrides
[providers.openai.resilience]
max_retries = 5
requests_per_minute = 120

[providers.ollama.resilience]
enabled = false # No retries needed for local
```

## Section Reference
### Root Level

| Key | Type | Default | Description |
|---|---|---|---|
| `default_provider` | string | `openai` | Active AI provider |
### `[providers.openai]`

| Key | Type | Default | Description |
|---|---|---|---|
| `api_key` | string | — | OpenAI API key |
| `model` | string | `gpt-5-nano-2025-08-07` | Model to use |
| `base_url` | string | — | Custom API endpoint |
### `[providers.anthropic]`

| Key | Type | Default | Description |
|---|---|---|---|
| `api_key` | string | — | Anthropic API key |
| `model` | string | `claude-sonnet-4-20250514` | Model to use |
| `base_url` | string | `https://api.anthropic.com` | API endpoint |
### `[providers.ollama]`

| Key | Type | Default | Description |
|---|---|---|---|
| `url` | string | `http://localhost:11434` | Ollama server URL |
| `model` | string | `llama3.2` | Model to use |
### `[providers.gemini]`

| Key | Type | Default | Description |
|---|---|---|---|
| `api_key` | string | — | Google API key |
| `model` | string | `gemini-1.5-flash` | Model to use |
| `base_url` | string | — | Custom API endpoint |
### `[providers.cerebras]`

| Key | Type | Default | Description |
|---|---|---|---|
| `api_key` | string | — | Cerebras API key |
| `model` | string | `gpt-oss-120b` | Model to use |
| `base_url` | string | `https://api.cerebras.ai` | API endpoint |
### `[resilience]`

Configure retry behavior and rate limiting:

| Key | Type | Default | Description |
|---|---|---|---|
| `enabled` | bool | `true` | Enable resilience features |
| `max_retries` | int | `3` | Maximum retry attempts |
| `initial_delay_ms` | int | `1000` | Initial retry delay (ms) |
| `max_delay_ms` | int | `30000` | Max delay between retries (ms) |
| `requests_per_minute` | int | — | Rate limit (omit for unlimited) |
Retries use exponential backoff with jitter.

Retryable errors (retried up to `max_retries` times):

- HTTP 429 (rate limit exceeded)
- HTTP 5xx (server errors: 500, 502, 503, 504)
- Connection timeouts
- Network connection failures

Non-retryable errors (fail immediately):

- HTTP 400 (bad request)
- HTTP 401 (unauthorized - check your API key)
- HTTP 403 (forbidden)
- HTTP 404 (not found - check model name)

Backoff strategy: each retry waits longer than the last. The delay starts at `initial_delay_ms` and doubles on each attempt (with random jitter), up to `max_delay_ms`. For example, with defaults: 1s → 2s → 4s → ... → max 30s.
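The delay schedule can be sketched as follows. The ±10% jitter factor is an assumption for illustration; whisp's exact jitter formula isn't specified here:

```python
import random

def backoff_delays(max_retries=3, initial_delay_ms=1000,
                   max_delay_ms=30000, jitter=0.1):
    """Exponential backoff: the delay doubles each attempt, capped at
    max_delay_ms, with random jitter applied to each wait."""
    delays = []
    delay = initial_delay_ms
    for _ in range(max_retries):
        jittered = delay * (1 + random.uniform(-jitter, jitter))
        delays.append(min(jittered, max_delay_ms))
        delay = min(delay * 2, max_delay_ms)
    return delays

print(backoff_delays())  # roughly [1000, 2000, 4000], each within ±10%
```

Jitter spreads out retries from concurrent clients so they don't all hit the server again at the same instant.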
## Resilience Examples

High-throughput setup (for heavy users):

```toml
[resilience]
enabled = true
max_retries = 5
requests_per_minute = 200
```

Budget-conscious setup (minimize retries to avoid extra costs):

```toml
[resilience]
enabled = true
max_retries = 1
initial_delay_ms = 2000
```

Local-first with cloud backup:

```toml
default_provider = "ollama"

[providers.ollama.resilience]
enabled = false # Local, no retries needed

[providers.openai.resilience]
enabled = true
max_retries = 3
```

## Troubleshooting 429 Errors
If you're hitting rate limits:

- Lower your rate limit: reduce `requests_per_minute`
- Increase retry delays: bump `initial_delay_ms` and `max_delay_ms`
- Switch to local: use Ollama for unlimited requests
- Upgrade API tier: some providers have higher limits on paid tiers
```toml
# Conservative rate limiting to avoid 429s
[resilience]
requests_per_minute = 30
initial_delay_ms = 2000
max_delay_ms = 60000
```

## Provider-Specific Notes
- Ollama: Defaults to `enabled = false` (local provider doesn't need retries)
- OpenAI: Consider rate limits based on your API tier
- Anthropic: Has stricter rate limits on free tier
## Switching Providers

Change your active provider:

```
# Using CLI
whisp config set provider anthropic

# Or edit config.toml directly
default_provider = "anthropic"
```

After changing providers, restart the daemon:

```shell
whisp restart
```

## Validating Configuration
Check your configuration for errors:

```shell
whisp config validate
```

This verifies:

- Config file syntax is valid TOML
- Provider is recognized
- API key is configured for the selected provider
- Model is set
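The checks above can be sketched in a few lines. This mirrors the documented behavior of `whisp config validate`; the function name and error strings are illustrative, not whisp's internals:

```python
def validate_config(config: dict) -> list[str]:
    """Return a list of problems with an already-parsed config dict."""
    errors = []
    known = {"openai", "anthropic", "ollama", "gemini", "cerebras"}
    provider = config.get("default_provider")
    if provider not in known:
        errors.append(f"unknown provider: {provider!r}")
        return errors
    pconf = config.get("providers", {}).get(provider, {})
    # Ollama is local and uses `url` instead of an API key
    if provider != "ollama" and not pconf.get("api_key"):
        errors.append(f"no api_key configured for {provider}")
    if not pconf.get("model"):
        errors.append(f"no model set for {provider}")
    return errors

ok = {"default_provider": "openai",
      "providers": {"openai": {"api_key": "sk-x", "model": "gpt-5-nano-2025-08-07"}}}
print(validate_config(ok))  # → []
```

TOML syntax itself would be checked earlier, at parse time (e.g. by the TOML parser raising an error before this function runs).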
## Security Note

Your config file contains API keys. Ensure proper permissions:

```shell
chmod 600 ~/.config/whisp/config.toml
```

Whisp will warn you if the config file is readable by others.
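A sketch of what "readable by others" means in terms of file mode bits (whisp's own warning logic isn't shown here; this only illustrates the check on POSIX systems):

```python
import os
import stat
import tempfile

def is_world_readable(path: str) -> bool:
    """True if the group or others have read permission on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))

# Demo on a throwaway file
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
os.chmod(path, 0o600)
print(is_world_readable(path))  # → False (owner-only, like chmod 600)
os.chmod(path, 0o644)
print(is_world_readable(path))  # → True (group/others can read)
os.remove(path)
```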