# Configuration Reference

Complete configuration guide for StrataRouter Core and Runtime.
## Configuration Files

### File Structure

```
config/
├── default.toml      # Default configuration
├── development.toml  # Development overrides
├── production.toml   # Production overrides
└── test.toml         # Test configuration
```
### Loading Configuration

Rust:

```rust
use stratarouter_runtime::Config;

// Load with environment
let config = Config::load("production")?;

// Load from a specific file
let config = Config::from_file("config/custom.toml")?;
```

Python:

```python
from stratarouter_runtime import Config

# Load with environment
config = Config.load("production")

# Load from a specific file
config = Config.from_file("config/custom.toml")
```
## Server Configuration

HTTP server settings.

```toml
[server]
host = "0.0.0.0"               # Bind address
port = 8080                    # HTTP port
workers = 4                    # Number of worker threads
shutdown_timeout_seconds = 30  # Graceful shutdown timeout
```
Environment Variables:

```bash
export SERVER_PORT="8080"
```
## Database Configuration

PostgreSQL database settings.

```toml
[database]
url = "postgresql://localhost/stratarouter"
max_connections = 20
min_connections = 2
acquire_timeout_seconds = 30
idle_timeout_seconds = 600
max_lifetime_seconds = 1800
```

Environment Variables:

```bash
export DATABASE_URL="postgresql://user:pass@localhost/stratarouter"
export DATABASE_MAX_CONNECTIONS="20"
export DATABASE_MIN_CONNECTIONS="2"
```
Connection Pool Sizing:
- Development: 2-5 connections
- Production: 10-50 connections (based on load)
- Formula: max_connections = (CPU cores × 2) + disk spindles
## Cache Configuration

Semantic caching settings.

```toml
[cache]
enabled = true
max_entries = 10000
ttl_seconds = 3600              # 1 hour
persistent = true
cleanup_interval_seconds = 300  # 5 minutes
semantic_enabled = true
similarity_threshold = 0.8
similarity_method = "hybrid"    # "hybrid" | "cosine" | "euclidean"
```

Environment Variables:

```bash
export CACHE_ENABLED="true"
export CACHE_TTL_SECONDS="3600"
export CACHE_SIMILARITY_THRESHOLD="0.8"
export REDIS_URL="redis://localhost:6379"  # If using Redis backend
```
Python:

```python
from stratarouter_runtime import RuntimeConfig

config = RuntimeConfig(
    cache_enabled=True,
    cache_max_entries=10000,
    cache_ttl=3600,
    cache_similarity_threshold=0.8,
)
```
Tuning Guide:
| Workload | TTL | Max Entries | Similarity |
|---|---|---|---|
| Static content | 24h | 100K | 0.95 |
| Dynamic content | 1h | 10K | 0.85 |
| Real-time | 5m | 1K | 0.80 |
## Batch Processing Configuration

Request batching and deduplication.
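A minimal `[batch]` section, using the keys and values shown in the complete production example later on this page:

```toml
[batch]
enabled = true
max_batch_size = 100
batch_timeout_ms = 50
enable_deduplication = true
```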
Python:

```python
from stratarouter_runtime import RuntimeConfig

config = RuntimeConfig(
    batch_enabled=True,
    batch_max_size=100,
    batch_timeout_ms=100,
    batch_deduplication=True,
)
```
Tuning:
- Latency-sensitive: batch_timeout_ms = 10-50
- Throughput-optimized: batch_timeout_ms = 100-500
- Max batch size: Based on provider limits (OpenAI: 2048 tokens/batch)
## Resource Limits

Execution resource limits.

```toml
[limits]
max_concurrent_executions = 10
max_workflow_steps = 1000
max_workflow_depth = 10
max_steps_per_execution = 10000
max_execution_duration_seconds = 3600  # 1 hour
max_memory_mb = 24000.0                # 24 GB
max_cpu_cores = 4.0
max_workflows_per_tenant = 100
max_executions_per_day = 1000
allow_enterprise_features = false
```

Environment Variables:

```bash
export LIMITS_MAX_CONCURRENT_EXECUTIONS="10"
export LIMITS_MAX_EXECUTION_DURATION="3600"
export LIMITS_MAX_MEMORY_MB="24000"
```
Recommended Limits:
| Tier | Concurrent | Duration | Memory |
|---|---|---|---|
| Free | 1 | 300s | 1GB |
| Pro | 10 | 3600s | 8GB |
| Enterprise | 100 | 7200s | 32GB |
## Routing Configuration

Semantic routing settings.
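A `[routing]` section using the keys from the production example later on this page; the values mirror the defaults shown in the Python snippet below:

```toml
[routing]
default_agent = "default-agent"
semantic_threshold = 0.7
max_alternatives = 5
```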
Python:

```python
from stratarouter_runtime import RuntimeConfig

config = RuntimeConfig(
    routing_default_agent="default-agent",
    routing_threshold=0.7,
    routing_max_alternatives=5,
)
```
## Logging Configuration

Structured logging settings.

```toml
[logging]
level = "info"     # "trace" | "debug" | "info" | "warn" | "error"
format = "pretty"  # "json" | "pretty" | "compact"
output = "stdout"  # "stdout" | "stderr" | "file"
```
Environment Variables:

```bash
export RUST_LOG="info"
```

Log Levels:
- trace: Very verbose; all operations
- debug: Debug information, function calls
- info: General information, startup/shutdown
- warn: Warnings, deprecated usage
- error: Errors only
Production Recommendation:
- Level: info or warn
- Format: json (structured)
- Output: stdout (captured by log aggregator)
## Distributed Tracing

OpenTelemetry configuration.

```toml
[tracing]
enabled = true
service_name = "stratarouter-runtime"
sample_rate = 1.0  # 0.0-1.0 (1.0 = 100%)
```

Environment Variables:

```bash
export OTEL_SERVICE_NAME="stratarouter-runtime"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_TRACES_SAMPLER="always_on"  # or "traceidratio"
export OTEL_TRACES_SAMPLER_ARG="1.0"    # Sample rate
```
Sampling Strategies:
| Environment | Sample Rate | Reasoning |
|---|---|---|
| Development | 1.0 | Trace everything |
| Staging | 1.0 | Full visibility |
| Production (low volume) | 1.0 | <10K req/day |
| Production (high volume) | 0.1 | >100K req/day |
## Metrics Configuration

Prometheus metrics.
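The `[metrics]` keys as used in the production example later on this page:

```toml
[metrics]
enabled = true
port = 9090
endpoint = "/metrics"
```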
Environment Variables:

```bash
export METRICS_PORT="9090"
```

Accessing Metrics:

```bash
curl http://localhost:9090/metrics
```
Key Metrics:
- stratarouter_runtime_executions_total - Total executions
- stratarouter_runtime_cache_hit_rate - Cache effectiveness
- stratarouter_runtime_latency_seconds - Execution latency
- stratarouter_runtime_cost_usd - Cost tracking
## Health Checks

Service health monitoring.
Endpoints:
- GET /health - Basic liveness check
- GET /health/ready - Readiness check (includes dependencies)
Kubernetes:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```
## State Management

Execution state persistence.
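A `[storage]` section using the keys from the production example later on this page; the values match the environment variable defaults below:

```toml
[storage]
snapshots_enabled = true
snapshot_interval_steps = 100
retention_days = 30
```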
Environment Variables:

```bash
export STORAGE_SNAPSHOTS_ENABLED="true"
export STORAGE_SNAPSHOT_INTERVAL="100"
export STORAGE_RETENTION_DAYS="30"
```

Snapshot Strategy:
- Interval: Checkpoint every N steps
- Retention: Keep snapshots for N days
- Cleanup: Automatic pruning of old snapshots
## Rate Limiting

Token bucket rate limiting.
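A `[rate_limit]` section using the keys from the production example later on this page, with the "moderate" values from the tuning guide below:

```toml
[rate_limit]
requests_per_second = 100
burst_size = 200
```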
Environment Variables:

```bash
export RATE_LIMIT_RPS="100"
export RATE_LIMIT_BURST="200"
```

Algorithm: Token bucket
- Tokens refill at the requests_per_second rate
- Burst capacity allows temporary spikes
- Rejected requests return HTTP 429
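The token-bucket behavior described above can be sketched in Python (an illustrative model only, not StrataRouter's implementation):

```python
import time


class TokenBucket:
    """Token bucket: tokens refill continuously at `rate` per second,
    capped at `burst`; each request consumes one token if available."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.capacity = burst
        self.tokens = burst  # start full: a burst is allowed immediately
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a server would respond with HTTP 429 here


# 25 back-to-back requests against rate=10/s, burst=20:
# the first 20 are admitted from the burst; later ones must wait on refill.
bucket = TokenBucket(rate=10, burst=20)
results = [bucket.allow() for _ in range(25)]
```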
Tuning:

```toml
# Conservative (API protection)
requests_per_second = 10
burst_size = 20

# Moderate (normal usage)
requests_per_second = 100
burst_size = 200

# Aggressive (high throughput)
requests_per_second = 1000
burst_size = 2000
```
## Quotas

Usage quotas and limits.
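A `[quotas]` section using the keys and values from the production example later on this page:

```toml
[quotas]
hourly_execution_limit = 10000
daily_execution_limit = 100000
monthly_execution_limit = 3000000
```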
Enforcement:
- Quotas are per-tenant/user
- Exceeded quotas return HTTP 402
- Quotas reset at period boundaries (UTC)
## Complete Configuration Example

### Production Configuration

```toml
# config/production.toml

[server]
host = "0.0.0.0"
port = 8080
workers = 8
shutdown_timeout_seconds = 30

[database]
url = "postgresql://user:pass@db.prod.example.com/stratarouter"
max_connections = 50
min_connections = 10
acquire_timeout_seconds = 30
idle_timeout_seconds = 600
max_lifetime_seconds = 1800

[cache]
enabled = true
max_entries = 100000
ttl_seconds = 3600
persistent = true
cleanup_interval_seconds = 300
semantic_enabled = true
similarity_threshold = 0.85
similarity_method = "hybrid"

[batch]
enabled = true
max_batch_size = 100
batch_timeout_ms = 50
enable_deduplication = true

[limits]
max_concurrent_executions = 100
max_workflow_steps = 10000
max_workflow_depth = 20
max_steps_per_execution = 100000
max_execution_duration_seconds = 7200
max_memory_mb = 32000.0
max_cpu_cores = 16.0
max_workflows_per_tenant = 1000
max_executions_per_day = 100000
allow_enterprise_features = true

[routing]
default_agent = "gpt-4"
semantic_threshold = 0.75
max_alternatives = 10

[logging]
level = "warn"
format = "json"
output = "stdout"

[tracing]
enabled = true
service_name = "stratarouter-runtime-prod"
sample_rate = 0.1

[metrics]
enabled = true
port = 9090
endpoint = "/metrics"

[health]
check_interval_seconds = 30
timeout_seconds = 5

[storage]
snapshots_enabled = true
snapshot_interval_steps = 100
retention_days = 90

[rate_limit]
requests_per_second = 1000
burst_size = 2000

[quotas]
hourly_execution_limit = 10000
daily_execution_limit = 100000
monthly_execution_limit = 3000000
```
### Environment Variables

```bash
# .env.production

# Database
DATABASE_URL=postgresql://user:pass@db.prod.example.com/stratarouter
DATABASE_MAX_CONNECTIONS=50

# Cache
REDIS_URL=redis://cache.prod.example.com:6379
CACHE_TTL_SECONDS=3600

# Observability
RUST_LOG=warn
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_SERVICE_NAME=stratarouter-runtime-prod

# Rate Limiting
RATE_LIMIT_RPS=1000
RATE_LIMIT_BURST=2000

# Server
SERVER_PORT=8080
METRICS_PORT=9090
```
## Configuration Validation

Validate configuration before deployment:

```bash
# Rust
cargo run --bin validate-config -- config/production.toml

# Python
python -m stratarouter.validate_config config/production.toml
```

Common Validation Errors:
- Invalid database URL format
- Port already in use
- Missing required environment variables
- Invalid threshold values (must be 0-1)
- Negative timeout values
## Best Practices

### Security

- ✅ Never commit secrets to version control
- ✅ Use environment variables for sensitive data
- ✅ Rotate database credentials regularly
- ✅ Use TLS for database connections in production

### Performance

- ✅ Tune connection pools based on load
- ✅ Enable caching in production
- ✅ Use batch processing for high throughput
- ✅ Monitor and adjust rate limits

### Observability

- ✅ Use structured logging (JSON) in production
- ✅ Enable distributed tracing
- ✅ Export metrics to Prometheus
- ✅ Set up alerts for key metrics

### Reliability

- ✅ Configure appropriate timeouts
- ✅ Enable state snapshots
- ✅ Set resource limits
- ✅ Configure health checks for k8s