Production Deployment¶
This guide covers deploying StrataRouter to production: container and orchestrator setups, infrastructure sizing, configuration, health checks, load balancing, TLS, backups, disaster recovery, and monitoring.
Quick Start¶
Deploy in under 10 minutes using Docker Compose:
# Clone repository
git clone https://github.com/stratarouter/stratarouter
cd stratarouter
# Configure environment
cp .env.example .env
# Edit .env with your settings
# Start services
docker-compose up -d
# Verify deployment
curl http://localhost:8000/health
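Containers can take a few seconds to accept connections after `docker-compose up -d` returns, so a single `curl` may fail spuriously. A minimal sketch of a polling helper follows; the function name `wait_for_health` and its default attempt count and delay are illustrative assumptions, not part of StrataRouter.

```bash
# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL until curl succeeds or attempts run out.
wait_for_health() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -sS hides progress but keeps errors
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "service did not become healthy after $attempts attempts" >&2
  return 1
}

# Example (not run here):
# wait_for_health http://localhost:8000/health 30 2
```

The non-zero exit status on timeout lets a deploy script stop before later steps run against a service that never came up.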
Deployment Options¶
Docker (Recommended)¶
Production-ready containerized deployment with all dependencies.
# docker-compose.yml
version: '3.8'
services:
  stratarouter:
    image: stratarouter/stratarouter:latest
    ports:
      - "8000:8000"
      - "9090:9090" # Metrics
    environment:
      - RUST_LOG=info
      # Credentials must match POSTGRES_USER / POSTGRES_PASSWORD on the db service
      - DATABASE_URL=postgresql://stratarouter:${DB_PASSWORD}@db:5432/stratarouter
      - REDIS_URL=redis://redis:6379
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - ./config:/app/config
      - ./data:/app/data
    restart: unless-stopped
    depends_on:
      - db
      - redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: stratarouter
      POSTGRES_USER: stratarouter
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    restart: unless-stopped
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9091:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    restart: unless-stopped
volumes:
  postgres_data:
  redis_data:
  prometheus_data:
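The Compose file interpolates `${DB_PASSWORD}` and `${OPENAI_API_KEY}`; if either is unset, Compose substitutes an empty string and the services start misconfigured. A minimal pre-flight sketch follows, assuming the variables are present in the current shell (for example after `set -a; . ./.env; set +a`). The helper name `require_env` is hypothetical.

```bash
# require_env NAME...
# Hypothetical helper: returns non-zero if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here):
# require_env DB_PASSWORD OPENAI_API_KEY && docker-compose up -d
```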
Kubernetes¶
Enterprise-grade Kubernetes deployment with horizontal autoscaling.
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stratarouter
  labels:
    app: stratarouter
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stratarouter
  template:
    metadata:
      labels:
        app: stratarouter
    spec:
      containers:
        - name: stratarouter
          image: stratarouter/stratarouter:1.0.0
          ports:
            - containerPort: 8000
              name: http
            - containerPort: 9090
              name: metrics
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: stratarouter-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                configMapKeyRef:
                  name: stratarouter-config
                  key: redis-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: stratarouter
spec:
  selector:
    app: stratarouter
  ports:
    - port: 80
      targetPort: 8000
      name: http
    - port: 9090
      targetPort: 9090
      name: metrics
  type: LoadBalancer
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stratarouter-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stratarouter
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
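To see how the autoscaler reacts to these targets, the core scaling rule documented for the HorizontalPodAutoscaler is `desired = ceil(current * currentUtilization / targetUtilization)`, clamped to `minReplicas` and `maxReplicas`. The sketch below reproduces only that arithmetic; it ignores the controller's tolerance band and stabilization window, and the function name `hpa_desired_replicas` is hypothetical.

```bash
# hpa_desired_replicas CURRENT UTILIZATION TARGET MIN MAX
# Hypothetical helper: integer form of ceil(CURRENT * UTILIZATION / TARGET),
# clamped to [MIN, MAX]. Utilization values are whole percentages.
hpa_desired_replicas() {
  current=$1
  utilization=$2
  target=$3
  min=$4
  max=$5
  # Adding (target - 1) before dividing rounds the quotient up
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired_replicas 3 90 70 3 20    # 3 pods at 90% CPU, target 70%
hpa_desired_replicas 10 200 70 3 20  # heavy overload, capped at maxReplicas
```

With the manifest's values, 3 pods averaging 90% CPU scale to 4, and 10 pods at 200% are capped at 20. When both CPU and memory metrics are configured, the autoscaler evaluates each and uses the larger replica count.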
AWS ECS¶
Managed container deployment on AWS.
{
"family": "stratarouter",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "1024",
"memory": "2048",
"containerDefinitions": [
{
"name": "stratarouter",
"image": "stratarouter/stratarouter:latest",
"portMappings": [
{
"containerPort": 8000,
"protocol": "tcp"
}
],
"environment": [
{
"name": "RUST_LOG",
"value": "info"
}
],
"secrets": [
{
"name": "DATABASE_URL",
"valueFrom": "arn:aws:secretsmanager:region:account:secret:db-url"
},
{
"name": "OPENAI_API_KEY",
"valueFrom": "arn:aws:secretsmanager:region:account:secret:openai-key"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/stratarouter",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3
}
}
]
}
Infrastructure Requirements¶
Minimum Specifications¶
For production workloads up to 1000 requests/second:
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 2 cores | 4 cores |
| Memory | 2 GB | 4 GB |
| Storage | 20 GB SSD | 50 GB SSD |
| Network | 100 Mbps | 1 Gbps |
Database¶
PostgreSQL 13+ for state persistence:
- Connection Pool: 20-50 connections
- Storage: 10GB minimum, grows with usage
- Backups: Daily automated backups
- Replication: Recommended for HA
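Pool sizes add up across instances: each StrataRouter replica opens its own pool, so the total must fit under the PostgreSQL server's `max_connections` (100 in a stock configuration). A minimal sketch of that budget check follows; the helper name `pool_fits` is hypothetical, and the replica counts in the examples are taken from the Kubernetes manifest in this guide.

```bash
# pool_fits REPLICAS PER_INSTANCE SERVER_LIMIT
# Hypothetical helper: prints the total connections the fleet may open and
# succeeds only if that total fits within the server's connection limit.
pool_fits() {
  replicas=$1
  per_instance=$2
  server_limit=$3
  total=$((replicas * per_instance))
  echo "$total"
  [ "$total" -le "$server_limit" ]
}

pool_fits 3 20 100 && echo "fits"              # 3 replicas x 20 connections
pool_fits 20 20 100 || echo "exceeds limit"    # autoscaled to 20 replicas
```

Three replicas with 20 connections each need 60 connections, but at 20 replicas the same setting needs 400. In that case, either raise `max_connections`, lower the per-instance pool, or put a connection pooler in front of the database.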
Redis¶
For semantic caching:
- Memory: 1-4GB depending on cache size
- Persistence: AOF enabled
- Eviction: `allkeys-lru` policy
- Max Memory: 80% of available RAM
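As a worked example of the 80% guideline, the sketch below computes a `maxmemory` value from the RAM allotted to Redis. The helper name `redis_maxmemory_mb` is hypothetical; the commented `redis-cli CONFIG SET` lines use standard Redis settings and are shown for illustration only.

```bash
# redis_maxmemory_mb TOTAL_MB [PERCENT]
# Hypothetical helper: prints PERCENT (default 80) of TOTAL_MB, rounded down.
redis_maxmemory_mb() {
  total_mb=$1
  percent=${2:-80}
  echo $(( total_mb * percent / 100 ))
}

redis_maxmemory_mb 4096   # 80% of a 4 GB allocation, in MB

# Example (not run here): apply the limit and eviction policy at runtime.
# redis-cli CONFIG SET maxmemory "$(redis_maxmemory_mb 4096)mb"
# redis-cli CONFIG SET maxmemory-policy allkeys-lru
```

For a 4 GB allocation this yields 3276 MB, leaving headroom for replication buffers and AOF rewrites.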
Configuration¶
Environment Variables¶
# Core Settings
RUST_LOG=info # Logging level
HOST=0.0.0.0 # Bind address
PORT=8000 # HTTP port
METRICS_PORT=9090 # Prometheus metrics
# Database
DATABASE_URL=postgresql://user:pass@host:5432/db
DATABASE_MAX_CONNECTIONS=20
DATABASE_MIN_CONNECTIONS=5
# Redis Cache
REDIS_URL=redis://host:6379
CACHE_TTL_SECONDS=3600
CACHE_MAX_SIZE_MB=1024
# LLM Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
# Performance
MAX_CONCURRENT_REQUESTS=1000
REQUEST_TIMEOUT_SECONDS=30
BATCH_SIZE=50
# Security
JWT_SECRET=your-secret-key
CORS_ALLOWED_ORIGINS=https://app.example.com
RATE_LIMIT_PER_MINUTE=1000
# Observability
JAEGER_ENDPOINT=http://jaeger:14268/api/traces
PROMETHEUS_ENABLED=true
Production Config File¶
# config/production.toml
[server]
host = "0.0.0.0"
port = 8000
workers = 4
[database]
url = "postgresql://stratarouter:password@db:5432/stratarouter"
max_connections = 20
min_connections = 5
connection_timeout = 30
[cache]
enabled = true
redis_url = "redis://redis:6379"
ttl_seconds = 3600
max_size_mb = 1024
[routing]
default_threshold = 0.5
max_routes = 1000
embedding_dimension = 384
[observability]
metrics_enabled = true
tracing_enabled = true
log_level = "info"
[security]
cors_enabled = true
allowed_origins = ["https://app.example.com"]
rate_limit_per_minute = 1000
[limits]
max_concurrent_requests = 1000
request_timeout_seconds = 30
max_request_size_mb = 10
Health Checks¶
StrataRouter exposes multiple health check endpoints:
# Liveness - Is service running?
curl http://localhost:8000/health
# Readiness - Can service handle traffic?
curl http://localhost:8000/ready
# Metrics - Prometheus format
curl http://localhost:9090/metrics
Response examples:
// /health
{
"status": "healthy",
"version": "1.0.0",
"uptime_seconds": 86400
}
// /ready
{
"ready": true,
"checks": {
"database": "ok",
"redis": "ok",
"providers": "ok"
}
}
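Scripts that gate traffic on readiness need to inspect the response body, not just the HTTP status. The sketch below tests for `"ready": true` in the `/ready` payload shown above. It is a deliberately simple `grep`-based check under the assumption that the field appears exactly as in the example; a JSON parser such as `jq` would be more robust. The helper name `is_ready` is hypothetical.

```bash
# is_ready
# Hypothetical helper: reads a /ready JSON response on stdin and succeeds
# only if it contains "ready": true (whitespace around the colon is allowed).
is_ready() {
  grep -Eq '"ready"[[:space:]]*:[[:space:]]*true'
}

# Example (not run here):
# curl -fsS http://localhost:8000/ready | is_ready && echo "accepting traffic"
```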
Load Balancing¶
NGINX Configuration¶
upstream stratarouter {
least_conn;
server router1:8000 max_fails=3 fail_timeout=30s;
server router2:8000 max_fails=3 fail_timeout=30s;
server router3:8000 max_fails=3 fail_timeout=30s;
}
server {
listen 80;
server_name api.example.com;
location / {
proxy_pass http://stratarouter;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# Timeouts
proxy_connect_timeout 5s;
proxy_send_timeout 30s;
proxy_read_timeout 30s;
# Headers
proxy_http_version 1.1;
proxy_set_header Connection "";
# Passive failover: retry the next upstream on errors, timeouts, and 502/503/504
proxy_next_upstream error timeout http_502 http_503 http_504;
}
location /metrics {
deny all;
return 403;
}
}
HAProxy Configuration¶
global
maxconn 4096
log stdout format raw local0
defaults
mode http
timeout connect 5s
timeout client 30s
timeout server 30s
option httplog
frontend stratarouter_frontend
bind *:80
default_backend stratarouter_backend
backend stratarouter_backend
balance leastconn
option httpchk GET /health
http-check expect status 200
server router1 router1:8000 check inter 10s
server router2 router2:8000 check inter 10s
server router3 router3:8000 check inter 10s
SSL/TLS Configuration¶
Let's Encrypt with Certbot¶
# Install certbot
apt-get install certbot python3-certbot-nginx
# Obtain certificate
certbot --nginx -d api.example.com
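# Test renewal without replacing certificates
# (the certbot package typically installs a timer or cron job that renews automatically)
certbot renew --dry-run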
NGINX SSL Configuration¶
server {
listen 443 ssl http2;
server_name api.example.com;
ssl_certificate /etc/letsencrypt/live/api.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
ssl_prefer_server_ciphers on;
# HSTS
add_header Strict-Transport-Security "max-age=31536000" always;
location / {
proxy_pass http://stratarouter;
}
}
Backup Strategy¶
Database Backups¶
#!/bin/bash
# backup-db.sh
# Abort on any error so a failed or partial dump is never uploaded
set -euo pipefail
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/postgres"
mkdir -p "$BACKUP_DIR"
# Create backup in pg_dump custom format
# (adjust the container name to match your deployment)
docker exec stratarouter_db pg_dump \
  -U stratarouter \
  -F c \
  stratarouter > "$BACKUP_DIR/backup_$DATE.dump"
# Compress
gzip "$BACKUP_DIR/backup_$DATE.dump"
# Keep last 30 days
find "$BACKUP_DIR" -name "*.dump.gz" -mtime +30 -delete
# Upload to S3
aws s3 cp "$BACKUP_DIR/backup_$DATE.dump.gz" \
  s3://backups/stratarouter/
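A backup is only useful if it can be read back. As a minimal sketch, the helper below checks that a compressed dump exists, is non-empty, and passes `gzip -t`. The name `verify_backup` is hypothetical, and the check covers archive integrity only, not whether the dump restores cleanly.

```bash
# verify_backup FILE
# Hypothetical helper: succeeds if FILE is a non-empty, intact gzip archive.
verify_backup() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "backup missing or empty: $file" >&2
    return 1
  fi
  # gzip -t decompresses to nowhere and verifies the checksum
  if ! gzip -t "$file" 2>/dev/null; then
    echo "backup is not a valid gzip archive: $file" >&2
    return 1
  fi
  echo "backup ok: $file"
}

# Example (not run here):
# verify_backup "/backups/postgres/backup_20260111_020000.dump.gz"
```

For a stronger check, the decompressed dump can be listed with `pg_restore --list`, which reads the archive's table of contents without touching a database.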
Redis Persistence¶
Configure AOF and RDB:
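A minimal sketch of such a configuration, written as a shell function that emits a `redis.conf` fragment. The directives are standard Redis settings, but the specific values (per-second fsync and the three snapshot thresholds) are assumptions chosen for illustration, and the function name is hypothetical.

```bash
# print_redis_persistence_conf
# Hypothetical helper: prints a redis.conf fragment enabling both AOF and RDB.
print_redis_persistence_conf() {
  cat <<'EOF'
# AOF: append every write to a log, fsync once per second
appendonly yes
appendfsync everysec
# RDB: snapshot after 900 s if >= 1 key changed,
# after 300 s if >= 10 changed, after 60 s if >= 10000 changed
save 900 1
save 300 10
save 60 10000
EOF
}

# Example (not run here):
# print_redis_persistence_conf > redis.conf
```

With `appendfsync everysec`, a crash can lose roughly the last second of writes. For a cache that is usually acceptable, and it costs far less than syncing on every write.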
Disaster Recovery¶
Recovery Time Objectives¶
| Component | RTO | RPO |
|---|---|---|
| Application | 5 minutes | 0 minutes |
| Database | 15 minutes | 1 hour |
| Cache | 1 minute | N/A |
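An RPO is only met if the newest recoverable copy is younger than the objective. If daily dumps were the sole recovery source, up to 24 hours of data could be lost, so the 1-hour database RPO presumably relies on replication or more frequent backups. The sketch below checks a backup's age against an RPO; the helper name `within_rpo` is hypothetical, and timestamps are Unix epoch seconds.

```bash
# within_rpo BACKUP_EPOCH NOW_EPOCH RPO_SECONDS
# Hypothetical helper: prints the backup age in seconds and succeeds only
# if that age does not exceed the recovery point objective.
within_rpo() {
  backup_epoch=$1
  now_epoch=$2
  rpo_seconds=$3
  age=$((now_epoch - backup_epoch))
  echo "$age"
  [ "$age" -le "$rpo_seconds" ]
}

# Example (not run here): compare the latest backup's timestamp to a 1-hour RPO.
# within_rpo "$latest_backup_epoch" "$(date +%s)" 3600 || echo "RPO violated"
```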
Recovery Procedures¶
Database Restoration¶
# Stop application
docker-compose stop stratarouter
# Restore from backup
# (backup-db.sh writes gzip-compressed dumps; substitute your own file name)
gunzip -c backup_20260111_020000.dump.gz | docker exec -i stratarouter_db pg_restore \
  -U stratarouter \
  -d stratarouter \
  -c
# Verify data
docker exec stratarouter_db psql -U stratarouter -c "SELECT COUNT(*) FROM routes;"
# Restart application
docker-compose start stratarouter
Full System Recovery¶
# 1. Provision new infrastructure
terraform apply
# 2. Restore database
./restore-db.sh latest
# 3. Deploy application
docker-compose up -d
# 4. Verify health
curl http://new-host:8000/health
# 5. Update DNS
# Point api.example.com to new host
Monitoring¶
Key Metrics to Monitor¶
| Metric | Alert Threshold |
|---|---|
| Request Latency P99 | > 50ms |
| Error Rate | > 1% |
| CPU Usage | > 80% |
| Memory Usage | > 85% |
| Database Connections | > 80% of pool |
| Cache Hit Rate | < 70% |
| Disk Usage | > 80% |
Alerting Rules¶
# prometheus-alerts.yml
groups:
  - name: stratarouter
    interval: 30s
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) > 0.050
        for: 5m
        annotations:
          summary: "High latency detected"
          description: "P99 latency is {{ $value }}s"
      - alert: HighErrorRate
        # sum() both sides: without aggregation the status label restricts the
        # division to 5xx series only, so the ratio would always be 1
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01
        for: 5m
        annotations:
          summary: "High error rate detected"
      - alert: DatabaseConnectionPoolExhausted
        expr: database_connections_active / database_connections_max > 0.8
        for: 2m
        annotations:
          summary: "Database connection pool nearly exhausted"
Next Steps¶
- Review Configuration Guide for advanced tuning
- Set up Monitoring and alerting
- Configure Scaling policies
- Implement Security best practices
- Plan Disaster Recovery procedures
Production Checklist¶
Before going live:
- SSL/TLS certificates configured
- Database backups automated
- Monitoring and alerting active
- Load testing completed
- Security audit passed
- DNS configured with failover
- Disaster recovery plan tested
- Documentation updated
- Team trained on operations
- Runbooks created for common issues