Production Deployment¶

Deploy StrataRouter to production using the deployment patterns and infrastructure configurations described below.

Quick Start¶

Deploy in under 10 minutes using Docker Compose:

# Clone repository
git clone https://github.com/stratarouter/stratarouter
cd stratarouter

# Configure environment
cp .env.example .env
# Edit .env with your settings

# Start services
docker-compose up -d

# Verify deployment
curl http://localhost:8000/health
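
Containers can take a few seconds to start accepting requests, so a single `curl` right after `docker-compose up -d` may fail even when the deployment is fine. The sketch below is one way to poll the health endpoint until it responds. The `retry` helper is a hypothetical name introduced here for illustration and is not part of StrataRouter.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS attempts have failed.
# The body runs in a subshell so its variables do not leak into the caller.
retry() (
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      exit 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
)

# Example: poll the health endpoint every 2 seconds, at most 30 times.
# retry 30 2 curl -fsS -o /dev/null http://localhost:8000/health
```

The attempt count and delay in the example are illustrative; tune them to how long your containers usually take to start.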

Deployment Options¶

Docker (Recommended)¶

Production-ready containerized deployment with all dependencies.

# docker-compose.yml
version: '3.8'

services:
  stratarouter:
    image: stratarouter/stratarouter:1.0.0  # Pin a version; avoid :latest in production
    ports:
      - "8000:8000"
      - "9090:9090"  # Metrics
    environment:
      - RUST_LOG=info
      - DATABASE_URL=postgresql://stratarouter:${DB_PASSWORD}@db:5432/stratarouter
      - REDIS_URL=redis://redis:6379
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - ./config:/app/config
      - ./data:/app/data
    restart: unless-stopped
    depends_on:
      - db
      - redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: stratarouter
      POSTGRES_USER: stratarouter
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9091:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:
  prometheus_data:

Kubernetes¶

Kubernetes deployment with horizontal autoscaling: a Deployment, a LoadBalancer Service, and a HorizontalPodAutoscaler.

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stratarouter
  labels:
    app: stratarouter
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stratarouter
  template:
    metadata:
      labels:
        app: stratarouter
    spec:
      containers:
      - name: stratarouter
        image: stratarouter/stratarouter:1.0.0
        ports:
        - containerPort: 8000
          name: http
        - containerPort: 9090
          name: metrics
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: stratarouter-secrets
              key: database-url
        - name: REDIS_URL
          valueFrom:
            configMapKeyRef:
              name: stratarouter-config
              key: redis-url
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8000
          initialDelaySeconds: 10
          periodSeconds: 5

---
apiVersion: v1
kind: Service
metadata:
  name: stratarouter
spec:
  selector:
    app: stratarouter
  ports:
  - port: 80
    targetPort: 8000
    name: http
  - port: 9090
    targetPort: 9090
    name: metrics
  type: LoadBalancer

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stratarouter-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stratarouter
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

AWS ECS¶

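Managed container deployment on AWS using Fargate. The task's execution role must be allowed to read the referenced Secrets Manager secrets and to write to CloudWatch Logs. Replace `region` and `account` in the secret ARNs with your own values.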

{
  "family": "stratarouter",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "stratarouter",
      "image": "stratarouter/stratarouter:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "RUST_LOG",
          "value": "info"
        }
      ],
      "secrets": [
        {
          "name": "DATABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:db-url"
        },
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:openai-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/stratarouter",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}

Infrastructure Requirements¶

Minimum Specifications¶

For production workloads up to 1000 requests/second:

Resource	Minimum	Recommended
CPU	2 cores	4 cores
Memory	2 GB	4 GB
Storage	20 GB SSD	50 GB SSD
Network	100 Mbps	1 Gbps

Database¶

PostgreSQL 13+ for state persistence:

Connection Pool: 20-50 connections
Storage: 10GB minimum, grows with usage
Backups: Daily automated backups; the 1-hour database RPO listed under Disaster Recovery also requires WAL archiving or replication
Replication: Recommended for HA
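
Each StrataRouter replica opens its own connection pool, so the database sees the pool size multiplied by the replica count. The sketch below checks that product against the server's connection limit. The function names are hypothetical. The figure of 100 is PostgreSQL's default `max_connections`, and the replica counts in the examples are the 3 and 20 used by the autoscaler earlier on this page.

```shell
# pool_demand REPLICAS POOL_SIZE
# Prints the worst-case number of connections all replicas can open together.
pool_demand() {
  echo $(( $1 * $2 ))
}

# pool_fits REPLICAS POOL_SIZE SERVER_MAX
# Succeeds when the worst-case demand stays within the server's limit.
pool_fits() {
  [ "$(pool_demand "$1" "$2")" -le "$3" ]
}

# Examples with DATABASE_MAX_CONNECTIONS=20 and a server limit of 100:
# pool_fits 3 20 100    # succeeds: 60 connections
# pool_fits 20 20 100   # fails: 400 connections
```

If the check fails at your maximum replica count, raise the server limit, shrink the per-replica pool, or put a connection pooler in front of the database.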

Redis¶

For semantic caching:

Memory: 1-4GB depending on cache size
Persistence: AOF enabled
Eviction: allkeys-lru policy
Max Memory: 80% of available RAM
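
The eviction and memory settings above map to the `maxmemory` and `maxmemory-policy` directives in `redis.conf`. As a rough sketch, the helper below derives both lines from the amount of RAM given to the Redis host or container. The function name is hypothetical, and reading "available RAM" as the memory allotted to Redis is an assumption.

```shell
# redis_memory_conf TOTAL_RAM_MB
# Prints redis.conf lines that cap Redis at 80% of TOTAL_RAM_MB
# and evict the least recently used keys once the cap is reached.
redis_memory_conf() {
  echo "maxmemory $(( $1 * 80 / 100 ))mb"
  echo "maxmemory-policy allkeys-lru"
}

# Example for a 4 GB container:
# redis_memory_conf 4096
# prints:
#   maxmemory 3276mb
#   maxmemory-policy allkeys-lru
```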

Configuration¶

Environment Variables¶

# Core Settings
RUST_LOG=info                    # Logging level
HOST=0.0.0.0                     # Bind address
PORT=8000                        # HTTP port
METRICS_PORT=9090                # Prometheus metrics

# Database
DATABASE_URL=postgresql://user:pass@host:5432/db
DATABASE_MAX_CONNECTIONS=20
DATABASE_MIN_CONNECTIONS=5

# Redis Cache
REDIS_URL=redis://host:6379
CACHE_TTL_SECONDS=3600
CACHE_MAX_SIZE_MB=1024

# LLM Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

# Performance
MAX_CONCURRENT_REQUESTS=1000
REQUEST_TIMEOUT_SECONDS=30
BATCH_SIZE=50

# Security
JWT_SECRET=your-secret-key
CORS_ALLOWED_ORIGINS=https://app.example.com
RATE_LIMIT_PER_MINUTE=1000

# Observability
JAEGER_ENDPOINT=http://jaeger:14268/api/traces
PROMETHEUS_ENABLED=true
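
A missing variable is easier to diagnose before startup than from a failing container. The sketch below reports which of a given list of variables are unset or empty. The helper name is hypothetical, and which variables StrataRouter treats as mandatory is an assumption here, so adjust the list to your deployment.

```shell
# missing_vars NAME...
# Prints each NAME whose variable is unset or empty, one per line.
# NAMEs must be trusted literals because they are expanded with eval.
missing_vars() (
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
)

# Example: refuse to start when a variable is missing.
# missing=$(missing_vars DATABASE_URL REDIS_URL JWT_SECRET)
# [ -z "$missing" ] || { echo "missing: $missing" >&2; exit 1; }
```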

Production Config File¶

# config/production.toml
[server]
host = "0.0.0.0"
port = 8000
workers = 4

[database]
url = "postgresql://stratarouter:password@db:5432/stratarouter"
max_connections = 20
min_connections = 5
connection_timeout = 30

[cache]
enabled = true
redis_url = "redis://redis:6379"
ttl_seconds = 3600
max_size_mb = 1024

[routing]
default_threshold = 0.5
max_routes = 1000
embedding_dimension = 384

[observability]
metrics_enabled = true
tracing_enabled = true
log_level = "info"

[security]
cors_enabled = true
allowed_origins = ["https://app.example.com"]
rate_limit_per_minute = 1000

[limits]
max_concurrent_requests = 1000
request_timeout_seconds = 30
max_request_size_mb = 10

Health Checks¶

StrataRouter exposes liveness and readiness endpoints on the HTTP port and Prometheus metrics on the metrics port:

# Liveness - Is service running?
curl http://localhost:8000/health

# Readiness - Can service handle traffic?
curl http://localhost:8000/ready

# Metrics - Prometheus format
curl http://localhost:9090/metrics

Response examples:

// /health
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime_seconds": 86400
}

// /ready
{
  "ready": true,
  "checks": {
    "database": "ok",
    "redis": "ok",
    "providers": "ok"
  }
}
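
When `/ready` reports a problem, the useful detail is which dependency failed. The sketch below pulls the names of non-"ok" checks out of a response shaped like the example above, using only `tr`, `grep`, and `cut`. It is a text-matching shortcut that assumes the flat structure shown and is not a general JSON parser; a tool such as `jq` is the more robust choice where available. The function name and the "timeout" value in the example are hypothetical, since only "ok" appears in the documented response.

```shell
# failing_checks
# Reads a /ready response on stdin and prints the name of every
# string-valued field whose value is not "ok", one per line.
failing_checks() {
  tr -d ' \n' |
    grep -o '"[a-z_]*":"[^"]*"' |
    grep -v ':"ok"$' |
    cut -d'"' -f2
}

# Example:
# curl -fsS http://localhost:8000/ready | failing_checks
# With "redis": "timeout" in the response, this prints: redis
```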

Load Balancing¶

NGINX Configuration¶

upstream stratarouter {
    least_conn;
    server router1:8000 max_fails=3 fail_timeout=30s;
    server router2:8000 max_fails=3 fail_timeout=30s;
    server router3:8000 max_fails=3 fail_timeout=30s;
    keepalive 32;  # Idle upstream connections kept open per worker
}

server {
    listen 80;
    server_name api.example.com;

    location / {
        proxy_pass http://stratarouter;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 30s;
        proxy_read_timeout 30s;

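        # Upstream keepalive needs HTTP/1.1 and an empty Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Passive failover: retry the next upstream on these failures
        proxy_next_upstream error timeout http_502 http_503 http_504;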
    }

    location /metrics {
        deny all;
        return 403;
    }
}

HAProxy Configuration¶

global
    maxconn 4096
    log stdout format raw local0

defaults
    mode http
    log global
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    option httplog

frontend stratarouter_frontend
    bind *:80
    default_backend stratarouter_backend

backend stratarouter_backend
    balance leastconn
    option httpchk GET /health
    http-check expect status 200

    server router1 router1:8000 check inter 10s
    server router2 router2:8000 check inter 10s
    server router3 router3:8000 check inter 10s

SSL/TLS Configuration¶

Let's Encrypt with Certbot¶

# Install certbot
apt-get install certbot python3-certbot-nginx

# Obtain certificate
certbot --nginx -d api.example.com

# Test renewal (packaged installs typically schedule "certbot renew" via a systemd timer or cron job)
certbot renew --dry-run

NGINX SSL Configuration¶

server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate /etc/letsencrypt/live/api.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;

    # HSTS
    add_header Strict-Transport-Security "max-age=31536000" always;

    location / {
        proxy_pass http://stratarouter;
    }
}

Backup Strategy¶

Database Backups¶

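#!/bin/bash
# backup-db.sh
# Assumes the PostgreSQL container is named stratarouter_db.

set -euo pipefail  # Stop on the first failure so a bad dump is never uploaded

DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/postgres"
BACKUP_FILE="$BACKUP_DIR/backup_$DATE.dump"

mkdir -p "$BACKUP_DIR"

# Create backup (custom format, restorable with pg_restore)
docker exec stratarouter_db pg_dump \
  -U stratarouter \
  -F c \
  stratarouter > "$BACKUP_FILE"

# Compress
gzip "$BACKUP_FILE"

# Upload to S3
aws s3 cp "$BACKUP_FILE.gz" \
  s3://backups/stratarouter/

# Delete local backups older than 30 days
find "$BACKUP_DIR" -name "*.dump.gz" -mtime +30 -delete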
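
A backup job that fails silently is only discovered when a restore is needed. As a sketch, the check below confirms that a backup file exists, is not empty, and was written recently enough. The function name is hypothetical, and the 1440-minute window assumes the daily schedule described under Infrastructure Requirements.

```shell
# backup_is_fresh FILE MAX_AGE_MINUTES
# Succeeds when FILE is a regular file, is larger than 0 bytes,
# and was modified less than MAX_AGE_MINUTES ago.
backup_is_fresh() {
  [ -n "$(find "$1" -type f -size +0c -mmin "-$2" 2>/dev/null)" ]
}

# Example: alert when the newest backup is missing, empty, or older than a day.
# latest=$(ls -t /backups/postgres/*.dump.gz 2>/dev/null | head -n 1)
# backup_is_fresh "$latest" 1440 || echo "backup is stale or missing" >&2
```

This only checks that a file was produced; running a test restore on a schedule is what confirms the dump is usable.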

Redis Persistence¶

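Enable both AOF and RDB persistence:

# redis.conf
# Append-only file: log every write, fsync once per second
appendonly yes
appendfsync everysec
# RDB snapshots: save after <seconds> if at least <changes> keys changed
save 900 1
save 300 10
save 60 10000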

Disaster Recovery¶

Recovery Time Objectives¶

Component	RTO	RPO
Application	5 minutes	0 minutes
Database	15 minutes	1 hour
Cache	1 minute	N/A

Recovery Procedures¶

Database Restoration¶

# Stop application
docker-compose stop stratarouter

# Restore from backup (backup-db.sh produces gzip-compressed dumps; file name is an example)
gunzip -c backup_20260111_020000.dump.gz | docker exec -i stratarouter_db pg_restore \
  -U stratarouter \
  -d stratarouter \
  -c

# Verify data
docker exec stratarouter_db psql -U stratarouter -c "SELECT COUNT(*) FROM routes;"

# Restart application
docker-compose start stratarouter

Full System Recovery¶

# 1. Provision new infrastructure
terraform apply

# 2. Restore database
./restore-db.sh latest

# 3. Deploy application
docker-compose up -d

# 4. Verify health
curl http://new-host:8000/health

# 5. Update DNS
# Point api.example.com to new host

Monitoring¶

Key Metrics to Monitor¶

Metric	Alert Threshold
Request Latency P99	> 50ms
Error Rate	> 1%
CPU Usage	> 80%
Memory Usage	> 85%
Database Connections	> 80% of pool
Cache Hit Rate	< 70%
Disk Usage	> 80%
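
Several thresholds in the table are ratios, so it can help to see them as counts. The sketch below tests whether one count exceeds a given percentage of another using integer arithmetic only. The function name is hypothetical.

```shell
# exceeds_percent PART TOTAL PERCENT
# Succeeds when PART / TOTAL is strictly greater than PERCENT percent.
# Fails when TOTAL is zero, since no ratio can be computed.
exceeds_percent() {
  [ "$2" -gt 0 ] && [ $(( $1 * 100 )) -gt $(( $2 * $3 )) ]
}

# Examples:
# exceeds_percent 15 1000 1   # succeeds: 1.5% of requests failed, above the 1% threshold
# exceeds_percent 17 20 80    # succeeds: 17 of 20 pool connections is above 80%
```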

Alerting Rules¶

# prometheus-alerts.yml
groups:
  - name: stratarouter
    interval: 30s
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.050
        for: 5m
        annotations:
          summary: "High latency detected"
          description: "P99 latency is {{ $value }}s"

      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01
        for: 5m
        annotations:
          summary: "High error rate detected"

      - alert: DatabaseConnectionPoolExhausted
        expr: database_connections_active / database_connections_max > 0.8
        for: 2m
        annotations:
          summary: "Database connection pool nearly exhausted"

Next Steps¶

Review Configuration Guide for advanced tuning
Set up Monitoring and alerting
Configure Scaling policies
Implement Security best practices
Plan Disaster Recovery procedures

Production Checklist¶

Before going live: