Production Deployment

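Deploy StrataRouter to production using the deployment patterns and infrastructure configurations in this guide: Docker Compose, Kubernetes, and AWS ECS setups, plus sizing, load balancing, TLS, backups, disaster recovery, and monitoring.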

Quick Start

Deploy in under 10 minutes using Docker Compose:

# Clone repository
git clone https://github.com/stratarouter/stratarouter
cd stratarouter

# Configure environment
cp .env.example .env
# Edit .env with your settings

# Start services
docker-compose up -d

# Verify deployment
curl http://localhost:8000/health
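
Containers can take a few seconds to start accepting connections, so a single `curl` issued right after `docker-compose up -d` may fail even when the deployment is fine. A minimal sketch of a retry loop follows. The helper name `wait_for_health` and the default attempt count and delay are illustrative assumptions, not part of StrataRouter; only the `/health` URL comes from this guide.

```shell
# Poll a health endpoint until it answers with a success status or attempts run out.
# wait_for_health is a hypothetical helper, not a StrataRouter command.
wait_for_health() {
  url=$1
  attempts=${2:-30}   # assumed default: 30 tries
  delay=${3:-2}       # assumed default: 2 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage: wait_for_health http://localhost:8000/health
```

The non-zero return code on timeout lets a deployment script stop before it routes traffic to an instance that never came up.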

Deployment Options

Docker Compose

Production-ready containerized deployment with all dependencies.

# docker-compose.yml
version: '3.8'

services:
  stratarouter:
    image: stratarouter/stratarouter:latest
    ports:
      - "8000:8000"
      - "9090:9090"  # Metrics
    environment:
      - RUST_LOG=info
      - DATABASE_URL=postgresql://stratarouter:${DB_PASSWORD}@db:5432/stratarouter
      - REDIS_URL=redis://redis:6379
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - ./config:/app/config
      - ./data:/app/data
    restart: unless-stopped
    depends_on:
      - db
      - redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: stratarouter
      POSTGRES_USER: stratarouter
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9091:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:
  prometheus_data:
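
The Compose file mounts `./prometheus.yml`, but its contents are not shown here. The sketch below writes a minimal version that scrapes the StrataRouter metrics port. The 15-second scrape interval and the job name are assumptions; the target uses the Compose service name and metrics port 9090 from the file above, and `write_prometheus_config` is a hypothetical helper.

```shell
# Write a minimal Prometheus config next to docker-compose.yml.
# Assumptions: 15s scrape interval and the job name "stratarouter".
# The target is the Compose service name plus the metrics port (9090).
write_prometheus_config() {
  cat > "${1:-prometheus.yml}" <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: stratarouter
    metrics_path: /metrics
    static_configs:
      - targets: ['stratarouter:9090']
EOF
}

# Usage: write_prometheus_config ./prometheus.yml
```

Inside the Compose network Prometheus reaches the router by service name, so the target is `stratarouter:9090` rather than `localhost`.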

Kubernetes

Enterprise-grade Kubernetes deployment with horizontal autoscaling.

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stratarouter
  labels:
    app: stratarouter
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stratarouter
  template:
    metadata:
      labels:
        app: stratarouter
    spec:
      containers:
      - name: stratarouter
        image: stratarouter/stratarouter:1.0.0
        ports:
        - containerPort: 8000
          name: http
        - containerPort: 9090
          name: metrics
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: stratarouter-secrets
              key: database-url
        - name: REDIS_URL
          valueFrom:
            configMapKeyRef:
              name: stratarouter-config
              key: redis-url
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8000
          initialDelaySeconds: 10
          periodSeconds: 5

---
apiVersion: v1
kind: Service
metadata:
  name: stratarouter
spec:
  selector:
    app: stratarouter
  ports:
  - port: 80
    targetPort: 8000
    name: http
  - port: 9090
    targetPort: 9090
    name: metrics
  type: LoadBalancer

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stratarouter-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stratarouter
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
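
To reason about how the autoscaler above will behave, it helps to apply the scaling rule from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), where utilization is measured against the container's resource requests (250m CPU, 256Mi memory here). The sketch below is a simplified calculator under those assumptions; it ignores the controller's tolerance band and stabilization windows, and `hpa_desired_replicas` is a hypothetical helper name.

```shell
# desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization),
# clamped to [minReplicas, maxReplicas]. Utilization values are whole percentages.
hpa_desired_replicas() {
  current=$1
  utilization=$2
  target=$3
  min=${4:-3}    # defaults mirror minReplicas/maxReplicas in the manifest above
  max=${5:-20}
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired_replicas 3 90 70    # 3 pods at 90% CPU against a 70% target -> 4
hpa_desired_replicas 3 35 70    # low load, clamped to minReplicas -> 3
```

When several metrics are configured, as with CPU and memory above, the autoscaler evaluates each one and uses the largest resulting replica count.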

AWS ECS

Managed container deployment on AWS.

{
  "family": "stratarouter",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "stratarouter",
      "image": "stratarouter/stratarouter:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "RUST_LOG",
          "value": "info"
        }
      ],
      "secrets": [
        {
          "name": "DATABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:db-url"
        },
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:openai-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/stratarouter",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}

Infrastructure Requirements

Minimum Specifications

For production workloads up to 1000 requests/second:

Resource | Minimum   | Recommended
CPU      | 2 cores   | 4 cores
Memory   | 2 GB      | 4 GB
Storage  | 20 GB SSD | 50 GB SSD
Network  | 100 Mbps  | 1 Gbps

Database

PostgreSQL 13+ for state persistence:

  • Connection Pool: 20-50 connections
  • Storage: 10GB minimum, grows with usage
  • Backups: Daily automated backups
  • Replication: Recommended for HA

Redis

For semantic caching:

  • Memory: 1-4GB depending on cache size
  • Persistence: AOF enabled
  • Eviction: allkeys-lru policy
  • Max Memory: 80% of available RAM
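
The memory and eviction settings can be derived and applied from the shell. The sketch below computes 80% of the RAM given to Redis and shows, in comments, the `redis-cli` commands that would apply it to a running instance. `redis_maxmemory_mb` is a hypothetical helper and the 4096 MB host size is only an example.

```shell
# 80% of the RAM available to Redis, in whole megabytes (rounded down).
redis_maxmemory_mb() {
  total_mb=$1
  echo $(( total_mb * 80 / 100 ))
}

redis_maxmemory_mb 4096    # -> 3276

# Applying the result to a running instance (CONFIG SET changes are lost on
# restart unless they are also written to redis.conf):
#   redis-cli CONFIG SET maxmemory "$(redis_maxmemory_mb 4096)mb"
#   redis-cli CONFIG SET maxmemory-policy allkeys-lru
```

With `allkeys-lru`, Redis evicts the least recently used keys once `maxmemory` is reached, which suits a cache where any entry can be recomputed.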

Configuration

Environment Variables

# Core Settings
RUST_LOG=info                    # Logging level
HOST=0.0.0.0                     # Bind address
PORT=8000                        # HTTP port
METRICS_PORT=9090                # Prometheus metrics

# Database
DATABASE_URL=postgresql://user:pass@host:5432/db
DATABASE_MAX_CONNECTIONS=20
DATABASE_MIN_CONNECTIONS=5

# Redis Cache
REDIS_URL=redis://host:6379
CACHE_TTL_SECONDS=3600
CACHE_MAX_SIZE_MB=1024

# LLM Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

# Performance
MAX_CONCURRENT_REQUESTS=1000
REQUEST_TIMEOUT_SECONDS=30
BATCH_SIZE=50

# Security
JWT_SECRET=your-secret-key
CORS_ALLOWED_ORIGINS=https://app.example.com
RATE_LIMIT_PER_MINUTE=1000

# Observability
JAEGER_ENDPOINT=http://jaeger:14268/api/traces
PROMETHEUS_ENABLED=true
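
A missing variable is easier to diagnose before the service starts than from a crash log. The sketch below is a pre-flight check that reports unset or empty variables. Which variables StrataRouter actually requires is an assumption here, and `missing_env_vars` is a hypothetical helper.

```shell
# Print the names of variables that are unset or empty.
# Returns 0 when nothing is missing, 1 otherwise.
missing_env_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup that also works in plain POSIX sh
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
}

# Usage (the list of required names is an assumption):
#   missing_env_vars DATABASE_URL REDIS_URL JWT_SECRET || exit 1
```

Running this check in a container entrypoint or CI step keeps a misconfigured deployment from reaching the health-check stage.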

Production Config File

# config/production.toml
[server]
host = "0.0.0.0"
port = 8000
workers = 4

[database]
url = "postgresql://stratarouter:password@db:5432/stratarouter"
max_connections = 20
min_connections = 5
connection_timeout = 30

[cache]
enabled = true
redis_url = "redis://redis:6379"
ttl_seconds = 3600
max_size_mb = 1024

[routing]
default_threshold = 0.5
max_routes = 1000
embedding_dimension = 384

[observability]
metrics_enabled = true
tracing_enabled = true
log_level = "info"

[security]
cors_enabled = true
allowed_origins = ["https://app.example.com"]
rate_limit_per_minute = 1000

[limits]
max_concurrent_requests = 1000
request_timeout_seconds = 30
max_request_size_mb = 10

Health Checks

StrataRouter exposes multiple health check endpoints:

# Liveness - Is service running?
curl http://localhost:8000/health

# Readiness - Can service handle traffic?
curl http://localhost:8000/ready

# Metrics - Prometheus format
curl http://localhost:9090/metrics

Response examples:

// /health
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime_seconds": 86400
}

// /ready
{
  "ready": true,
  "checks": {
    "database": "ok",
    "redis": "ok",
    "providers": "ok"
  }
}
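
Deployment scripts usually need a pass/fail answer rather than the JSON body. The sketch below checks the `/ready` response shown above for `"ready": true`. It uses `grep` to avoid extra dependencies, so it is deliberately naive; a JSON parser such as `jq` would be more robust. `is_ready` is a hypothetical helper.

```shell
# Succeeds only if the JSON on stdin contains "ready": true.
# Whitespace is stripped first so formatting differences do not matter.
is_ready() {
  tr -d ' \n\t' | grep -q '"ready":true'
}

# Usage:
#   curl -fs http://localhost:8000/ready | is_ready && echo "accepting traffic"
```

Because `curl -f` produces no output on an HTTP error, an unreachable or failing endpoint is treated the same as `"ready": false`.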

Load Balancing

NGINX Configuration

upstream stratarouter {
    least_conn;
    server router1:8000 max_fails=3 fail_timeout=30s;
    server router2:8000 max_fails=3 fail_timeout=30s;
    server router3:8000 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name api.example.com;

    location / {
        proxy_pass http://stratarouter;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 30s;
        proxy_read_timeout 30s;

        # Headers
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Passive failover: retry the next upstream on errors and timeouts
        proxy_next_upstream error timeout http_502 http_503 http_504;
    }

    location /metrics {
        deny all;
        return 403;
    }
}

HAProxy Configuration

global
    maxconn 4096
    log stdout format raw local0

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    option httplog

frontend stratarouter_frontend
    bind *:80
    default_backend stratarouter_backend

backend stratarouter_backend
    balance leastconn
    option httpchk GET /health
    http-check expect status 200

    server router1 router1:8000 check inter 10s
    server router2 router2:8000 check inter 10s
    server router3 router3:8000 check inter 10s

SSL/TLS Configuration

Let's Encrypt with Certbot

# Install certbot
apt-get install certbot python3-certbot-nginx

# Obtain certificate
certbot --nginx -d api.example.com

# Test auto-renewal (a dry run; packaged certbot installs typically schedule the real renewal)
certbot renew --dry-run

NGINX SSL Configuration

server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate /etc/letsencrypt/live/api.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;

    # HSTS
    add_header Strict-Transport-Security "max-age=31536000" always;

    location / {
        proxy_pass http://stratarouter;
    }
}

Backup Strategy

Database Backups

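#!/bin/bash
# backup-db.sh
# Exit on the first error, on unset variables, and on failures inside pipelines
set -euo pipefail

DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/postgres"
mkdir -p "$BACKUP_DIR"

# Create backup in PostgreSQL custom format
# (assumes the database container is named stratarouter_db)
docker exec stratarouter_db pg_dump \
  -U stratarouter \
  -F c \
  stratarouter > "$BACKUP_DIR/backup_$DATE.dump"

# Compress
gzip "$BACKUP_DIR/backup_$DATE.dump"

# Delete local backups older than 30 days
find "$BACKUP_DIR" -name "*.dump.gz" -mtime +30 -delete

# Upload to S3
aws s3 cp "$BACKUP_DIR/backup_$DATE.dump.gz" \
  s3://backups/stratarouter/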
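
A backup that was never checked may turn out to be empty or truncated on the day it is needed, for example when `pg_dump` fails but the shell redirect still creates the file. The sketch below is a basic integrity check for the gzipped dumps produced above; `verify_backup` is a hypothetical helper, not part of StrataRouter.

```shell
# Fail if the backup is missing, empty, or not a valid gzip stream.
verify_backup() {
  file=$1
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; return 1; }
}

# Usage: verify_backup /backups/postgres/backup_20260111_020000.dump.gz
```

This only proves the archive is intact. For a deeper check, decompressing the dump and listing its contents with `pg_restore --list`, or restoring into a scratch database, confirms that PostgreSQL can actually read it.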

Redis Persistence

Configure AOF and RDB:

# redis.conf
appendonly yes
appendfsync everysec
save 900 1
save 300 10
save 60 10000

Disaster Recovery

Recovery Time Objectives

Component   | RTO        | RPO
Application | 5 minutes  | 0 minutes
Database    | 15 minutes | 1 hour
Cache       | 1 minute   | N/A

Meeting the 1-hour database RPO requires hourly backups or continuous WAL archiving; daily backups alone bound data loss at 24 hours.

Recovery Procedures

Database Restoration

# Stop application
docker-compose stop stratarouter

# Restore from backup (backup-db.sh gzips its dumps, so decompress on the fly)
gunzip -c backup_20260111.dump.gz | docker exec -i stratarouter_db pg_restore \
  -U stratarouter \
  -d stratarouter \
  -c

# Verify data
docker exec stratarouter_db psql -U stratarouter -c "SELECT COUNT(*) FROM routes;"

# Restart application
docker-compose start stratarouter

Full System Recovery

# 1. Provision new infrastructure
terraform apply

# 2. Restore database
./restore-db.sh latest

# 3. Deploy application
docker-compose up -d

# 4. Verify health
curl http://new-host:8000/health

# 5. Update DNS
# Point api.example.com to new host
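
Step 2 calls `./restore-db.sh latest`, a script that is not shown in this guide. As a rough sketch of how `latest` might be resolved, the function below picks the newest file that follows the `backup_YYYYMMDD_HHMMSS.dump.gz` naming used by `backup-db.sh`. The helper name `latest_backup` and the directory layout are assumptions.

```shell
# Print the newest backup in a directory, or return 1 if there is none.
# Timestamps in the file names sort lexicographically, and globs expand in
# sorted order, so the last match is the most recent backup.
latest_backup() {
  dir=$1
  latest=""
  for f in "$dir"/backup_*.dump.gz; do
    [ -e "$f" ] || continue   # the glob stays literal when nothing matches
    latest=$f
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Usage: latest_backup /backups/postgres
```

Selecting by file name rather than modification time keeps the result stable when backups are copied between hosts or downloaded from S3.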

Monitoring

Key Metrics to Monitor

Metric               | Alert Threshold
Request Latency P99  | > 50ms
Error Rate           | > 1%
CPU Usage            | > 80%
Memory Usage         | > 85%
Database Connections | > 80% of pool
Cache Hit Rate       | < 70%
Disk Usage           | > 80%

Alerting Rules

# prometheus-alerts.yml
groups:
  - name: stratarouter
    interval: 30s
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) > 0.050
        for: 5m
        annotations:
          summary: "High latency detected"
          description: "P99 latency is {{ $value }}s"

      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01
        for: 5m
        annotations:
          summary: "High error rate detected"

      - alert: DatabaseConnectionPoolExhausted
        expr: database_connections_active / database_connections_max > 0.8
        for: 2m
        annotations:
          summary: "Database connection pool nearly exhausted"

Next Steps

Production Checklist

Before going live:

  • SSL/TLS certificates configured
  • Database backups automated
  • Monitoring and alerting active
  • Load testing completed
  • Security audit passed
  • DNS configured with failover
  • Disaster recovery plan tested
  • Documentation updated
  • Team trained on operations
  • Runbooks created for common issues