Horizontal and Vertical Scaling Guide

Scale StrataRouter to handle millions of requests per day.


Scaling Strategies

Vertical Scaling

Increase resources on a single instance

  • Simple to implement
  • No distributed complexity
  • Limited by hardware
  • Single point of failure

When to use: < 100K req/day

Horizontal Scaling

Add more instances

  • Unlimited scalability
  • High availability
  • Requires load balancer
  • More complex

When to use: > 100K req/day
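The 100K req/day guideline above can be written down as a small helper (the threshold is this guide's rule of thumb, not an API):

```python
def recommended_strategy(requests_per_day: int) -> str:
    """Pick a scaling strategy using the ~100K req/day guideline."""
    if requests_per_day < 100_000:
        return "vertical"   # one bigger instance is simpler
    return "horizontal"     # add instances behind a load balancer

print(recommended_strategy(50_000))   # vertical
print(recommended_strategy(500_000))  # horizontal
```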


Load Balancing

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: stratarouter
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stratarouter
  template:
    metadata:
      labels:
        app: stratarouter
    spec:
      containers:
      - name: stratarouter
        image: stratarouter/stratarouter:latest
        ports:
        - containerPort: 8000
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "2000m"
---
apiVersion: v1
kind: Service
metadata:
  name: stratarouter
spec:
  selector:
    app: stratarouter
  ports:
  - port: 80
    targetPort: 8000
  type: LoadBalancer

NGINX Load Balancer

upstream stratarouter {
    least_conn;
    server stratarouter-1:8000;
    server stratarouter-2:8000;
    server stratarouter-3:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://stratarouter;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
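NGINX's `least_conn` directive sends each request to the upstream server with the fewest active connections. A minimal Python sketch of that selection logic (the connection counts are illustrative):

```python
# Least-connections selection, mirroring NGINX's least_conn directive:
# each request goes to the backend with the fewest in-flight connections.
active = {
    "stratarouter-1:8000": 4,
    "stratarouter-2:8000": 1,
    "stratarouter-3:8000": 7,
}

def pick_backend(active_connections: dict) -> str:
    return min(active_connections, key=active_connections.get)

backend = pick_backend(active)
print(backend)        # stratarouter-2:8000
active[backend] += 1  # the chosen backend now carries one more connection
```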

Autoscaling

Kubernetes HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stratarouter-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stratarouter
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
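The HPA's scaling formula is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the min/max bounds. A worked example against the 70% CPU target above:

```python
import math

def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas=3, max_replicas=20):
    """Kubernetes HPA scaling formula, clamped to the bounds above."""
    desired = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(desired, max_replicas))

# 3 replicas running at 95% CPU against the 70% target:
print(desired_replicas(3, 95, 70))   # scales out to 5
# 10 replicas idling at 10% CPU:
print(desired_replicas(10, 10, 70))  # scales in, floored at minReplicas=3
```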

Caching Strategy

Shared Redis Cache

# Redis for shared cache across instances
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  ports:
  - port: 6379
  selector:
    app: redis

Point each StrataRouter instance at the shared cache:

config = RuntimeConfig(
    cache_backend="redis",
    cache_url="redis://redis:6379"
)
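With a shared backend, every instance reads through the same cache instead of warming its own. A minimal cache-aside sketch (a dict stands in for Redis so it runs standalone; the function names are illustrative):

```python
# Cache-aside pattern shared by all instances. A dict stands in for
# Redis here; in practice this would be a redis client call.
shared_cache = {}

def lookup(query: str) -> str:
    hit = shared_cache.get(query)
    if hit is not None:
        return hit                   # served from the shared cache
    result = f"routed({query})"      # placeholder for real routing work
    shared_cache[query] = result     # every instance benefits from this
    return result

lookup("SELECT 1")  # miss: computed, then cached
lookup("SELECT 1")  # hit: any instance now skips the work
```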

Database Scaling

Read Replicas

config = RuntimeConfig(
    database_primary="postgresql://primary:5432/stratarouter",
    database_replicas=[
        "postgresql://replica-1:5432/stratarouter",
        "postgresql://replica-2:5432/stratarouter"
    ]
)
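Writes go to the primary while reads are spread across the replicas. A round-robin sketch under that assumption (how RuntimeConfig routes internally may differ):

```python
import itertools

primary = "postgresql://primary:5432/stratarouter"
replicas = itertools.cycle([
    "postgresql://replica-1:5432/stratarouter",
    "postgresql://replica-2:5432/stratarouter",
])

def pick_dsn(is_write: bool) -> str:
    """Writes hit the primary; reads rotate through the replicas."""
    return primary if is_write else next(replicas)

print(pick_dsn(is_write=True))   # primary
print(pick_dsn(is_write=False))  # replica-1
print(pick_dsn(is_write=False))  # replica-2
```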

Connection Pooling

config = RuntimeConfig(
    db_pool_size=20,
    db_max_overflow=10,
    db_pool_timeout=30
)
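With these settings each instance keeps up to `db_pool_size` persistent connections and may open `db_max_overflow` extra short-lived ones under burst, so the database must accept their sum per instance (this follows SQLAlchemy-style pool semantics, which is an assumption here):

```python
# Worst-case connections the database must accept, assuming
# SQLAlchemy-style semantics for pool_size + max_overflow.
db_pool_size = 20      # persistent connections kept open
db_max_overflow = 10   # extra short-lived connections under burst
instances = 10         # e.g. the "Large" deployment tier

per_instance_max = db_pool_size + db_max_overflow
total_max = per_instance_max * instances

print(per_instance_max)  # 30
print(total_max)         # 300 — size max_connections accordingly
```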

Multi-Region Deployment

Active-Active

US-EAST          US-WEST          EU-WEST
   |                |                |
   +----------------+----------------+
                    |
              Load Balancer
                    |
              Your Application

Configuration

config = RuntimeConfig(
    region="us-east-1",

    # Cross-region cache replication
    cache_replicate_regions=["us-west-1", "eu-west-1"]
)
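In an active-active layout each request is typically served from the nearest healthy region, falling back to another region on failure. A sketch of that routing decision (the latency figures are illustrative, not measurements):

```python
# Route each client to the lowest-latency healthy region.
latency_ms = {
    "us-east-1": 12,
    "us-west-1": 68,
    "eu-west-1": 95,
}

def pick_region(latency_ms: dict, healthy: set) -> str:
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    return min(candidates, key=candidates.get)

print(pick_region(latency_ms, {"us-east-1", "us-west-1", "eu-west-1"}))
# us-east-1
print(pick_region(latency_ms, {"us-west-1", "eu-west-1"}))
# us-west-1 — failover when us-east-1 is unhealthy
```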

Performance Targets

Deployment   Instances   RPS    P99 Latency   Cost/Month
Small        1           5K     8.7ms         $100
Medium       3           50K    9.2ms         $300
Large        10          200K   10.5ms        $1,000
Enterprise   50+         1M+    12ms          $5,000+

Capacity Planning

Sizing Guide

# Calculate required instances
import math

queries_per_day = 5_000_000
peak_multiplier = 3.0  # peak traffic is ~3x the daily average

queries_per_second_peak = (
    queries_per_day / 86_400 * peak_multiplier
)
# ≈ 174 RPS at peak

per_instance_rps = 15_000  # sustained RPS per instance

instances_needed = max(
    math.ceil(queries_per_second_peak / per_instance_rps),
    3,  # keep at least 3 instances for high availability
)
# → 3 instances

Best Practices

  • Start with 3 instances — minimum for HA
  • Use autoscaling — handle traffic spikes
  • Shared cache — avoid cold starts
  • Monitor per-instance metrics — find bottlenecks
  • Load test — validate scaling


Next Steps

Deployment

Deploy to production

Deploy Now →

Monitoring

Track performance

Setup Monitoring →