Scaling¶
Horizontal and Vertical Scaling Guide
Scale StrataRouter to handle millions of requests per day.
Scaling Strategies¶
Vertical Scaling¶
Increase resources on a single instance
- Simple to implement
- No distributed complexity
- Limited by hardware
- Single point of failure
When to use: < 100K req/day
Horizontal Scaling¶
Add more instances
- Scales past the limits of a single machine
- High availability
- Requires load balancer
- More complex
When to use: > 100K req/day
Load Balancing¶
Kubernetes¶
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stratarouter
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stratarouter
  template:
    metadata:
      labels:
        app: stratarouter
    spec:
      containers:
        - name: stratarouter
          image: stratarouter/stratarouter:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              memory: "256Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "2000m"
---
apiVersion: v1
kind: Service
metadata:
  name: stratarouter
spec:
  selector:
    app: stratarouter
  ports:
    - port: 80
      targetPort: 8000
  type: LoadBalancer
NGINX Load Balancer¶
upstream stratarouter {
    least_conn;
    server stratarouter-1:8000;
    server stratarouter-2:8000;
    server stratarouter-3:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://stratarouter;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
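The `least_conn` directive in the upstream block sends each new request to the server with the fewest active connections. The selection rule can be sketched in a few lines of Python; the `Backend` class and `pick_least_conn` function are illustrative names, not part of NGINX or StrataRouter, and the sketch ignores server weights, which NGINX also takes into account.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Backend:
    """One upstream server and its count of in-flight requests (hypothetical)."""
    address: str
    active: int = 0


def pick_least_conn(backends: Sequence[Backend]) -> Backend:
    """Return the backend with the fewest in-flight requests.

    Ties go to the earliest entry in the sequence.
    """
    return min(backends, key=lambda b: b.active)


backends = [
    Backend("stratarouter-1:8000", active=4),
    Backend("stratarouter-2:8000", active=1),
    Backend("stratarouter-3:8000", active=3),
]

chosen = pick_least_conn(backends)
chosen.active += 1  # the request is now in flight on that backend
print(chosen.address)  # stratarouter-2:8000
```

Compared with plain round-robin, this rule tends to help when request durations vary widely, because slow requests stop piling up on one instance.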
Autoscaling¶
Kubernetes HPA¶
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stratarouter-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stratarouter
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
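To see how the two targets above turn into a replica count, here is a rough Python sketch of the core HPA calculation: each metric proposes `ceil(currentReplicas * currentValue / targetValue)`, the largest proposal wins, and the result is clamped to `minReplicas` and `maxReplicas`. The function names are illustrative, and the sketch leaves out the controller's tolerance band, stabilization windows, and handling of unready pods.

```python
import math
from typing import Sequence, Tuple


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float) -> int:
    """Core HPA ratio for a single metric."""
    return math.ceil(current_replicas * current_value / target_value)


def hpa_decision(current_replicas: int,
                 metrics: Sequence[Tuple[float, float]],
                 min_replicas: int = 3,
                 max_replicas: int = 20) -> int:
    """Combine several (current, target) utilization pairs.

    The largest per-metric proposal is used, then clamped to the
    configured replica bounds.
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


# 3 replicas, CPU at 90% (target 70), memory at 60% (target 80)
print(hpa_decision(3, [(90, 70), (60, 80)]))  # 4
```

In this example CPU proposes 4 replicas and memory proposes 3, so the Deployment scales to 4.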
Caching Strategy¶
Shared Redis Cache¶
# Redis for shared cache across instances
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  ports:
    - port: 6379
  selector:
    app: redis
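The benefit of a shared cache is that a result computed by one instance is immediately available to every other instance. The sketch below illustrates that with a cache-aside lookup. It makes several assumptions: `RouterInstance` and `decide` are hypothetical names rather than StrataRouter's API, and a plain dict stands in for the Redis service so the example runs without a server.

```python
from typing import Callable, Dict, MutableMapping


class RouterInstance:
    """Hypothetical stand-in for one StrataRouter instance (not the real API)."""

    def __init__(self, name: str, cache: MutableMapping[str, str],
                 compute: Callable[[str], str]) -> None:
        self.name = name
        self.cache = cache      # shared between instances, like Redis
        self.compute = compute  # the expensive routing decision
        self.hits = 0
        self.misses = 0

    def route(self, query: str) -> str:
        """Cache-aside: check the cache first, compute and store on a miss."""
        cached = self.cache.get(query)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        result = self.compute(query)
        self.cache[query] = result
        return result


def decide(query: str) -> str:
    """Placeholder for the real routing computation."""
    return "route-for:" + query


shared_cache: Dict[str, str] = {}  # assumption: a dict stands in for Redis

instance_a = RouterInstance("a", shared_cache, decide)
instance_b = RouterInstance("b", shared_cache, decide)

instance_a.route("billing question")  # miss: computed and stored
instance_b.route("billing question")  # hit: served from the shared cache
print(instance_a.misses, instance_b.hits)  # 1 1
```

With per-instance caches, the second call would have been a miss as well, which is the cold-start cost a newly scaled-up instance would otherwise pay.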
Database Scaling¶
Read Replicas¶
config = RuntimeConfig(
    database_primary="postgresql://primary:5432/stratarouter",
    database_replicas=[
        "postgresql://replica-1:5432/stratarouter",
        "postgresql://replica-2:5432/stratarouter",
    ],
)
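How StrataRouter distributes queries across the configured replicas is not described here. As a rough illustration of the general pattern, the sketch below sends writes to the primary and rotates reads across the replicas in round-robin order. `ReplicaRouter` and `url_for` are hypothetical names, and the `SELECT` prefix check is a deliberately naive way to detect reads.

```python
import itertools
from typing import Sequence


class ReplicaRouter:
    """Hypothetical helper: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary: str, replicas: Sequence[str]) -> None:
        self.primary = primary
        # Fall back to the primary for reads when no replicas are configured.
        self._read_cycle = itertools.cycle(list(replicas) or [primary])

    def url_for(self, sql: str) -> str:
        """Pick a connection URL for one statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


router = ReplicaRouter(
    "postgresql://primary:5432/stratarouter",
    [
        "postgresql://replica-1:5432/stratarouter",
        "postgresql://replica-2:5432/stratarouter",
    ],
)

print(router.url_for("SELECT * FROM routes"))       # replica-1
print(router.url_for("SELECT * FROM routes"))       # replica-2
print(router.url_for("INSERT INTO routes VALUES (1)"))  # primary
```

If replication is asynchronous, a read issued right after a write may return stale data from a replica; reads that must see the latest write are usually pinned to the primary.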
Connection Pooling¶
Multi-Region Deployment¶
Active-Active¶
US-EAST          US-WEST          EU-WEST
   |                |                |
   +----------------+----------------+
                    |
              Load Balancer
                    |
             Your Application
Configuration¶
config = RuntimeConfig(
    region="us-east-1",
    # Cross-region cache replication
    cache_replicate_regions=["us-west-1", "eu-west-1"],
)
Performance Targets¶
| Deployment | Instances | RPS | P99 Latency | Cost/Month |
|---|---|---|---|---|
| Small | 1 | 5K | 8.7ms | $100 |
| Medium | 3 | 50K | 9.2ms | $300 |
| Large | 10 | 200K | 10.5ms | $1,000 |
| Enterprise | 50+ | 1M+ | 12ms | $5,000+ |
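Dividing the table's figures by the instance count makes the tiers easier to compare. The short calculation below does this for the three tiers with fixed numbers; the Enterprise row is left out because its values are open-ended ("50+", "1M+"). The variable names are illustrative only.

```python
# (tier, instances, requests per second, cost per month in USD), from the table
tiers = [
    ("Small", 1, 5_000, 100),
    ("Medium", 3, 50_000, 300),
    ("Large", 10, 200_000, 1_000),
]

per_instance = {
    name: {"rps": rps / instances, "cost": cost / instances}
    for name, instances, rps, cost in tiers
}

for name, figures in per_instance.items():
    print(f"{name}: {figures['rps']:,.0f} RPS and "
          f"${figures['cost']:,.0f}/month per instance")
```

The cost works out to $100 per instance per month in every tier, while the per-instance throughput implied by the table ranges from 5K RPS (Small) to 20K RPS (Large).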
Capacity Planning¶
Sizing Guide¶
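import math

# Calculate required instances
queries_per_day = 5_000_000
peak_multiplier = 3.0  # 3x average during peak
per_instance_qps = 15_000  # throughput of one instance

queries_per_second_peak = queries_per_day / 86_400 * peak_multiplier
# → ~174 queries/second at peak

instances_needed = math.ceil(queries_per_second_peak / per_instance_qps)
# → 1 instance for throughput alone; run at least 3 for high availability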
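The same arithmetic can be wrapped in a reusable function. The version below is a sketch under two assumptions that are not part of the guide's formula: instances are sized to run at 70% of their rated throughput (mirroring the HPA CPU target), and the result never drops below the three-instance minimum recommended for HA. The function and parameter names are illustrative.

```python
import math


def instances_needed(queries_per_day: float,
                     peak_multiplier: float = 3.0,
                     per_instance_qps: float = 15_000,
                     target_utilization: float = 0.7,
                     min_instances: int = 3) -> int:
    """Estimate instance count from daily volume.

    target_utilization and min_instances are assumptions of this
    sketch: headroom for spikes and a floor for high availability.
    """
    peak_qps = queries_per_day / 86_400 * peak_multiplier
    usable_qps = per_instance_qps * target_utilization
    for_throughput = math.ceil(peak_qps / usable_qps)
    return max(min_instances, for_throughput)


print(instances_needed(5_000_000))      # 3  (HA floor; throughput needs 1)
print(instances_needed(5_000_000_000))  # 17 (throughput-bound)
```

At 5 million queries per day the HA floor dominates; throughput only becomes the deciding factor at volumes in the billions per day.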
Best Practices¶
- Start with 3 instances: the minimum for HA
- Use autoscaling: handle traffic spikes
- Use a shared cache: avoid cold starts
- Monitor per-instance metrics: find bottlenecks
- Load test: validate scaling