# Configuration Reference

Complete configuration guide for StrataRouter Core and Runtime.
## Configuration Files

### File Structure

```
config/
├── default.toml      # Default configuration
├── development.toml  # Development overrides
├── production.toml   # Production overrides
└── test.toml         # Test configuration
```
### Loading Configuration

Rust:

```rust
use stratarouter_runtime::Config;

// Load with environment
let config = Config::load("production")?;

// Load from a specific file
let config = Config::from_file("config/custom.toml")?;
```

Python:

```python
from stratarouter_runtime import Config

# Load with environment
config = Config.load("production")

# Load from a specific file
config = Config.from_file("config/custom.toml")
```
## Server Configuration

HTTP server settings.

```toml
[server]
host = "0.0.0.0"               # Bind address
port = 8080                    # HTTP port
workers = 4                    # Number of worker threads
shutdown_timeout_seconds = 30  # Graceful shutdown timeout
```
Environment Variables:

```bash
export SERVER_PORT="8080"
```
## Database Configuration

PostgreSQL database settings.

```toml
[database]
url = "postgresql://localhost/stratarouter"
max_connections = 20
min_connections = 2
acquire_timeout_seconds = 30
idle_timeout_seconds = 600
max_lifetime_seconds = 1800
```

Environment Variables:

```bash
export DATABASE_URL="postgresql://user:pass@localhost/stratarouter"
export DATABASE_MAX_CONNECTIONS="20"
export DATABASE_MIN_CONNECTIONS="2"
```
Connection Pool Sizing:
- Development: 2-5 connections
- Production: 10-50 connections (based on load)
- Formula: max_connections = (CPU cores × 2) + disk spindles
## Cache Configuration

Semantic caching settings.

```toml
[cache]
enabled = true
max_entries = 10000
ttl_seconds = 3600              # 1 hour
persistent = true
cleanup_interval_seconds = 300  # 5 minutes
semantic_enabled = true
similarity_threshold = 0.8
similarity_method = "hybrid"    # "hybrid" | "cosine" | "euclidean"
```

Environment Variables:

```bash
export CACHE_ENABLED="true"
export CACHE_TTL_SECONDS="3600"
export CACHE_SIMILARITY_THRESHOLD="0.8"
export REDIS_URL="redis://localhost:6379"  # If using Redis backend
```
Python:

```python
from stratarouter_runtime import RuntimeConfig

config = RuntimeConfig(
    cache_enabled=True,
    cache_max_entries=10000,
    cache_ttl=3600,
    cache_similarity_threshold=0.8,
)
```
Tuning Guide:
| Workload | TTL | Max Entries | Similarity |
|---|---|---|---|
| Static content | 24h | 100K | 0.95 |
| Dynamic content | 1h | 10K | 0.85 |
| Real-time | 5m | 1K | 0.80 |
## Batch Processing Configuration

Request batching and deduplication.
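A minimal `[batch]` section, using the keys and values shown in the complete production example later on this page:

```toml
[batch]
enabled = true
max_batch_size = 100
batch_timeout_ms = 50
enable_deduplication = true
```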
Python:

```python
from stratarouter_runtime import RuntimeConfig

config = RuntimeConfig(
    batch_enabled=True,
    batch_max_size=100,
    batch_timeout_ms=100,
    batch_deduplication=True,
)
```
Tuning:
- Latency-sensitive: batch_timeout_ms = 10-50
- Throughput-optimized: batch_timeout_ms = 100-500
- Max batch size: Based on provider limits (OpenAI: 2048 tokens/batch)
## Resource Limits

Execution resource limits.

```toml
[limits]
max_concurrent_executions = 10
max_workflow_steps = 1000
max_workflow_depth = 10
max_steps_per_execution = 10000
max_execution_duration_seconds = 3600  # 1 hour
max_memory_mb = 24000.0                # 24 GB
max_cpu_cores = 4.0
max_workflows_per_tenant = 100
max_executions_per_day = 1000
allow_enterprise_features = false
```

Environment Variables:

```bash
export LIMITS_MAX_CONCURRENT_EXECUTIONS="10"
export LIMITS_MAX_EXECUTION_DURATION="3600"
export LIMITS_MAX_MEMORY_MB="24000"
```
Recommended Limits:
| Tier | Concurrent | Duration | Memory |
|---|---|---|---|
| Free | 1 | 300s | 1GB |
| Pro | 10 | 3600s | 8GB |
| Enterprise | 100 | 7200s | 32GB |
## Routing Configuration

Semantic routing settings.
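A `[routing]` section using the keys from the production example later on this page; the values mirror the defaults shown in the Python snippet below:

```toml
[routing]
default_agent = "default-agent"
semantic_threshold = 0.7
max_alternatives = 5
```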
Python:

```python
from stratarouter_runtime import RuntimeConfig

config = RuntimeConfig(
    routing_default_agent="default-agent",
    routing_threshold=0.7,
    routing_max_alternatives=5,
)
```
## Logging Configuration

Structured logging settings.

```toml
[logging]
level = "info"     # "trace" | "debug" | "info" | "warn" | "error"
format = "pretty"  # "json" | "pretty" | "compact"
output = "stdout"  # "stdout" | "stderr" | "file"
```
Environment Variables:

```bash
export RUST_LOG="info"
```

Log Levels:
- trace: Very verbose; all operations
- debug: Debug information, function calls
- info: General information, startup/shutdown
- warn: Warnings, deprecated usage
- error: Errors only
Production Recommendation:
- Level: info or warn
- Format: json (structured)
- Output: stdout (captured by log aggregator)
## Distributed Tracing

OpenTelemetry configuration.

```toml
[tracing]
enabled = true
service_name = "stratarouter-runtime"
sample_rate = 1.0  # 0.0-1.0 (1.0 = 100%)
```

Environment Variables:

```bash
export OTEL_SERVICE_NAME="stratarouter-runtime"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_TRACES_SAMPLER="always_on"  # or "traceidratio"
export OTEL_TRACES_SAMPLER_ARG="1.0"    # Sample rate
```
Sampling Strategies:
| Environment | Sample Rate | Reasoning |
|---|---|---|
| Development | 1.0 | Trace everything |
| Staging | 1.0 | Full visibility |
| Production (low volume) | 1.0 | <10K req/day |
| Production (high volume) | 0.1 | >100K req/day |
## Metrics Configuration

Prometheus metrics.
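The `[metrics]` keys as used in the production example later on this page:

```toml
[metrics]
enabled = true
port = 9090
endpoint = "/metrics"
```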
Environment Variables:

```bash
export METRICS_PORT="9090"
```

Accessing Metrics:

```bash
curl http://localhost:9090/metrics
```
Key Metrics:
- stratarouter_runtime_executions_total - Total executions
- stratarouter_runtime_cache_hit_rate - Cache effectiveness
- stratarouter_runtime_latency_seconds - Execution latency
- stratarouter_runtime_cost_usd - Cost tracking
## Health Checks

Service health monitoring.
Endpoints:
- GET /health - Basic liveness check
- GET /health/ready - Readiness check (includes dependencies)
Kubernetes:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```
## State Management

Execution state persistence.
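A `[storage]` section using the keys from the production example later on this page; the values match the environment variable defaults below:

```toml
[storage]
snapshots_enabled = true
snapshot_interval_steps = 100
retention_days = 30
```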
Environment Variables:

```bash
export STORAGE_SNAPSHOTS_ENABLED="true"
export STORAGE_SNAPSHOT_INTERVAL="100"
export STORAGE_RETENTION_DAYS="30"
```

Snapshot Strategy:
- Interval: Checkpoint every N steps
- Retention: Keep snapshots for N days
- Cleanup: Automatic pruning of old snapshots
## Rate Limiting

Token bucket rate limiting.
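A `[rate_limit]` section using the keys from the production example later on this page, with the "moderate" values from the tuning guide below:

```toml
[rate_limit]
requests_per_second = 100
burst_size = 200
```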
Environment Variables:

```bash
export RATE_LIMIT_RPS="100"
export RATE_LIMIT_BURST="200"
```

Algorithm: Token bucket
- Tokens refill at the requests_per_second rate
- Burst capacity allows temporary spikes
- Rejected requests return HTTP 429
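The token-bucket behavior described above can be sketched in Python (an illustrative model only, not StrataRouter's implementation):

```python
import time


class TokenBucket:
    """Token bucket: tokens refill continuously at `rate` per second,
    capped at `burst`; each request consumes one token if available."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.capacity = burst
        self.tokens = burst  # start full: a burst is allowed immediately
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a server would respond with HTTP 429 here


# 25 back-to-back requests against rate=10/s, burst=20:
# the first 20 are admitted from the burst; later ones must wait on refill.
bucket = TokenBucket(rate=10, burst=20)
results = [bucket.allow() for _ in range(25)]
```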
Tuning:

```toml
# Conservative (API protection)
requests_per_second = 10
burst_size = 20

# Moderate (normal usage)
requests_per_second = 100
burst_size = 200

# Aggressive (high throughput)
requests_per_second = 1000
burst_size = 2000
```
## Quotas

Usage quotas and limits.
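A `[quotas]` section using the keys and values from the production example later on this page:

```toml
[quotas]
hourly_execution_limit = 10000
daily_execution_limit = 100000
monthly_execution_limit = 3000000
```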
Enforcement:
- Quotas are per-tenant/user
- Exceeded quotas return HTTP 402
- Quotas reset at period boundaries (UTC)
## Complete Configuration Example

### Production Configuration

```toml
# config/production.toml

[server]
host = "0.0.0.0"
port = 8080
workers = 8
shutdown_timeout_seconds = 30

[database]
url = "postgresql://user:pass@db.prod.example.com/stratarouter"
max_connections = 50
min_connections = 10
acquire_timeout_seconds = 30
idle_timeout_seconds = 600
max_lifetime_seconds = 1800

[cache]
enabled = true
max_entries = 100000
ttl_seconds = 3600
persistent = true
cleanup_interval_seconds = 300
semantic_enabled = true
similarity_threshold = 0.85
similarity_method = "hybrid"

[batch]
enabled = true
max_batch_size = 100
batch_timeout_ms = 50
enable_deduplication = true

[limits]
max_concurrent_executions = 100
max_workflow_steps = 10000
max_workflow_depth = 20
max_steps_per_execution = 100000
max_execution_duration_seconds = 7200
max_memory_mb = 32000.0
max_cpu_cores = 16.0
max_workflows_per_tenant = 1000
max_executions_per_day = 100000
allow_enterprise_features = true

[routing]
default_agent = "gpt-4"
semantic_threshold = 0.75
max_alternatives = 10

[logging]
level = "warn"
format = "json"
output = "stdout"

[tracing]
enabled = true
service_name = "stratarouter-runtime-prod"
sample_rate = 0.1

[metrics]
enabled = true
port = 9090
endpoint = "/metrics"

[health]
check_interval_seconds = 30
timeout_seconds = 5

[storage]
snapshots_enabled = true
snapshot_interval_steps = 100
retention_days = 90

[rate_limit]
requests_per_second = 1000
burst_size = 2000

[quotas]
hourly_execution_limit = 10000
daily_execution_limit = 100000
monthly_execution_limit = 3000000
```
### Environment Variables

```bash
# .env.production

# Database
DATABASE_URL=postgresql://user:pass@db.prod.example.com/stratarouter
DATABASE_MAX_CONNECTIONS=50

# Cache
REDIS_URL=redis://cache.prod.example.com:6379
CACHE_TTL_SECONDS=3600

# Observability
RUST_LOG=warn
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_SERVICE_NAME=stratarouter-runtime-prod

# Rate Limiting
RATE_LIMIT_RPS=1000
RATE_LIMIT_BURST=2000

# Server
SERVER_PORT=8080
METRICS_PORT=9090
```
## Configuration Validation

Validate configuration before deployment:

```bash
# Rust
cargo run --bin validate-config -- config/production.toml

# Python
python -m stratarouter.validate_config config/production.toml
```

Common Validation Errors:
- Invalid database URL format
- Port already in use
- Missing required environment variables
- Invalid threshold values (must be 0-1)
- Negative timeout values
## Best Practices

### Security

- ✅ Never commit secrets to version control
- ✅ Use environment variables for sensitive data
- ✅ Rotate database credentials regularly
- ✅ Use TLS for database connections in production

### Performance

- ✅ Tune connection pools based on load
- ✅ Enable caching in production
- ✅ Use batch processing for high throughput
- ✅ Monitor and adjust rate limits

### Observability

- ✅ Use structured logging (JSON) in production
- ✅ Enable distributed tracing
- ✅ Export metrics to Prometheus
- ✅ Set up alerts for key metrics

### Reliability

- ✅ Configure appropriate timeouts
- ✅ Enable state snapshots
- ✅ Set resource limits
- ✅ Configure health checks for k8s