
Configuration Reference

Complete configuration guide for StrataRouter Core and Runtime.

Configuration Files

File Structure

config/
├── default.toml          # Default configuration
├── development.toml      # Development overrides
├── production.toml       # Production overrides
└── test.toml            # Test configuration

Loading Configuration

Rust:

use stratarouter_runtime::Config;

// Load with environment
let config = Config::load("production")?;

// Load from a specific file
let config = Config::from_file("config/custom.toml")?;

Python:

from stratarouter_runtime import Config

# Load with environment
config = Config.load("production")

# Load from a specific file
config = Config.from_file("config/custom.toml")

Server Configuration

HTTP server settings.

[server]
host = "0.0.0.0"              # Bind address
port = 8080                    # HTTP port
workers = 4                    # Number of worker threads
shutdown_timeout_seconds = 30   # Graceful shutdown timeout

Environment Variables:

export SERVER_HOST="0.0.0.0"
export SERVER_PORT="8080"
export SERVER_WORKERS="4"

Python:

config = RuntimeConfig(
    server_host="0.0.0.0",
    server_port=8080,
    server_workers=4
)


Database Configuration

PostgreSQL database settings.

[database]
url = "postgresql://localhost/stratarouter"
max_connections = 20
min_connections = 2
acquire_timeout_seconds = 30
idle_timeout_seconds = 600
max_lifetime_seconds = 1800

Environment Variables:

export DATABASE_URL="postgresql://user:pass@localhost/stratarouter"
export DATABASE_MAX_CONNECTIONS="20"
export DATABASE_MIN_CONNECTIONS="2"

Connection Pool Sizing:
- Development: 2-5 connections
- Production: 10-50 connections (based on load)
- Formula: max_connections = (CPU cores × 2) + disk spindles
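As a sanity check, the sizing formula can be applied directly. The helper below is hypothetical (not part of the StrataRouter API), included only to make the arithmetic concrete:

```python
def recommended_pool_size(cpu_cores: int, disk_spindles: int = 1) -> int:
    """Apply the sizing formula: (CPU cores x 2) + disk spindles.

    Hypothetical helper for illustration; not part of StrataRouter.
    """
    return cpu_cores * 2 + disk_spindles

# An 8-core host with a single SSD (counted as one spindle):
print(recommended_pool_size(8))  # 17
```

Treat the result as a starting point and adjust based on observed pool saturation.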


Cache Configuration

Semantic caching settings.

[cache]
enabled = true
max_entries = 10000
ttl_seconds = 3600              # 1 hour
persistent = true
cleanup_interval_seconds = 300   # 5 minutes
semantic_enabled = true
similarity_threshold = 0.8
similarity_method = "hybrid"     # "hybrid" | "cosine" | "euclidean"

Environment Variables:

export CACHE_ENABLED="true"
export CACHE_TTL_SECONDS="3600"
export CACHE_SIMILARITY_THRESHOLD="0.8"
export REDIS_URL="redis://localhost:6379"  # If using Redis backend

Python:

config = RuntimeConfig(
    cache_enabled=True,
    cache_max_entries=10000,
    cache_ttl=3600,
    cache_similarity_threshold=0.8
)

Tuning Guide:

| Workload        | TTL | Max Entries | Similarity |
|-----------------|-----|-------------|------------|
| Static content  | 24h | 100K        | 0.95       |
| Dynamic content | 1h  | 10K         | 0.85       |
| Real-time       | 5m  | 1K          | 0.80       |

Batch Processing Configuration

Request batching and deduplication.

[batch]
enabled = true
max_batch_size = 100
batch_timeout_ms = 100
enable_deduplication = true

Environment Variables:

export BATCH_ENABLED="true"
export BATCH_MAX_SIZE="100"
export BATCH_TIMEOUT_MS="100"

Python:

config = RuntimeConfig(
    batch_enabled=True,
    batch_max_size=100,
    batch_timeout_ms=100,
    batch_deduplication=True
)

Tuning:
- Latency-sensitive: batch_timeout_ms = 10-50
- Throughput-optimized: batch_timeout_ms = 100-500
- Max batch size: Based on provider limits (OpenAI: 2048 tokens/batch)
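The size/timeout flush behavior these knobs control can be sketched as follows. This is an illustrative micro-batcher, not StrataRouter's implementation; the class and parameter names are invented:

```python
import time

class MicroBatcher:
    """Illustrative sketch: a batch is flushed when it reaches
    max_batch_size, or when batch_timeout_ms has elapsed since the
    first item in the pending batch arrived."""

    def __init__(self, max_batch_size=100, batch_timeout_ms=100, flush=print):
        self.max_batch_size = max_batch_size
        self.timeout = batch_timeout_ms / 1000.0
        self.flush_fn = flush
        self.items = []
        self.first_at = None

    def submit(self, item, now=None):
        now = time.monotonic() if now is None else now
        # Timeout-based flush: the oldest pending item has waited too long.
        if self.items and now - self.first_at >= self.timeout:
            self.flush()
        if not self.items:
            self.first_at = now
        self.items.append(item)
        # Size-based flush: the batch is full.
        if len(self.items) >= self.max_batch_size:
            self.flush()

    def flush(self):
        if self.items:
            self.flush_fn(self.items)
            self.items = []
            self.first_at = None

# Usage: three submissions within the timeout fill a batch of 3.
batches = []
b = MicroBatcher(max_batch_size=3, batch_timeout_ms=100, flush=batches.append)
for i, t in enumerate([0.00, 0.01, 0.02]):
    b.submit(i, now=t)
# batches == [[0, 1, 2]]  (flushed on reaching max_batch_size)
```

Lowering batch_timeout_ms trades batch fullness for latency, which is why the latency-sensitive recommendation above uses 10-50 ms.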


Resource Limits

Execution resource limits.

[limits]
max_concurrent_executions = 10
max_workflow_steps = 1000
max_workflow_depth = 10
max_steps_per_execution = 10000
max_execution_duration_seconds = 3600   # 1 hour
max_memory_mb = 24000.0                 # 24 GB
max_cpu_cores = 4.0
max_workflows_per_tenant = 100
max_executions_per_day = 1000
allow_enterprise_features = false

Environment Variables:

export LIMITS_MAX_CONCURRENT_EXECUTIONS="10"
export LIMITS_MAX_EXECUTION_DURATION="3600"
export LIMITS_MAX_MEMORY_MB="24000"

Recommended Limits:

| Tier       | Concurrent | Duration | Memory |
|------------|------------|----------|--------|
| Free       | 1          | 300s     | 1GB    |
| Pro        | 10         | 3600s    | 8GB    |
| Enterprise | 100        | 7200s    | 32GB   |

Routing Configuration

Semantic routing settings.

[routing]
default_agent = "default-agent"
semantic_threshold = 0.7
max_alternatives = 5

Python:

config = RuntimeConfig(
    routing_default_agent="default-agent",
    routing_threshold=0.7,
    routing_max_alternatives=5
)
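Threshold-based semantic routing amounts to picking the most similar agent, with a fallback when nothing clears semantic_threshold. A minimal sketch (illustrative only: the agent names and vectors are invented, and StrataRouter's actual scoring may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def route(query_vec, agents, threshold=0.7, default="default-agent"):
    """Return the agent whose embedding is most similar to the query,
    falling back to `default` when no score reaches the threshold."""
    best_name, best_score = default, threshold
    for name, vec in agents.items():
        score = cosine(query_vec, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Toy 2-d "embeddings" for two hypothetical agents:
agents = {"billing": [1.0, 0.0], "support": [0.0, 1.0]}
print(route([0.9, 0.1], agents))   # billing
print(route([-1.0, -1.0], agents)) # default-agent (below threshold)
```

Raising semantic_threshold makes routing stricter; more queries fall through to default_agent.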


Logging Configuration

Structured logging settings.

[logging]
level = "info"                # "trace" | "debug" | "info" | "warn" | "error"
format = "pretty"             # "json" | "pretty" | "compact"
output = "stdout"             # "stdout" | "stderr" | "file"

Environment Variables:

export RUST_LOG="info"
export LOG_FORMAT="json"
export LOG_OUTPUT="stdout"

Log Levels:
- trace: Very verbose, all operations
- debug: Debug information, function calls
- info: General information, startup/shutdown
- warn: Warnings, deprecated usage
- error: Errors only

Production Recommendation:
- Level: info or warn
- Format: json (structured)
- Output: stdout (captured by a log aggregator)


Distributed Tracing

OpenTelemetry configuration.

[tracing]
enabled = true
service_name = "stratarouter-runtime"
sample_rate = 1.0              # 0.0-1.0 (1.0 = 100%)

Environment Variables:

export OTEL_SERVICE_NAME="stratarouter-runtime"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_TRACES_SAMPLER="always_on"  # or "traceidratio"
export OTEL_TRACES_SAMPLER_ARG="1.0"    # Sample rate

Sampling Strategies:

| Environment              | Sample Rate | Reasoning        |
|--------------------------|-------------|------------------|
| Development              | 1.0         | Trace everything |
| Staging                  | 1.0         | Full visibility  |
| Production (low volume)  | 1.0         | <10K req/day     |
| Production (high volume) | 0.1         | >100K req/day    |
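Ratio-based sampling boils down to mapping the trace ID into [0, 1) and comparing against the rate, so the keep/drop decision is deterministic per trace. A minimal sketch of the idea (not OpenTelemetry's exact traceidratio implementation):

```python
def should_sample(trace_id: int, sample_rate: float) -> bool:
    """Deterministic ratio sampler: a given trace ID always gets the
    same decision, so all spans of a trace are kept or dropped together.
    Sketch of the concept; not the OTel SDK implementation."""
    MAX_64 = 2 ** 64
    return (trace_id % MAX_64) / MAX_64 < sample_rate

print(should_sample(123, 1.0))        # True  (rate 1.0 keeps everything)
print(should_sample(2 ** 63, 0.1))    # False (maps to 0.5, above the rate)
```

Determinism matters: a random per-span coin flip would break traces apart, keeping some spans and dropping others within the same request.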

Metrics Configuration

Prometheus metrics.

[metrics]
enabled = true
port = 9090
endpoint = "/metrics"

Environment Variables:

export METRICS_ENABLED="true"
export METRICS_PORT="9090"
export PROMETHEUS_PORT="9090"

Accessing Metrics:

curl http://localhost:9090/metrics

Key Metrics:
- stratarouter_runtime_executions_total - Total executions
- stratarouter_runtime_cache_hit_rate - Cache effectiveness
- stratarouter_runtime_latency_seconds - Execution latency
- stratarouter_runtime_cost_usd - Cost tracking


Health Checks

Service health monitoring.

[health]
check_interval_seconds = 60
timeout_seconds = 10

Endpoints:
- GET /health - Basic liveness check
- GET /health/ready - Readiness check (includes dependencies)

Kubernetes:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5


State Management

Execution state persistence.

[storage]
snapshots_enabled = true
snapshot_interval_steps = 100
retention_days = 30

Environment Variables:

export STORAGE_SNAPSHOTS_ENABLED="true"
export STORAGE_SNAPSHOT_INTERVAL="100"
export STORAGE_RETENTION_DAYS="30"

Snapshot Strategy:
- Interval: Checkpoint every N steps
- Retention: Keep snapshots for N days
- Cleanup: Automatic pruning of old snapshots


Rate Limiting

Token bucket rate limiting.

[rate_limit]
requests_per_second = 100
burst_size = 200

Environment Variables:

export RATE_LIMIT_RPS="100"
export RATE_LIMIT_BURST="200"

Algorithm: Token bucket
- Tokens refill at the requests_per_second rate
- Burst capacity allows temporary spikes
- Rejected requests return HTTP 429
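A minimal token bucket with exactly these semantics (an illustrative sketch, not StrataRouter's implementation):

```python
class TokenBucket:
    """Tokens refill continuously at `requests_per_second`; capacity
    is `burst_size`. Each allowed request spends one token."""

    def __init__(self, requests_per_second=100, burst_size=200):
        self.rate = requests_per_second
        self.capacity = burst_size
        self.tokens = float(burst_size)  # start full: bursts allowed immediately
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller responds with HTTP 429
```

With rate 1/s and burst 2, two back-to-back requests succeed, a third is rejected, and one more token is available a second later.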

Tuning:

# Conservative (API protection)
requests_per_second = 10
burst_size = 20

# Moderate (normal usage)
requests_per_second = 100
burst_size = 200

# Aggressive (high throughput)
requests_per_second = 1000
burst_size = 2000


Quotas

Usage quotas and limits.

[quotas]
hourly_execution_limit = 100
daily_execution_limit = 1000
monthly_execution_limit = 30000

Environment Variables:

export QUOTA_HOURLY="100"
export QUOTA_DAILY="1000"
export QUOTA_MONTHLY="30000"

Enforcement:
- Quotas are per-tenant/user
- Exceeded quotas return HTTP 402
- Reset at period boundaries (UTC)
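Period-boundary resets fall out naturally when counters are keyed by the UTC period, as in this sketch (a hypothetical helper, not the real quota API):

```python
from datetime import datetime, timezone
from collections import defaultdict

class QuotaTracker:
    """Per-tenant quota counting with UTC period keys. Counters reset
    implicitly at period boundaries because the bucket key changes."""

    def __init__(self, hourly=100, daily=1000):
        self.limits = {"hourly": hourly, "daily": daily}
        self.counts = defaultdict(int)

    def check_and_count(self, tenant: str, now: datetime) -> bool:
        keys = {
            "hourly": now.strftime("%Y-%m-%dT%H"),  # e.g. 2024-01-01T12
            "daily": now.strftime("%Y-%m-%d"),
        }
        # Reject (HTTP 402 at the API layer) if any window is exhausted.
        for period, key in keys.items():
            if self.counts[(tenant, period, key)] >= self.limits[period]:
                return False
        for period, key in keys.items():
            self.counts[(tenant, period, key)] += 1
        return True
```

A tenant who exhausts the hourly window is admitted again as soon as the clock rolls into the next UTC hour, without any explicit reset job.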


Complete Configuration Example

Production Configuration

# config/production.toml

[server]
host = "0.0.0.0"
port = 8080
workers = 8
shutdown_timeout_seconds = 30

[database]
url = "postgresql://user:pass@db.prod.example.com/stratarouter"
max_connections = 50
min_connections = 10
acquire_timeout_seconds = 30
idle_timeout_seconds = 600
max_lifetime_seconds = 1800

[cache]
enabled = true
max_entries = 100000
ttl_seconds = 3600
persistent = true
cleanup_interval_seconds = 300
semantic_enabled = true
similarity_threshold = 0.85
similarity_method = "hybrid"

[batch]
enabled = true
max_batch_size = 100
batch_timeout_ms = 50
enable_deduplication = true

[limits]
max_concurrent_executions = 100
max_workflow_steps = 10000
max_workflow_depth = 20
max_steps_per_execution = 100000
max_execution_duration_seconds = 7200
max_memory_mb = 32000.0
max_cpu_cores = 16.0
max_workflows_per_tenant = 1000
max_executions_per_day = 100000
allow_enterprise_features = true

[routing]
default_agent = "gpt-4"
semantic_threshold = 0.75
max_alternatives = 10

[logging]
level = "warn"
format = "json"
output = "stdout"

[tracing]
enabled = true
service_name = "stratarouter-runtime-prod"
sample_rate = 0.1

[metrics]
enabled = true
port = 9090
endpoint = "/metrics"

[health]
check_interval_seconds = 30
timeout_seconds = 5

[storage]
snapshots_enabled = true
snapshot_interval_steps = 100
retention_days = 90

[rate_limit]
requests_per_second = 1000
burst_size = 2000

[quotas]
hourly_execution_limit = 10000
daily_execution_limit = 100000
monthly_execution_limit = 3000000

Environment Variables

# .env.production

# Database
DATABASE_URL=postgresql://user:pass@db.prod.example.com/stratarouter
DATABASE_MAX_CONNECTIONS=50

# Cache
REDIS_URL=redis://cache.prod.example.com:6379
CACHE_TTL_SECONDS=3600

# Observability
RUST_LOG=warn
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_SERVICE_NAME=stratarouter-runtime-prod

# Rate Limiting
RATE_LIMIT_RPS=1000
RATE_LIMIT_BURST=2000

# Server
SERVER_PORT=8080
METRICS_PORT=9090

Configuration Validation

Validate configuration before deployment:

# Rust
cargo run --bin validate-config -- config/production.toml

# Python
python -m stratarouter.validate_config config/production.toml

Common Validation Errors:
- Invalid database URL format
- Port already in use
- Missing required environment variables
- Invalid threshold values (must be 0-1)
- Negative timeout values
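A few of these checks can be sketched in Python (illustrative only; the real checks live in the validate-config tool, and the key names here mirror the TOML sections above):

```python
def validate(config: dict) -> list[str]:
    """Return a list of validation errors for a parsed config dict.
    Sketch of three of the checks listed above."""
    errors = []

    # Database URL must use a PostgreSQL scheme.
    url = config.get("database", {}).get("url", "")
    if not url.startswith(("postgres://", "postgresql://")):
        errors.append("invalid database URL format")

    # Similarity thresholds must lie in [0, 1].
    thr = config.get("cache", {}).get("similarity_threshold", 0.8)
    if not 0.0 <= thr <= 1.0:
        errors.append("similarity_threshold must be between 0 and 1")

    # Timeouts must be non-negative.
    timeout = config.get("server", {}).get("shutdown_timeout_seconds", 30)
    if timeout < 0:
        errors.append("negative timeout values are not allowed")

    return errors
```

Running checks like these in CI catches bad values before they reach a deployment.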


Best Practices

Security

  • ✅ Never commit secrets to version control
  • ✅ Use environment variables for sensitive data
  • ✅ Rotate database credentials regularly
  • ✅ Use TLS for database connections in production

Performance

  • ✅ Tune connection pools based on load
  • ✅ Enable caching in production
  • ✅ Use batch processing for high throughput
  • ✅ Monitor and adjust rate limits

Observability

  • ✅ Use structured logging (JSON) in production
  • ✅ Enable distributed tracing
  • ✅ Export metrics to Prometheus
  • ✅ Set up alerts for key metrics

Reliability

  • ✅ Configure appropriate timeouts
  • ✅ Enable state snapshots
  • ✅ Set resource limits
  • ✅ Configure health checks for k8s

Next Steps