Skip to content

Observability

Full production monitoring with metrics, traces, and logs.

Overview

StrataRouter Runtime provides comprehensive observability through:

  • Prometheus Metrics - Performance and usage metrics
  • OpenTelemetry Traces - Distributed tracing
  • Structured Logging - JSON logs with context

Prometheus Metrics

Expose Metrics

from stratarouter_runtime import MetricsExporter

exporter = MetricsExporter(port=9090)
exporter.start()

# Metrics available at http://localhost:9090/metrics

Key Metrics

Request Metrics:

stratarouter_runtime_requests_total{provider,route,status}
stratarouter_runtime_request_duration_seconds{provider,route}
stratarouter_runtime_requests_in_flight

Cache Metrics:

stratarouter_runtime_cache_hits_total
stratarouter_runtime_cache_misses_total
stratarouter_runtime_cache_hit_rate
stratarouter_runtime_cache_size_bytes

Cost Metrics:

stratarouter_runtime_cost_usd_total{provider}
stratarouter_runtime_tokens_used_total{provider,model}

Error Metrics:

stratarouter_runtime_errors_total{type,provider}
stratarouter_runtime_timeouts_total{provider}
stratarouter_runtime_retries_total{provider}

OpenTelemetry Tracing

from stratarouter_runtime import TracingConfig

tracing = TracingConfig(
    enabled=True,
    endpoint="http://localhost:4317",
    service_name="stratarouter",
    sample_rate=0.1
)

Trace Example

Span: execute_query (50ms)
├─ Span: route_decision (1ms)
├─ Span: cache_lookup (2ms)
├─ Span: llm_call (45ms)
│  ├─ Span: openai_request (44ms)
│  └─ Span: response_parse (1ms)
└─ Span: cache_store (2ms)

Structured Logging

import logging

logger = logging.getLogger("stratarouter")
logger.setLevel(logging.INFO)

# Logs include:
{
  "timestamp": "2026-01-11T10:30:00Z",
  "level": "INFO",
  "message": "Request completed",
  "request_id": "req-123",
  "user_id": "user-456",
  "route_id": "billing",
  "latency_ms": 45,
  "cost_usd": 0.002
}

Grafana Dashboard

Import pre-built dashboard:

# Download dashboard JSON
curl -o stratarouter-dashboard.json \
  https://raw.githubusercontent.com/agentdyne9/stratarouter/main/grafana/dashboard.json

# Import via Grafana API
curl -X POST http://localhost:3000/api/dashboards/import \
  -H "Content-Type: application/json" \
  -d @stratarouter-dashboard.json

The dashboard includes panels for:

  • Request throughput and error rate
  • P50/P95/P99 latency heatmaps
  • Cache hit rate and cost savings
  • Per-provider token usage and spend

Alerting

# alerts.yml
groups:
  - name: stratarouter
    rules:
      - alert: HighErrorRate
        expr: rate(stratarouter_runtime_errors_total[5m]) > 0.05
        for: 5m
        annotations:
          summary: "Error rate above 5%"

      - alert: LowCacheHitRate
        expr: stratarouter_runtime_cache_hit_rate < 0.7
        for: 10m
        annotations:
          summary: "Cache hit rate below 70%"

Key SLIs to Monitor

SLI Target Alert Threshold
P99 Latency < 100ms > 500ms
Error Rate < 1% > 5%
Cache Hit Rate > 80% < 70%
Throughput > 1K req/s < 500 req/s

Next Steps