Skip to content

REST API Reference

HTTP REST endpoints for the StrataRouter Runtime server.

Base URL

http://your-host:8080

All endpoints return JSON. Authentication is via Authorization: Bearer <token> header when enabled.


Health

GET /health

Liveness check. Returns 200 OK when the server is running.

{ "status": "ok", "version": "0.3.0" }

GET /health/ready

Readiness check — includes database and cache connectivity.

{
  "status": "ready",
  "checks": {
    "database": "ok",
    "cache": "ok",
    "router": "ok"
  }
}

Routing

POST /route

Route a query to the best matching route.

Request

{
  "query": "Where is my invoice?",
  "embedding": [0.12, -0.34, 0.56, "..."],
  "k": 1
}
Field Type Required Description
query string Yes The query text
embedding float[] Yes Query embedding vector
k int No Number of alternatives to return (default: 1)

Response

{
  "route_id": "billing",
  "scores": {
    "semantic": 0.91,
    "keyword": 0.85,
    "pattern": 0.00,
    "total": 0.88,
    "confidence": 0.93
  },
  "latency_ms": 1.3,
  "alternatives": []
}

Status codes: 200 OK · 400 Bad Request · 429 Rate Limit Exceeded


POST /execute

Route a query and execute it via a configured LLM provider. Returns both the routing decision and the LLM response.

Request

{
  "query": "Where is my invoice?",
  "embedding": [0.12, -0.34, "..."],
  "context": {
    "user_id": "user-123",
    "session_id": "sess-456"
  }
}
Field Type Required Description
query string Yes The query text
embedding float[] Yes Query embedding vector
context object No Execution context (user, session, metadata)

Response

{
  "route_id": "billing",
  "response": "Here is your invoice for March 2026...",
  "cache_hit": false,
  "latency_ms": 312.4,
  "cost_usd": 0.0021,
  "provider": "openai",
  "metadata": {
    "model": "gpt-4",
    "tokens_used": 180
  }
}

Route Management

GET /routes

List all registered routes.

Response

{
  "routes": [
    {
      "id": "billing",
      "description": "Billing and payment questions",
      "examples": ["Where's my invoice?"],
      "keywords": ["invoice", "payment"],
      "threshold": null
    }
  ],
  "total": 12
}

POST /routes

Create a new route.

Request

{
  "id": "billing",
  "description": "Billing and payment questions",
  "examples": ["Where's my invoice?", "Update payment"],
  "keywords": ["invoice", "payment", "billing"],
  "threshold": 0.75
}

Response: 201 Created with the full route object.


GET /routes/{route_id}

Get a single route by ID.

Response: Route object or 404 Not Found.


PUT /routes/{route_id}

Update an existing route. Accepts the same body as POST /routes.

Response: 200 OK with updated route object, or 404 Not Found.


DELETE /routes/{route_id}

Delete a route.

Response: 204 No Content or 404 Not Found.


Batch

POST /batch/route

Route multiple queries in a single request. More efficient than individual calls.

Request

{
  "queries": [
    { "query": "Where's my invoice?", "embedding": ["..."] },
    { "query": "App is crashing", "embedding": ["..."] }
  ]
}

Response

{
  "results": [
    { "route_id": "billing", "confidence": 0.93, "latency_ms": 1.2 },
    { "route_id": "support", "confidence": 0.88, "latency_ms": 1.1 }
  ],
  "batch_latency_ms": 2.8,
  "deduplicated": 0
}

Cache

GET /cache/stats

Get current cache statistics.

{
  "hit_rate": 0.847,
  "total_requests": 52840,
  "cache_hits": 44756,
  "cache_misses": 8084,
  "entries": 4312,
  "memory_mb": 128.4
}

DELETE /cache

Flush the entire cache.

Response: 204 No Content


DELETE /cache/{key}

Evict a specific cache entry by key.

Response: 204 No Content


Metrics

GET /metrics

Prometheus metrics in text format.

# HELP stratarouter_requests_total Total routing requests
# TYPE stratarouter_requests_total counter
stratarouter_requests_total{route="billing",status="success"} 14821

# HELP stratarouter_latency_seconds Routing latency histogram
# TYPE stratarouter_latency_seconds histogram
stratarouter_latency_seconds_bucket{le="0.01"} 48921
stratarouter_latency_seconds_bucket{le="0.05"} 52384
stratarouter_latency_seconds_bucket{le="+Inf"} 52840
stratarouter_latency_seconds_sum 241.3
stratarouter_latency_seconds_count 52840

# HELP stratarouter_cache_hit_rate Cache hit rate
# TYPE stratarouter_cache_hit_rate gauge
stratarouter_cache_hit_rate 0.847

Error Responses

All errors use a consistent JSON envelope:

{
  "error": "RateLimitExceeded",
  "message": "Rate limit: 100 req/s exceeded",
  "code": "RATE_LIMIT_EXCEEDED",
  "retry_after": 5
}

HTTP Status Code Reference

Code Error Meaning
400 InvalidRequest Invalid request body or missing fields
404 RouteNotFound Route ID does not exist
429 RateLimitExceeded Request rate limit exceeded
500 InternalError Internal server error
502 ProviderError LLM provider returned an error
504 ExecutionTimeout Execution exceeded timeout

Authentication

When authentication is enabled, include your API key in the Authorization header:

curl -X POST http://localhost:8080/route \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "my invoice", "embedding": [...]}'

Rate Limiting

Rate limit headers are included in every response:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1709500800
Retry-After: 5     # Only on 429 responses

Quick Examples

Route a query

curl -X POST http://localhost:8080/route \
  -H "Content-Type: application/json" \
  -d '{
    "query": "I need my invoice",
    "embedding": [0.12, -0.34, 0.56]
  }'

List all routes

curl http://localhost:8080/routes

Check health

curl http://localhost:8080/health
curl http://localhost:8080/health/ready

Get Prometheus metrics

curl http://localhost:8080/metrics

Next Steps