REST API Reference¶

HTTP REST endpoints for the StrataRouter Runtime server.

Base URL¶

http://your-host:8080

All endpoints return JSON. Authentication is via Authorization: Bearer <token> header when enabled.

Health¶

`GET /health`¶

Liveness check. Returns 200 OK when the server is running.

{ "status": "ok", "version": "0.3.0" }

`GET /health/ready`¶

Readiness check — includes database and cache connectivity.

{
  "status": "ready",
  "checks": {
    "database": "ok",
    "cache": "ok",
    "router": "ok"
  }
}

Routing¶

`POST /route`¶

Route a query to the best matching route.

Request

{
  "query": "Where is my invoice?",
  "embedding": [0.12, -0.34, 0.56, "..."],
  "k": 1
}

Field	Type	Required	Description
`query`	string	Yes	The query text
`embedding`	float[]	Yes	Query embedding vector
`k`	int	No	Number of alternatives to return (default: 1)

Response

{
  "route_id": "billing",
  "scores": {
    "semantic": 0.91,
    "keyword": 0.85,
    "pattern": 0.00,
    "total": 0.88,
    "confidence": 0.93
  },
  "latency_ms": 1.3,
  "alternatives": []
}

Status codes: 200 OK · 400 Bad Request · 429 Rate Limit Exceeded

`POST /execute`¶

Route a query and execute it via a configured LLM provider. Returns both the routing decision and the LLM response.

Request

{
  "query": "Where is my invoice?",
  "embedding": [0.12, -0.34, "..."],
  "context": {
    "user_id": "user-123",
    "session_id": "sess-456"
  }
}

Field	Type	Required	Description
`query`	string	Yes	The query text
`embedding`	float[]	Yes	Query embedding vector
`context`	object	No	Execution context (user, session, metadata)

Response

{
  "route_id": "billing",
  "response": "Here is your invoice for March 2026...",
  "cache_hit": false,
  "latency_ms": 312.4,
  "cost_usd": 0.0021,
  "provider": "openai",
  "metadata": {
    "model": "gpt-4",
    "tokens_used": 180
  }
}

Route Management¶

`GET /routes`¶

List all registered routes.

Response

{
  "routes": [
    {
      "id": "billing",
      "description": "Billing and payment questions",
      "examples": ["Where's my invoice?"],
      "keywords": ["invoice", "payment"],
      "threshold": null
    }
  ],
  "total": 12
}

`POST /routes`¶

Create a new route.

Request

{
  "id": "billing",
  "description": "Billing and payment questions",
  "examples": ["Where's my invoice?", "Update payment"],
  "keywords": ["invoice", "payment", "billing"],
  "threshold": 0.75
}

Response: 201 Created with the full route object.

`GET /routes/{route_id}`¶

Get a single route by ID.

Response: Route object or 404 Not Found.

`PUT /routes/{route_id}`¶

Update an existing route. Accepts the same body as POST /routes.

Response: 200 OK with updated route object, or 404 Not Found.

`DELETE /routes/{route_id}`¶

Delete a route.

Response: 204 No Content or 404 Not Found.

Batch¶

`POST /batch/route`¶

Route multiple queries in a single request. More efficient than individual calls.

Request

{
  "queries": [
    { "query": "Where's my invoice?", "embedding": ["..."] },
    { "query": "App is crashing", "embedding": ["..."] }
  ]
}

Response

{
  "results": [
    { "route_id": "billing", "confidence": 0.93, "latency_ms": 1.2 },
    { "route_id": "support", "confidence": 0.88, "latency_ms": 1.1 }
  ],
  "batch_latency_ms": 2.8,
  "deduplicated": 0
}

Cache¶

`GET /cache/stats`¶

Get current cache statistics.

{
  "hit_rate": 0.847,
  "total_requests": 52840,
  "cache_hits": 44756,
  "cache_misses": 8084,
  "entries": 4312,
  "memory_mb": 128.4
}

`DELETE /cache`¶

Flush the entire cache.

Response: 204 No Content

`DELETE /cache/{key}`¶

Evict a specific cache entry by key.

Response: 204 No Content

Metrics¶

`GET /metrics`¶

Prometheus metrics in text format.

# HELP stratarouter_requests_total Total routing requests
# TYPE stratarouter_requests_total counter
stratarouter_requests_total{route="billing",status="success"} 14821

# HELP stratarouter_latency_seconds Routing latency histogram
# TYPE stratarouter_latency_seconds histogram
stratarouter_latency_seconds_bucket{le="0.01"} 48921
stratarouter_latency_seconds_bucket{le="0.05"} 52384
stratarouter_latency_seconds_bucket{le="+Inf"} 52840
stratarouter_latency_seconds_sum 241.3
stratarouter_latency_seconds_count 52840

# HELP stratarouter_cache_hit_rate Cache hit rate
# TYPE stratarouter_cache_hit_rate gauge
stratarouter_cache_hit_rate 0.847

Error Responses¶

All errors use a consistent JSON envelope:

{
  "error": "RateLimitExceeded",
  "message": "Rate limit: 100 req/s exceeded",
  "code": "RATE_LIMIT_EXCEEDED",
  "retry_after": 5
}

HTTP Status Code Reference

Code	Error	Meaning
`400`	`InvalidRequest`	Invalid request body or missing fields
`404`	`RouteNotFound`	Route ID does not exist
`429`	`RateLimitExceeded`	Request rate limit exceeded
`500`	`InternalError`	Internal server error
`502`	`ProviderError`	LLM provider returned an error
`504`	`ExecutionTimeout`	Execution exceeded timeout

Authentication¶

When authentication is enabled, include your API key in the Authorization header:

curl -X POST http://localhost:8080/route \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "my invoice", "embedding": [...]}'

Rate Limiting¶

Rate limit headers are included in every response:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1709500800
Retry-After: 5     # Only on 429 responses

Quick Examples¶

Route a query¶

curl -X POST http://localhost:8080/route \
  -H "Content-Type: application/json" \
  -d '{
    "query": "I need my invoice",
    "embedding": [0.12, -0.34, 0.56]
  }'

List all routes¶

curl http://localhost:8080/routes

Check health¶

curl http://localhost:8080/health
curl http://localhost:8080/health/ready

Get Prometheus metrics¶

curl http://localhost:8080/metrics

Next Steps¶

Python SDK

Use StrataRouter directly from Python without the HTTP layer.

→

CLI Reference

Manage routes and the server from the command line.

→

Configuration

Configure the Runtime server, cache, and rate limits.

→

Error Codes

All error codes, causes, and handling patterns.

→

REST API Reference¶

Base URL¶

Health¶

GET /health¶

GET /health/ready¶

Routing¶

POST /route¶

POST /execute¶

Route Management¶

GET /routes¶

POST /routes¶

GET /routes/{route_id}¶

PUT /routes/{route_id}¶

DELETE /routes/{route_id}¶

Batch¶

POST /batch/route¶

Cache¶

GET /cache/stats¶

DELETE /cache¶

DELETE /cache/{key}¶

Metrics¶

GET /metrics¶

Error Responses¶

Authentication¶

Rate Limiting¶

Quick Examples¶

Route a query¶

List all routes¶

Check health¶

Get Prometheus metrics¶

Next Steps¶

`GET /health`¶

`GET /health/ready`¶

`POST /route`¶

`POST /execute`¶

`GET /routes`¶

`POST /routes`¶

`GET /routes/{route_id}`¶

`PUT /routes/{route_id}`¶

`DELETE /routes/{route_id}`¶

`POST /batch/route`¶

`GET /cache/stats`¶

`DELETE /cache`¶

`DELETE /cache/{key}`¶

`GET /metrics`¶