
Routing Engine

The StrataRouter routing engine uses a deterministic three-stage pipeline: HNSW candidate selection, hybrid scoring, and score fusion with confidence calibration.

Architecture

flowchart TB
    A["Query + Embedding"]
    subgraph Pipeline["Three-Stage Pipeline"]
        B["Stage 1: Candidate Selection\nHNSW Index — top-k=5, O(log N)"]
        C["Stage 2: Score Computation\nDense · Sparse (BM25) · Rule"]
        D["Stage 3: Fusion + Calibration\nWeighted 64/29/7 → Isotonic Regression"]
    end
    F["Best Route + Confidence"]
    A --> B --> C --> D --> F

Stage 1: Candidate Selection (HNSW)

Uses Hierarchical Navigable Small World (HNSW) graphs for fast approximate nearest-neighbor search.

How It Works

# Find nearest neighbors using HNSW
candidates = hnsw_index.search(query_embedding, k=5)
# Returns the top-5 most similar routes in <1ms

Performance

  • Query Time: O(log N) - sub-millisecond for 10K routes
  • Memory: O(N * M) where M = max connections (~16)
  • Accuracy: 95%+ recall@5
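For intuition, HNSW approximates the result of an exhaustive scan: recall@5 measures how often its candidates match the exact top-5. A minimal brute-force baseline is sketched below (illustrative only, not the library's implementation; `exact_top_k` is a hypothetical name):

```python
import math

def exact_top_k(query, route_embeddings, k=5):
    """Brute-force O(N*d) nearest-neighbor baseline.

    HNSW approximates this exact top-k ranking in O(log N);
    recall@5 measures how often its candidates match this list.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    # Score every route, then keep the k highest-similarity indices.
    scored = sorted(
        ((cos(query, emb), idx) for idx, emb in enumerate(route_embeddings)),
        reverse=True,
    )
    return [idx for _, idx in scored[:k]]
```

The exact scan is O(N·d) per query, which is why an index like HNSW pays off once the route count grows.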

Configuration

router = Router(
    dimension=384,
    index_type="hnsw",
    m=16,  # Max connections per node
    ef_construction=200  # Construction time quality
)

Stage 2: Score Computation

Dense Scoring (Semantic Similarity)

Computes cosine similarity between query and route embeddings.

dense_score = cosine_similarity(query_embedding, route_embedding)
# Range: [-1, 1], typically [0.3, 0.95] for relevant routes
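The similarity itself is simple to compute; a from-scratch sketch for reference (the library computes this internally, so this is illustrative only):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); assumes both vectors are non-zero
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```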

When it excels:

  • Semantic understanding
  • Paraphrasing detection
  • Similar concepts with different words

Example:

Query: "Where's my receipt?"
Route: "Billing questions"
Dense Score: 0.87 ✓ (high semantic match)

Sparse Scoring (BM25)

Keyword-based scoring that combines inverse document frequency with saturated term frequency and document-length normalization.

# BM25 formula
score = Σ IDF(term) * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * |D|/avgdl))

# Parameters:
# k1 = 1.5 (term frequency saturation)
# b = 0.75 (length normalization)
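The formula can be turned into a self-contained function; the sketch below is a from-scratch reference, not the library's internal scorer, and the smoothed IDF variant is an assumption:

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with Okapi BM25.

    corpus is a list of tokenized documents, used to compute IDF and
    the average document length (avgdl).
    """
    n_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n_docs
    dl = len(doc_terms)
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)
        if tf == 0:
            continue  # term absent from this document
        df = sum(1 for d in corpus if term in d)
        # Smoothed IDF keeps the weight positive for common terms
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avgdl))
    return score
```

Raising k1 lets repeated terms keep adding score for longer; raising b penalizes long documents more aggressively.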

When it excels:

  • Exact keyword matches
  • Domain-specific terminology
  • Acronyms and proper nouns

Example:

Query: "I need my invoice"
Route keywords: ["invoice", "billing", "payment"]
Sparse Score: 0.92 ✓ (exact match on "invoice")

Rule Matching (Pattern Detection)

Exact pattern matching for deterministic routing.

if pattern in query.lower():
    rule_score = 1.0
else:
    rule_score = 0.0

When it excels:

  • Specific commands
  • Known phrases
  • Guaranteed routing

Example:

Query: "reset my password"
Pattern: "reset password"
Rule Score: 1.0 ✓ (exact match)

Stage 3: Score Fusion

Combines all three scores using learned weights.

# Learned weights from training data
WEIGHT_DENSE = 0.64   # 64% semantic
WEIGHT_SPARSE = 0.29  # 29% keyword
WEIGHT_RULE = 0.07    # 7% rule

# Weighted fusion
fused_score = (
    WEIGHT_DENSE * dense_score +
    WEIGHT_SPARSE * sparse_score +
    WEIGHT_RULE * rule_score
)

# Apply sigmoid normalization
confidence = 1 / (1 + exp(-fused_score))
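Put together, the fusion step is only a few lines; the sketch below mirrors the weights and sigmoid above (the `fuse` helper name is illustrative, not part of the API):

```python
import math

# Learned weights from the fusion stage: 64% dense, 29% sparse, 7% rule
WEIGHT_DENSE, WEIGHT_SPARSE, WEIGHT_RULE = 0.64, 0.29, 0.07

def fuse(dense_score, sparse_score, rule_score):
    # Weighted sum of the three stage-2 scores
    fused = (
        WEIGHT_DENSE * dense_score
        + WEIGHT_SPARSE * sparse_score
        + WEIGHT_RULE * rule_score
    )
    # Sigmoid squashes the fused score into (0, 1)
    confidence = 1 / (1 + math.exp(-fused))
    return fused, confidence
```

Note that for fused scores in [0, 1] the sigmoid lands in roughly [0.5, 0.73], which is one reason the calibration stage below remaps raw confidence onto observed accuracy.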

Why These Weights?

Based on analysis of 10,000+ queries:

  • Dense (64%): Best for general semantic understanding
  • Sparse (29%): Critical for exact term matches
  • Rule (7%): Important for deterministic cases

Stage 3 (cont.): Confidence Calibration

Uses isotonic regression to map raw scores to calibrated confidence.

How It Works

from sklearn.isotonic import IsotonicRegression

# Collect (score, accuracy) pairs from validation data
calibration_data = [
    (0.85, 0.95),  # score 0.85 → 95% accurate
    (0.75, 0.88),  # score 0.75 → 88% accurate
    (0.65, 0.78),  # score 0.65 → 78% accurate
]
scores, accuracies = zip(*calibration_data)

# Fit monotonic curve
calibrator = IsotonicRegression()
calibrator.fit(scores, accuracies)

# Apply calibration
calibrated_confidence = calibrator.predict([fused_score])[0]

Benefits

  • Accurate Confidence: calibrated scores track empirical accuracy (e.g., 0.9 ≈ 90% of such predictions correct)
  • Per-Route: Each route has its own calibration curve
  • Maintains Ranking: Doesn't change route order

Calibration Curves

Raw Score → Calibrated Confidence

Route A (Easy):
0.7 → 0.85
0.8 → 0.92
0.9 → 0.98

Route B (Hard):
0.7 → 0.72
0.8 → 0.81
0.9 → 0.90
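The monotone mapping these curves describe can be produced with the pool-adjacent-violators algorithm; the sketch below is a pure-Python stand-in for the IsotonicRegression fit, illustrative only:

```python
def pav(ys):
    """Pool Adjacent Violators: least-squares non-decreasing fit to ys.

    Assumes ys is ordered by its corresponding raw scores.
    """
    means, weights = [], []
    for y in ys:
        means.append(float(y))
        weights.append(1)
        # Merge adjacent blocks while monotonicity is violated
        while len(means) > 1 and means[-2] > means[-1]:
            w = weights[-2] + weights[-1]
            m = (means[-2] * weights[-2] + means[-1] * weights[-1]) / w
            means[-2:] = [m]
            weights[-2:] = [w]
    # Expand block means back to one fitted value per input point
    fitted = []
    for m, w in zip(means, weights):
        fitted.extend([m] * w)
    return fitted
```

Because violating neighbors are pooled into averages rather than reordered, the fit preserves the ranking of the raw scores, which is why calibration never changes which route wins.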

Complete Example

from stratarouter import Router, Route

# Note: Router and Route are the public API classes
# They wrap the underlying Rust implementation for optimal performance

# Create router
router = Router(dimension=384, threshold=0.5)

# Add route with all features
billing = Route("billing")
billing.description = "Billing and payment questions"
billing.examples = ["Where's my invoice?", "I need a refund"]
billing.keywords = ["invoice", "billing", "payment", "refund", "charge"]
router.add_route(billing)

# Build index
embeddings = model.encode([billing.description])
router.build_index(embeddings.tolist())

# Route query
query = "I need my invoice from last month"
query_emb = model.encode([query])[0]

result = router.route(query, query_emb.tolist())

# Score breakdown
print(f"Route: {result.route_id}")           # billing
print(f"Dense:  {result.scores.semantic:.3f}")   # 0.891
print(f"Sparse: {result.scores.keyword:.3f}")    # 0.943
print(f"Rule:   {result.scores.pattern:.3f}")    # 0.000
print(f"Fused:  {result.scores.total:.3f}")      # 0.906
print(f"Confidence: {result.scores.confidence:.3f}") # 0.923
print(f"Latency: {result.latency_ms:.2f}ms")        # 2.3ms

Performance Characteristics

| Stage | Time | Memory | Accuracy |
| --- | --- | --- | --- |
| HNSW Search | 0.8ms | 50MB | 95% recall |
| Dense Score | 0.2ms | - | High |
| Sparse Score | 0.5ms | 10MB | Medium |
| Rule Match | 0.1ms | 2MB | Perfect |
| Fusion | 0.1ms | - | - |
| Calibration | 0.1ms | 2MB | - |
| Total | 1.8ms | 64MB | 95.4% |

Optimization Tips

1. Tune Candidate Selection

# More candidates = higher accuracy, slower
router = Router(dimension=384, top_k=10)  # Default: 5

# Fewer candidates = faster, slightly less accurate
router = Router(dimension=384, top_k=3)

2. Adjust Threshold

# Strict (fewer false positives)
router = Router(threshold=0.8)

# Lenient (fewer false negatives)
router = Router(threshold=0.3)

# Balanced (default)
router = Router(threshold=0.5)

3. Optimize Keywords

# Add important keywords for better sparse matching
route.keywords = [
    "invoice",          # Core term
    "bill",             # Synonym
    "receipt",          # Related
    "statement",        # Variation
]

4. Use Patterns for Known Phrases

# Add patterns for guaranteed routing
route.patterns = [
    "reset password",
    "forgot password",
    "change password"
]

Advanced Topics

Custom Scoring Weights

While the default weights (64/29/7) work well for most cases, you can adjust them:

# Emphasize keywords over semantics
router = Router(
    dimension=384,
    weight_dense=0.5,
    weight_sparse=0.4,
    weight_rule=0.1
)

Multi-Stage Routing

For complex scenarios, use multi-stage routing:

# Stage 1: Coarse-grained routing (department)
dept_router = Router(dimension=384)
dept_result = dept_router.route(query, embedding)

# Stage 2: Fine-grained routing (specific handler)
if dept_result.route_id == 'billing':
    billing_router = Router(dimension=384)
    handler_result = billing_router.route(query, embedding)

Batch Routing

For high throughput, batch multiple queries:

queries = ["query1", "query2", "query3"]
embeddings = model.encode(queries)

results = []
for query, emb in zip(queries, embeddings):
    result = router.route(query, emb.tolist())
    results.append(result)

Troubleshooting

Low Confidence Scores

Problem: All routes have confidence < 0.5

Solutions:

1. Add more examples to routes
2. Improve keyword coverage
3. Lower the threshold
4. Check embedding quality

Wrong Routes Selected

Problem: Router consistently picks wrong route

Solutions:

1. Add negative examples
2. Improve route descriptions
3. Add discriminative keywords
4. Check for overlapping routes

High Latency

Problem: Routing takes > 10ms

Solutions:

1. Reduce the number of routes
2. Decrease the top_k parameter
3. Use a smaller embedding dimension
4. Enable SIMD optimizations

Next Steps