
Algorithms

Deep dive into the algorithms powering StrataRouter's routing engine.

HNSW (Hierarchical Navigable Small World)

Fast approximate nearest neighbor search for semantic routing.

Overview

HNSW builds a multi-layer graph structure where:

  • Top layers: Sparse, long-range connections for fast navigation
  • Bottom layer: Dense connections for accurate final search

Algorithm

def search(query, k):
    # Start from top layer
    current = entry_point

    # Navigate down layers
    for layer in range(max_layer, 0, -1):
        current = search_layer(query, current, layer, ef=1)

    # Search bottom layer thoroughly
    candidates = search_layer(query, current, 0, ef=ef_search)

    # Return top-k nearest
    return select_neighbors(candidates, k)

Construction

def insert(point):
    # Sample the point's top layer (exponentially decaying distribution)
    layer = random_level()

    # Insert into every layer from the point's top layer down to layer 0
    for l in range(layer, -1, -1):
        # Find nearest neighbors on this layer
        neighbors = search_layer(point, entry, l, ef_construction)

        # Connect bidirectionally to the closest M (edges are per-layer)
        for neighbor in neighbors[:M]:
            add_edge(point, neighbor, l)
            add_edge(neighbor, point, l)
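
The random_level() helper above is not defined in this snippet. A minimal sketch of it, assuming the exponentially decaying layer distribution described here; the normalization m_L = 1/ln(M) is the value suggested in the HNSW paper:

```python
import math
import random

def random_level(m=16):
    """Sample a node's top layer; P(level >= l) decays exponentially.

    m_l = 1 / ln(M) is the normalization from the HNSW paper, so
    higher layers receive exponentially fewer nodes.
    """
    m_l = 1.0 / math.log(m)
    # 1 - random() is in (0, 1], which avoids log(0)
    return int(-math.log(1.0 - random.random()) * m_l)
```

With M = 16, roughly 15 of every 16 nodes end up on layer 0 only, which is what keeps the upper layers sparse.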

Parameters

Parameter Default Description
M 16 Max connections per node
ef_construction 200 Construction quality (higher = better)
ef_search 50 Search quality (higher = more accurate)
max_layer log(N) Number of layers

Time Complexity

  • Construction: O(N log N * M)
  • Query: O(log N * M)
  • Space: O(N * M)

Performance Tuning

# Fast construction, good quality
router = Router(m=16, ef_construction=200)

# Slower construction, better quality
router = Router(m=32, ef_construction=400)

# Faster search
router = Router(ef_search=30)

# More accurate search
router = Router(ef_search=100)

When to Use

Best for:

  • Large datasets (1K+ routes)
  • Sub-millisecond latency required
  • High-dimensional vectors (384+)

Avoid when:

  • Very few routes (<10)
  • Need exact nearest neighbors
  • Memory constrained (<64MB)

BM25 (Best Matching 25)

Probabilistic information retrieval for keyword matching.

Formula

score(D, Q) = Σ IDF(qi) * (f(qi, D) * (k1 + 1)) / (f(qi, D) + k1 * (1 - b + b * |D|/avgdl))

where:
  D = document (route)
  Q = query
  qi = query term
  f(qi, D) = term frequency in D
  |D| = document length
  avgdl = average document length

Components

1. IDF (Inverse Document Frequency)

def idf(term, documents):
    df = count_documents_containing(term, documents)
    N = len(documents)
    return log((N - df + 0.5) / (df + 0.5) + 1)

Intuition: Rare terms are more discriminative
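
A quick numeric check of this intuition, using the IDF formula above (the document frequencies are made up for illustration):

```python
import math

def idf(df, n_docs):
    """BM25 IDF given document frequency df out of n_docs documents."""
    return math.log((n_docs - df + 0.5) / (df + 0.5) + 1)

rare = idf(2, 100)     # term appears in 2 of 100 route descriptions
common = idf(60, 100)  # term appears in 60 of 100
```

The rare term scores about 3.70 versus about 0.51 for the common one; the +0.5 smoothing and the +1 inside the log keep the value positive even for very common terms.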

2. Term Frequency Saturation

def tf_component(tf, doc_length, avg_length, k1=1.5, b=0.75):
    normalized_length = doc_length / avg_length
    denominator = tf + k1 * (1 - b + b * normalized_length)
    return tf * (k1 + 1) / denominator

Intuition: More occurrences help, but with diminishing returns
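
To see the saturation, evaluate tf_component (same definition as above) for growing term frequencies, with the document at exactly average length:

```python
def tf_component(tf, doc_length, avg_length, k1=1.5, b=0.75):
    normalized_length = doc_length / avg_length
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * normalized_length))

# At average length the denominator reduces to tf + k1, so the score
# climbs toward the asymptote k1 + 1 = 2.5 but never reaches it.
scores = [tf_component(tf, 10, 10) for tf in (1, 2, 5, 100)]
```

tf = 1 gives 1.0, tf = 2 about 1.43, tf = 5 about 1.92, and even tf = 100 only about 2.46.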

Parameters

Parameter Default Effect
k1 1.5 Term frequency saturation (0 = binary, ∞ = raw TF)
b 0.75 Length normalization (0 = none, 1 = full)

Example

# Query: "invoice payment"
# Route: "Billing questions about invoices, payments, and refunds"

# Step 1: Tokenize
query_terms = ["invoice", "payment"]
doc_terms = ["billing", "questions", "about", "invoices", 
             "payments", "and", "refunds"]

# Step 2: Calculate IDF (assume N=100 docs; "invoice" in 20, "payment" in 15)
idf("invoice") = log((100 - 20 + 0.5) / (20 + 0.5) + 1) = 1.59
idf("payment") = log((100 - 15 + 0.5) / (15 + 0.5) + 1) = 1.87

# Step 3: Calculate TF component (|D| = 7 terms, assume avgdl = 10)
tf("invoice", doc) = 1   # "invoices" matches after stemming
tf_component = 1 * (1.5 + 1) / (1 + 1.5 * (1 - 0.75 + 0.75 * 7/10))
             = 2.5 / 2.1625 = 1.16

# Step 4: Sum scores (each term occurs once, so both share the TF component)
score = 1.59 * 1.16 + 1.87 * 1.16 = 4.01
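
The worked example can be checked end to end in code. The corpus statistics (N = 100 documents, document frequencies 20 and 15, avgdl = 10) are the same illustrative assumptions used above, and the plural forms are assumed to stem to the query terms:

```python
import math

def idf(df, n_docs=100):
    """BM25 IDF with the illustrative corpus size from the example."""
    return math.log((n_docs - df + 0.5) / (df + 0.5) + 1)

def tf_component(tf, doc_len, avgdl=10, k1=1.5, b=0.75):
    """BM25 term-frequency component with length normalization."""
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avgdl))

doc_len = 7  # tokens in the route description
tfc = tf_component(1, doc_len)        # each query term appears once
score = (idf(20) + idf(15)) * tfc     # "invoice" + "payment"
```

This gives a BM25 score of about 4.01 for the route.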

Optimization

# Precompute IDF scores
idf_cache = {}
for term in vocabulary:
    idf_cache[term] = calculate_idf(term)

# Fast lookup during query
def score_query(query, doc):
    score = 0
    for term in query:
        if term in idf_cache:
            score += idf_cache[term] * tf_component(term, doc)
    return score

When to Use

Best for:

  • Exact keyword matches needed
  • Short documents (route descriptions)
  • Domain-specific terminology

Avoid when:

  • Semantic similarity more important
  • Paraphrasing common
  • No keywords available

Isotonic Regression

Monotonic calibration for confidence scores.

Problem

Raw scores don't match actual accuracy:

Raw Score → Actual Accuracy
0.60      → 0.71  (underconfident)
0.70      → 0.76  (underconfident)
0.80      → 0.88  (slightly under)
0.90      → 0.91  (well calibrated)

Solution

Fit a monotonic curve that maps scores to accuracy:

from sklearn.isotonic import IsotonicRegression

# Collect validation data
scores = [0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
accuracies = [0.71, 0.74, 0.76, 0.80, 0.88, 0.90, 0.91, 0.97]

# Fit monotonic curve
calibrator = IsotonicRegression(out_of_bounds='clip')
calibrator.fit(scores, accuracies)

# Calibrate new scores
raw_score = 0.73
calibrated = calibrator.predict([raw_score])[0]  # 0.78

Algorithm

1. Sort (score, accuracy) pairs by score
2. Pool adjacent violators (PAV algorithm):
   - Scan from left to right
   - If non-monotonic, average adjacent points
   - Repeat until monotonic
3. Return piecewise constant function

PAV Algorithm

def pool_adjacent_violators(x, y):
    """Pool Adjacent Violators; assumes (x, y) pairs are pre-sorted by x."""
    # Each block stores [total, count]; its fitted value is total / count.
    blocks = []
    for value in y:
        blocks.append([float(value), 1])

        # A newly pooled block can itself violate monotonicity with the
        # block before it, so merge backwards until the sequence of block
        # means is non-decreasing.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Expand the blocks into the piecewise-constant solution.
    solution = []
    for total, count in blocks:
        solution.extend([total / count] * count)
    return solution

Per-Route Calibration

Each route gets its own calibration curve:

# Easy route (high baseline accuracy)
easy_calibrator = IsotonicRegression()
easy_calibrator.fit([0.6, 0.7, 0.8, 0.9], [0.82, 0.88, 0.93, 0.98])

# Hard route (lower baseline accuracy)
hard_calibrator = IsotonicRegression()
hard_calibrator.fit([0.6, 0.7, 0.8, 0.9], [0.65, 0.72, 0.81, 0.90])

# Apply appropriate calibration
if route == 'easy':
    confidence = easy_calibrator.predict([raw_score])
else:
    confidence = hard_calibrator.predict([raw_score])

Benefits

  1. Accurate Confidence: Confidence = expected accuracy
  2. Maintains Ranking: Doesn't change route order
  3. Per-Route: Accounts for route difficulty
  4. Fast: O(log N) lookup with binary search
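
The O(log N) lookup works because the fitted function is piecewise constant. A sketch using Python's bisect over a hypothetical calibration table (thresholds and values here are illustrative; in practice they come from the fitted curve):

```python
import bisect

# Left edges of the constant segments and their calibrated values.
thresholds = [0.60, 0.75, 0.85, 0.95]
calibrated = [0.72, 0.81, 0.90, 0.97]

def calibrate(raw_score):
    """Find the segment containing raw_score in O(log N)."""
    i = bisect.bisect_right(thresholds, raw_score) - 1
    i = max(0, min(i, len(calibrated) - 1))  # clip out-of-range scores
    return calibrated[i]
```

For example, a raw score of 0.80 falls in the second segment and maps to 0.81; scores below 0.60 or above 0.95 are clipped to the end segments.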

When to Use

Always use calibration for production systems where confidence values drive decisions.

Skip it only for prototyping, or when speed is absolutely critical (<1ms).

SIMD Optimization

Hardware-accelerated vector operations for speed.

Overview

StrataRouter Core uses SIMD (Single Instruction, Multiple Data) operations to accelerate similarity computations, typically by 2-4x over scalar code (see the benchmarks below). The implementation lives in src/algorithms/simd_ops.rs and automatically detects the best available instruction set.

Supported Operations

// Cosine similarity with SIMD acceleration
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(all(target_arch = "x86_64", target_feature = "avx2"))]
    {
        unsafe { cosine_similarity_avx2(a, b) }
    }

    #[cfg(not(all(target_arch = "x86_64", target_feature = "avx2")))]
    {
        cosine_similarity_scalar(a, b)
    }
}

// Scalar fallback for non-AVX2 systems
fn cosine_similarity_scalar(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

Performance Gain

Operation Scalar AVX2 NEON Speedup
Dot Product (384D) 45ns 12ns 18ns 2.5-3.8x
Cosine Similarity 120ns 35ns 50ns 2.4-3.4x
Euclidean Distance 90ns 28ns 40ns 2.3-3.2x

Auto-Detection

// Automatically uses best available instruction set
let similarity = if is_x86_feature_detected!("avx2") {
    cosine_similarity_avx2(a, b)
} else if is_x86_feature_detected!("sse4.1") {
    cosine_similarity_sse(a, b)
} else {
    cosine_similarity_scalar(a, b)
};

Requirements

  • x86_64: AVX2 (2013+) or SSE4.1 (2006+)
  • ARM: NEON (most ARM CPUs)
  • Fallback: Scalar operations for all CPUs

Score Fusion

Combining multiple scores into a single confidence.

Weighted Average

# Learned weights
w_dense = 0.64
w_sparse = 0.29
w_rule = 0.07

# Fusion
fused = w_dense * dense_score + w_sparse * sparse_score + w_rule * rule_score

Normalization

# Sigmoid normalization
confidence = 1 / (1 + exp(-fused))

# Alternative: Min-max scaling
confidence = (fused - min_score) / (max_score - min_score)

# Alternative: Softmax (multi-route)
exp_scores = [exp(score) for score in all_scores]
confidence = exp(fused) / sum(exp_scores)
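
Putting fusion and sigmoid normalization together as a runnable sketch (the weights are the learned values from above; the raw scores are made up for illustration):

```python
import math

w = {"dense": 0.64, "sparse": 0.29, "rule": 0.07}      # learned weights
scores = {"dense": 0.82, "sparse": 0.55, "rule": 1.0}  # illustrative raw scores

fused = sum(w[k] * scores[k] for k in w)  # weighted average
confidence = 1 / (1 + math.exp(-fused))   # sigmoid normalization
```

Here the fused score is about 0.754, which the sigmoid maps to a confidence of about 0.68.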

Learning Weights

from sklearn.linear_model import LogisticRegression

# Training data
X = [[dense1, sparse1, rule1],
     [dense2, sparse2, rule2],
     ...]
y = [correct1, correct2, ...]  # 1 if correct, 0 if wrong

# Train
model = LogisticRegression()
model.fit(X, y)

# Extract weights
w_dense, w_sparse, w_rule = model.coef_[0]

Comparison

Algorithm Latency Memory Accuracy Use Case
HNSW 0.8ms 50MB 95% Semantic search
BM25 0.5ms 10MB 85% Keyword matching
Rules 0.1ms 2MB 100% Exact patterns
Isotonic 0.1ms 2MB N/A Calibration
SIMD 0.01ms 0MB N/A Acceleration

Next Steps