Algorithms¶
Deep dive into the algorithms powering StrataRouter's routing engine.
HNSW (Hierarchical Navigable Small World)¶
Fast approximate nearest neighbor search for semantic routing.
Overview¶
HNSW builds a multi-layer graph structure where:

- Top layers: sparse, long-range connections for fast navigation
- Bottom layer: dense connections for accurate final search
Algorithm¶
```python
def search(query, k):
    # Start from the entry point on the top layer
    current = entry_point

    # Greedily navigate down through the upper layers
    for layer in range(max_layer, 0, -1):
        current = search_layer(query, current, layer, ef=1)

    # Search the bottom layer thoroughly
    candidates = search_layer(query, current, 0, ef=ef_search)

    # Return the top-k nearest
    return select_neighbors(candidates, k)
```
Construction¶
```python
def insert(point):
    # Determine the point's top layer (exponential distribution)
    layer = random_level()

    # Insert into each layer from 0 up to `layer`
    for l in range(layer + 1):
        # Find nearest neighbors on this layer
        neighbors = search_layer(point, entry, l, ef_construction)

        # Connect bidirectionally to the closest M of them
        for neighbor in neighbors[:M]:
            add_edge(point, neighbor)
            add_edge(neighbor, point)
```
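The `random_level()` call above is where the exponential layer distribution comes from. A minimal sketch, assuming the common `m_l = 1 / ln(M)` normalization with `M = 16` (the normalization constant is an assumption, not taken from StrataRouter's source):

```python
import math
import random

def random_level(m_l: float = 1.0 / math.log(16)) -> int:
    # Sample a layer from an exponential distribution: most points
    # land on layer 0; each higher layer holds ~1/M as many points.
    return int(-math.log(random.random()) * m_l)

random.seed(0)
levels = [random_level() for _ in range(10_000)]
```

With `M = 16`, roughly 15 out of 16 points stay on layer 0, which is what keeps the upper layers sparse.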
Parameters¶
| Parameter | Default | Description |
|---|---|---|
| `M` | 16 | Max connections per node |
| `ef_construction` | 200 | Construction quality (higher = better) |
| `ef_search` | 50 | Search quality (higher = more accurate) |
| `max_layer` | log(N) | Number of layers |
Time Complexity¶
- Construction: O(N log N * M)
- Query: O(log N * M)
- Space: O(N * M)
Performance Tuning¶
```python
# Fast construction, good quality
router = Router(m=16, ef_construction=200)

# Slower construction, better quality
router = Router(m=32, ef_construction=400)

# Faster search
router = Router(ef_search=30)

# More accurate search
router = Router(ef_search=100)
```
When to Use¶
✅ Best for:

- Large datasets (1K+ routes)
- Sub-millisecond latency requirements
- High-dimensional vectors (384+ dimensions)

❌ Avoid when:

- There are very few routes (<10)
- Exact nearest neighbors are required
- Memory is constrained (<64MB)
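For the few-routes and exact-neighbors cases, a brute-force scan is both simpler and exact. A self-contained sketch (the route vectors are made up for illustration):

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, vectors, k):
    # O(N * d) scan: fine for tens of routes, too slow for thousands
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(query, vectors[i]),
                    reverse=True)
    return ranked[:k]

routes = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
```

Unlike HNSW, this returns the true nearest neighbors every time, so it is the better baseline below roughly a few hundred routes.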
BM25 (Best Matching 25)¶
Probabilistic information retrieval for keyword matching.
Formula¶
```
score(D, Q) = Σ IDF(qi) * (f(qi, D) * (k1 + 1)) / (f(qi, D) + k1 * (1 - b + b * |D| / avgdl))

where:
    D        = document (route)
    Q        = query
    qi       = query term
    f(qi, D) = frequency of qi in D
    |D|      = document length
    avgdl    = average document length
```
Components¶
1. IDF (Inverse Document Frequency)¶
```python
def idf(term, documents):
    df = count_documents_containing(term, documents)
    N = len(documents)
    return log((N - df + 0.5) / (df + 0.5) + 1)
```
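To make the IDF snippet concrete, here is a self-contained version with a toy corpus (documents represented as term sets; the corpus is made up for illustration):

```python
import math

def idf(term, documents):
    # BM25 IDF: rare terms score higher than common ones
    df = sum(1 for doc in documents if term in doc)
    n = len(documents)
    return math.log((n - df + 0.5) / (df + 0.5) + 1)

docs = [{"invoice", "payment"}, {"refund"}, {"invoice"}, {"shipping"}]
```

Here "invoice" appears in 2 of 4 documents and "refund" in only 1, so "refund" gets the higher IDF.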
Intuition: Rare terms are more discriminative
2. Term Frequency Saturation¶
```python
def tf_component(tf, doc_length, avg_length, k1=1.5, b=0.75):
    normalized_length = doc_length / avg_length
    denominator = tf + k1 * (1 - b + b * normalized_length)
    return tf * (k1 + 1) / denominator
```
Intuition: More occurrences help, but with diminishing returns
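Plugging a fixed-length document into the formula shows the diminishing returns directly (the term frequencies below are chosen for illustration):

```python
def tf_component(tf, doc_length, avg_length, k1=1.5, b=0.75):
    # BM25 term-frequency saturation
    normalized_length = doc_length / avg_length
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * normalized_length))

# Average-length document: the score grows with tf but never
# exceeds k1 + 1 = 2.5, and each extra occurrence adds less
scores = [tf_component(tf, 10, 10) for tf in (1, 2, 5, 20)]
```

The first extra occurrence adds more than the jump from 5 to 20 occurrences: that asymptote at `k1 + 1` is the saturation the prose describes.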
Parameters¶
| Parameter | Default | Effect |
|---|---|---|
| `k1` | 1.5 | Term frequency saturation (0 = binary, ∞ = raw TF) |
| `b` | 0.75 | Length normalization (0 = none, 1 = full) |
Example¶
```
# Query: "invoice payment"
# Route: "Billing questions about invoices, payments, and refunds"
# (Assumes a 100-document corpus and stemming, so "invoice" matches "invoices".)

# Step 1: Tokenize
query_terms = ["invoice", "payment"]
doc_terms = ["billing", "questions", "about", "invoices",
             "payments", "and", "refunds"]

# Step 2: Calculate IDF (N = 100; df = 20 and 15 respectively)
idf("invoice") = log((100 - 20 + 0.5) / (20 + 0.5) + 1) ≈ 1.59
idf("payment") = log((100 - 15 + 0.5) / (15 + 0.5) + 1) ≈ 1.87

# Step 3: Calculate TF component (|D| = 7, avgdl = 10)
tf("invoice", doc) = 1
tf_component = 1 * (1.5 + 1) / (1 + 1.5 * (1 - 0.75 + 0.75 * 7/10))
             = 2.5 / 2.1625 ≈ 1.16

# Step 4: Sum scores
score = 1.59 * 1.16 + 1.87 * 1.16 ≈ 4.01
```
Optimization¶
```python
# Precompute IDF scores once, at index time
idf_cache = {}
for term in vocabulary:
    idf_cache[term] = calculate_idf(term)

# Fast lookup during query
def score_query(query, doc):
    score = 0
    for term in query:
        if term in idf_cache:
            tf = doc.count(term)
            score += idf_cache[term] * tf_component(tf, len(doc), avg_length)
    return score
```
When to Use¶
✅ Best for:

- Exact keyword matches needed
- Short documents (route descriptions)
- Domain-specific terminology

❌ Avoid when:

- Semantic similarity matters more than exact terms
- Paraphrasing is common
- No distinctive keywords are available
Isotonic Regression¶
Monotonic calibration for confidence scores.
Problem¶
Raw scores don't match actual accuracy:
```
Raw Score → Actual Accuracy
0.60      → 0.71  (underconfident)
0.70      → 0.76  (underconfident)
0.80      → 0.88  (slightly under)
0.90      → 0.91  (well calibrated)
```
Solution¶
Fit a monotonic curve that maps scores to accuracy:
```python
from sklearn.isotonic import IsotonicRegression

# Collect validation data
scores = [0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
accuracies = [0.71, 0.74, 0.76, 0.80, 0.88, 0.90, 0.91, 0.97]

# Fit monotonic curve
calibrator = IsotonicRegression(out_of_bounds='clip')
calibrator.fit(scores, accuracies)

# Calibrate new scores
raw_score = 0.73
calibrated = calibrator.predict([raw_score])[0]  # ≈ 0.78
```
Algorithm¶
1. Sort (score, accuracy) pairs by score
2. Pool adjacent violators (PAV algorithm):
- Scan from left to right
- If non-monotonic, average adjacent points
- Repeat until monotonic
3. Return piecewise constant function
PAV Algorithm¶
```python
def pool_adjacent_violators(x, y):
    """Pool Adjacent Violators Algorithm: isotonic fit of y ordered by x."""
    # Each block holds [sum, count]; its fitted value is sum / count.
    blocks = []
    for value in y:
        blocks.append([value, 1])
        # Pooling can cascade: merging two blocks may create a new
        # violation with the block before them, so keep merging
        # backwards until the block means are non-decreasing.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand blocks back into a piecewise-constant per-point solution
    solution = []
    for total, count in blocks:
        solution.extend([total / count] * count)
    return solution
```
Per-Route Calibration¶
Each route gets its own calibration curve:
```python
# Easy route (high baseline accuracy)
easy_calibrator = IsotonicRegression()
easy_calibrator.fit([0.6, 0.7, 0.8, 0.9], [0.82, 0.88, 0.93, 0.98])

# Hard route (lower baseline accuracy)
hard_calibrator = IsotonicRegression()
hard_calibrator.fit([0.6, 0.7, 0.8, 0.9], [0.65, 0.72, 0.81, 0.90])

# Apply the appropriate calibration
if route == 'easy':
    confidence = easy_calibrator.predict([raw_score])
else:
    confidence = hard_calibrator.predict([raw_score])
```
Benefits¶
- Accurate Confidence: Confidence = expected accuracy
- Maintains Ranking: Doesn't change route order
- Per-Route: Accounts for route difficulty
- Fast: O(log N) lookup with binary search
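The O(log N) lookup can be sketched with `bisect` over a piecewise-constant curve. The thresholds and calibrated values below are hypothetical, standing in for a fitted isotonic curve:

```python
from bisect import bisect_right

# Hypothetical piecewise-constant calibration curve:
# segment boundaries and one calibrated value per segment
thresholds = [0.65, 0.75, 0.85]
values = [0.72, 0.79, 0.89, 0.95]

def calibrate(raw_score):
    # O(log N): binary-search for the segment the raw score falls into
    return values[bisect_right(thresholds, raw_score)]
```

Because the curve is monotonic, this lookup preserves the ranking of raw scores while remapping them to observed accuracies.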
When to Use¶
✅ Always use for production systems where confidence matters
❌ Skip only for prototyping or when speed is absolutely critical (<1ms)
SIMD Optimization¶
Hardware-accelerated vector operations for speed.
Overview¶
StrataRouter Core uses SIMD (Single Instruction Multiple Data) operations to accelerate similarity computations by up to 10x. The implementation is found in src/algorithms/simd_ops.rs and automatically detects the best available instruction set.
Supported Operations¶
```rust
// Cosine similarity with SIMD acceleration
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(all(target_arch = "x86_64", target_feature = "avx2"))]
    {
        unsafe { cosine_similarity_avx2(a, b) }
    }
    #[cfg(not(all(target_arch = "x86_64", target_feature = "avx2")))]
    {
        cosine_similarity_scalar(a, b)
    }
}

// Scalar fallback for non-AVX2 systems
fn cosine_similarity_scalar(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```
Performance Gain¶
| Operation | Scalar | AVX2 | NEON | Speedup |
|---|---|---|---|---|
| Dot Product (384D) | 45ns | 12ns | 18ns | 2.5-3.8x |
| Cosine Similarity | 120ns | 35ns | 50ns | 2.4-3.4x |
| Euclidean Distance | 90ns | 28ns | 40ns | 2.3-3.2x |
Auto-Detection¶
// Automatically uses best available instruction set
let similarity = if is_x86_feature_detected!("avx2") {
cosine_similarity_avx2(a, b)
} else if is_x86_feature_detected!("sse4.1") {
cosine_similarity_sse(a, b)
} else {
cosine_similarity_scalar(a, b)
}
Requirements¶
- x86_64: AVX2 (2013+) or SSE4.1 (2006+)
- ARM: NEON (most ARM CPUs)
- Fallback: Scalar operations for all CPUs
Score Fusion¶
Combining multiple scores into a single confidence.
Weighted Average¶
```python
# Learned weights
w_dense = 0.64
w_sparse = 0.29
w_rule = 0.07

# Fusion
fused = w_dense * dense_score + w_sparse * sparse_score + w_rule * rule_score
```
Normalization¶
```python
# Sigmoid normalization
confidence = 1 / (1 + exp(-fused))

# Alternative: min-max scaling
confidence = (fused - min_score) / (max_score - min_score)

# Alternative: softmax (multi-route)
exp_scores = [exp(score) for score in all_scores]
confidence = exp(fused) / sum(exp_scores)
```
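Putting fusion and normalization together, a self-contained sketch (the weights match the defaults above; the per-signal scores and competing route scores are illustrative):

```python
import math

# Hypothetical per-signal scores for one route
w_dense, w_sparse, w_rule = 0.64, 0.29, 0.07
dense_score, sparse_score, rule_score = 0.91, 0.40, 1.0

fused = w_dense * dense_score + w_sparse * sparse_score + w_rule * rule_score

def sigmoid(x):
    # Squash any real-valued score into (0, 1)
    return 1 / (1 + math.exp(-x))

def softmax(scores):
    # Normalize a set of route scores into a probability distribution
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

confidence = sigmoid(fused)
probs = softmax([fused, 0.31, 0.12])  # vs. two other hypothetical routes
```

Sigmoid is the right choice for a single yes/no confidence; softmax is the right choice when the confidences across all routes should sum to 1.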
Learning Weights¶
```python
from sklearn.linear_model import LogisticRegression

# Training data
X = [[dense1, sparse1, rule1],
     [dense2, sparse2, rule2],
     ...]
y = [correct1, correct2, ...]  # 1 if correct, 0 if wrong

# Train
model = LogisticRegression()
model.fit(X, y)

# Extract weights
w_dense, w_sparse, w_rule = model.coef_[0]
```
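A runnable version of the weight-learning step with a small synthetic dataset. The data and labels below are made up for illustration; in practice they would come from routing logs:

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: each row is
# [dense_score, sparse_score, rule_score]; label 1 = routed correctly
X = [
    [0.9, 0.8, 1.0], [0.8, 0.7, 0.0], [0.7, 0.9, 1.0], [0.9, 0.6, 1.0],
    [0.3, 0.2, 0.0], [0.2, 0.4, 0.0], [0.4, 0.1, 0.0], [0.1, 0.3, 0.0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = LogisticRegression().fit(X, y)
w_dense, w_sparse, w_rule = model.coef_[0]
```

On real logs the relative magnitudes of the learned coefficients indicate how much each signal actually contributes to correct routing.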
Comparison¶
| Algorithm | Latency | Memory | Accuracy | Use Case |
|---|---|---|---|---|
| HNSW | 0.8ms | 50MB | 95% | Semantic search |
| BM25 | 0.5ms | 10MB | 85% | Keyword matching |
| Rules | 0.1ms | 2MB | 100% | Exact patterns |
| Isotonic | 0.1ms | 2MB | N/A | Calibration |
| SIMD | 0.01ms | 0MB | N/A | Acceleration |
Next Steps¶
- Routing Engine - How it all fits together
- Performance - Benchmarks and optimization
- API Reference - Implementation details