Frequently Asked Questions¶
Getting Started¶
What is StrataRouter?
StrataRouter is a high-performance semantic routing engine for LLM agents and RAG pipelines. It intelligently routes queries to the optimal destination based on meaning — not just keywords — with sub-10ms latency and 95%+ accuracy.
How is it different from semantic-router or llamaindex?
StrataRouter combines four advantages unavailable in alternatives:
- Hybrid Scoring — Dense embeddings + sparse BM25 + rule-based signals (not just embeddings)
- Speed — 20–40× faster (P99 latency of 8.7ms vs 178ms+ for alternatives)
- Accuracy — 95.4% vs 84–85% for alternatives
- Production-Ready — Semantic caching, batching, and observability built in
Can I use it with my LLM provider?
Yes. StrataRouter works with any embedding source and any LLM provider:
- OpenAI (GPT-4, GPT-5, text-embedding-3)
- Anthropic (Claude 4.5, Claude 3 series)
- Google (Gemini 3.1, text-embedding-004)
- Local models (Ollama, vLLM, sentence-transformers)
- Custom (any REST API)
Do I need to know Rust?
No. StrataRouter ships prebuilt Python bindings; the performance-critical routing core runs in Rust, so you get native speed from Python alone.
What do I need to get started?
- Python 3.8+ (3.9+ recommended)
- 5 minutes
pip install stratarouter
Installation & Setup¶
Which Python version do I need?
Python 3.8 or higher. We recommend 3.9+.
Should I use a virtual environment?
Yes — it prevents dependency conflicts:
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
pip install stratarouter
What if pip install fails?
Try these in order:
- Upgrade pip: `pip install --upgrade pip`
- Clear the pip cache: `pip install --no-cache-dir stratarouter`
- Install Linux build dependencies: `sudo apt-get install python3-dev build-essential`
- Still failing? Open an issue on GitHub
Can I build from source?
Yes, if you need the latest features. Building from source requires Rust; see the Installation Guide for full instructions.
How do I verify the installation?
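The verification snippet appears to be missing here. A quick check from the command line (the `__version__` attribute is an assumption about the package):

```shell
python -c "import stratarouter; print(stratarouter.__version__)"
```

If this prints a version number without errors, the installation is working.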
Usage & Configuration¶
How many routes can I have?
Technically unlimited, but practically:
- Best (3–50): Optimal accuracy and speed
- Good (50–200): Still very fast
- Fine (200+): Latency increases gradually
For 1000+ routes, use hierarchical routing (broad category → specific sub-route).
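The two-level idea can be sketched in plain Python. This is not the StrataRouter API, just an illustration of scoring a broad category first, then a sub-route within it; the keyword-overlap scorer stands in for real embedding similarity:

```python
def best_match(query_words, candidates):
    """Pick the candidate whose keywords overlap the query the most."""
    return max(candidates, key=lambda c: len(query_words & set(c["keywords"])))

categories = [
    {"id": "billing", "keywords": ["refund", "invoice", "charge"],
     "subroutes": [
         {"id": "billing_refunds", "keywords": ["refund", "chargeback"]},
         {"id": "billing_invoices", "keywords": ["invoice", "receipt"]},
     ]},
    {"id": "tech_support", "keywords": ["error", "crash", "bug"],
     "subroutes": [
         {"id": "support_install", "keywords": ["install", "setup"]},
         {"id": "support_runtime", "keywords": ["error", "crash"]},
     ]},
]

def route_hierarchical(query):
    words = set(query.lower().split())
    category = best_match(words, categories)        # level 1: broad category
    sub = best_match(words, category["subroutes"])  # level 2: specific route
    return sub["id"]

print(route_hierarchical("I need a refund for this charge"))  # billing_refunds
```

With two levels, each lookup only scores a few dozen candidates instead of all 1000+ routes at once.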
How do I write good route descriptions?
Be specific and include related concepts:
# Good — specific, detailed
Route(
    id="billing_refunds",
    description="Handle refund requests, payment disputes, and chargebacks",
    keywords=["refund", "dispute", "chargeback", "return"],
    examples=["I want a refund", "Dispute this charge", "Cancel and refund"]
)

# Bad — vague, no examples
Route(id="stuff", description="Various things", keywords=["help"])
What threshold should I use?
| Threshold | Behavior | Use Case |
|---|---|---|
| 0.3–0.5 | Permissive | Exploratory, high recall |
| 0.5–0.7 | Balanced | Production default |
| 0.7–0.9 | Conservative | High precision required |
| 0.9+ | Strict | Critical decisions only |
Start at 0.7 and tune based on your results.
Can I use custom embedding models?
Yes — provide embeddings from any model:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-mpnet-base-v2") # 768-dim, higher accuracy
router = Router(dimension=768) # Match model dimension
embeddings = model.encode(descriptions)
router.build_index(embeddings)
How do I handle low confidence scores?
- Improve descriptions — make them more specific and distinct
- Lower threshold — `Router(threshold=0.5)` instead of `0.7`
- Use a better model — `all-mpnet-base-v2` vs `all-MiniLM-L6-v2`
- Add more examples to routes
If a query consistently gets low confidence, it likely doesn't belong to any defined route — use a fallback handler.
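A fallback handler can be a thin wrapper around the router. A sketch in plain Python, assuming the result object exposes the `route_id` and `confidence` attributes shown in the monitoring answer elsewhere in this FAQ:

```python
def route_with_fallback(router, query, min_confidence=0.5, fallback_id="fallback"):
    """Route a query, falling back when no route clears the confidence bar."""
    result = router.route(query)
    if result.confidence < min_confidence:
        return fallback_id  # hand off to a generic handler or a human
    return result.route_id
```

The wrapper keeps fallback policy out of your route definitions, so you can tune `min_confidence` independently of the router's own threshold.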
What about multilingual queries?
Use a multilingual embedding model:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")
router = Router(dimension=768)
Supports English, Spanish, French, German, Chinese, Japanese, and 50+ languages.
Performance & Optimization¶
Why is my router slow?
StrataRouter routes in under 10ms. If it's slower:
- Did you call `router.build_index()`? (required before routing)
- Reduce the number of routes
- Use a smaller embedding dimension (384 vs 768)
- Enable SIMD: `Router(enable_simd=True)`
Debug by timing the routing call directly.
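A minimal timing helper in plain Python (not part of the StrataRouter API; it works with any object exposing a `route()` method):

```python
import time

def timed_route(router, query):
    """Time a single routing call and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = router.route(query)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms
```

If `elapsed_ms` is consistently well above 10ms, work through the checklist above, starting with `build_index()`.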
Can I use a GPU?
Yes, for embedding computation:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
model = model.to("cuda") # Use GPU for embeddings
Routing itself is CPU-optimized with SIMD vectorization.
How do I improve accuracy?
- More specific route descriptions
- Add 5–10 examples per route
- Use a higher-quality embedding model
- Lower threshold to reduce false negatives
- Add keywords for exact-term matching
What's the typical accuracy?
95.4% on standard benchmarks. Your actual accuracy depends on route description quality, query clarity, and threshold setting.
Integration & Deployment¶
How do I use it with LangChain?
See the LangChain Integration Guide. Quick example:
from stratarouter.integrations.langchain import StrataRouterChain
router_chain = StrataRouterChain(router)
result = router_chain.route("Where's my invoice?")
Can I deploy to production?
Yes. StrataRouter is production-ready. See Production Deployment for Docker, Kubernetes, AWS ECS, and bare-metal configurations.
How do I monitor routing decisions?
result = router.route(query)
print(f"Route: {result.route_id}")
print(f"Score: {result.confidence:.3f}")
print(f"Latency: {result.latency_ms:.2f}ms")
For production, enable Prometheus metrics and OpenTelemetry tracing. See Monitoring Guide.
Is it cost-effective?
Yes. Key cost benefits:
- Semantic Caching — 70–80% LLM cost reduction through cache hits
- Model Selection — Route simple queries to cheaper models
- Batch Processing — 3–5× throughput improvement reduces per-query cost
Typical ROI: pays for itself within weeks.
Features¶
Does it support semantic caching?
Yes:
config = RuntimeConfig(cache_enabled=True, cache_backend="redis")
bridge = CoreRuntimeBridge(config)
Typical production hit rate: 85%+.
What LLM providers are supported?
- OpenAI (GPT-4, GPT-5, all series)
- Anthropic (Claude 4.5, Claude 3 series)
- Google (Gemini 3.1, all series)
- Local models (Ollama, vLLM, LM Studio)
- Custom providers via REST API
Can I use my own embeddings?
Yes:
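The snippet for this answer appears to have been lost. Mirroring the pattern from the custom-embedding-model answer above (the `Router` import path and the file name are assumptions for illustration):

```python
import numpy as np
from stratarouter import Router  # import path assumed

# Vectors from any source work: an API, a local model, or precomputed files
embeddings = np.load("route_embeddings.npy")   # hypothetical file, shape (n_routes, 384)
router = Router(dimension=embeddings.shape[1])  # dimension must match the vectors
router.build_index(embeddings)
```

The only hard requirement is that every vector has the same dimension as the router's `dimension` setting.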
Enterprise¶
Is there enterprise support?
Yes. Enterprise customers receive:
- 24/7 priority support with SLA guarantees
- Dedicated Customer Success Manager
- HIPAA/GDPR/SOC 2 compliance pack
- Multi-tenancy and SSO/SAML
- Custom integration development
Contact: enterprise@stratarouter.dev
Is it HIPAA / GDPR compliant?
Yes. StrataRouter Enterprise supports HIPAA, GDPR, SOC 2 Type II, and ISO 27001. See Compliance.
What's the pricing?
- Open Source: Free (MIT license) — full Core and Runtime
- Enterprise: Custom pricing based on routing volume, compliance needs, and support tier
See Pricing for the full feature comparison.
Migration¶
How do I migrate from semantic-router?
See Migration Guide for step-by-step instructions.
Can I import existing routes?
Yes:
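The snippet here was also lost. A sketch of converting semantic-router-style route dicts into the `Route(...)` shape shown earlier in this FAQ (the legacy field names `name` and `utterances` are assumptions):

```python
def convert_legacy(route):
    """Map a semantic-router-style dict to StrataRouter Route keyword args."""
    return {
        "id": route["name"],
        "description": route.get("description", route["name"]),
        "examples": route.get("utterances", []),
    }

legacy = [{"name": "billing", "utterances": ["I want a refund", "Dispute this charge"]}]
kwargs = [convert_legacy(r) for r in legacy]
# Then: routes = [Route(**kw) for kw in kwargs]
print(kwargs[0]["id"])  # billing
```

Defaulting `description` to the route name keeps the conversion lossless even when the legacy routes lack descriptions, though you should still write proper descriptions afterwards for accuracy.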
Troubleshooting¶
ImportError: No module named 'stratarouter'
First make sure the package is installed (`pip install stratarouter`). If it still fails, check that you're in the correct virtual environment:
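On macOS/Linux, these two commands confirm which interpreter you're using and whether it can see the package:

```shell
which python                      # should point inside your venv
python -m pip show stratarouter   # confirms the package is installed for this interpreter
```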
ValueError: dimension mismatch
Your embedding dimension doesn't match the router configuration:
# Correct
router = Router(dimension=384)
embeddings = model.encode(texts) # Must return 384-dim vectors
Routes aren't matching correctly
- Better descriptions — make them specific and distinct
- Lower threshold — try `0.5` instead of `0.7`
- Print all scores for debugging:

results = router.route(query, return_all_scores=True)
for route_id, score in results:
    print(f"{route_id}: {score:.3f}")
Memory usage is high
- Reduce embedding dimension — use 384 instead of 768
- Delete unused routes
- Use `all-MiniLM-L6-v2` instead of `all-mpnet-base-v2`
- Reduce the HNSW M parameter
Licensing¶
What license is StrataRouter under?
MIT License — free for commercial use, no restrictions.
Can I use it in my SaaS product?
Yes, no restrictions. You do not need to open-source your application code.
Community¶
How do I get help?
- Discord — Community chat
- GitHub Discussions — Q&A
- GitHub Issues — Bug reports
Can I contribute?
Yes! See the Contributing Guide.
Still have questions? Ask on GitHub Discussions or join our Discord.