Frequently Asked Questions


Getting Started

What is StrataRouter?

StrataRouter is a high-performance semantic routing engine for LLM agents and RAG pipelines. It intelligently routes queries to the optimal destination based on meaning — not just keywords — with sub-10ms latency and 95%+ accuracy.

How is it different from semantic-router or llamaindex?

StrataRouter combines four innovations unavailable in alternatives:

  1. Hybrid Scoring — Dense embeddings + sparse BM25 + rule-based signals (not just embeddings)
  2. Speed — 20–40× faster (8.7ms vs 178ms+ average P99)
  3. Accuracy — 95.4% vs 84–85% for alternatives
  4. Production-Ready — Semantic caching, batching, and observability built-in
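
To make the idea of hybrid scoring concrete, here is a minimal sketch in plain Python. It is not StrataRouter's internal formula: the `hybrid_score` helper, the weights, and the simple term-overlap stand-in for BM25 are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_overlap(query, keywords):
    """Fraction of route keywords found in the query (a crude stand-in for BM25)."""
    tokens = set(query.lower().split())
    if not keywords:
        return 0.0
    return sum(1 for k in keywords if k.lower() in tokens) / len(keywords)

def hybrid_score(query, query_vec, route_vec, keywords, rule_hit,
                 w_dense=0.6, w_sparse=0.3, w_rule=0.1):
    """Weighted sum of dense, sparse, and rule signals (weights are hypothetical)."""
    dense = cosine(query_vec, route_vec)
    sparse = keyword_overlap(query, keywords)
    rule = 1.0 if rule_hit else 0.0
    return w_dense * dense + w_sparse * sparse + w_rule * rule

score = hybrid_score(
    "i want a refund",
    query_vec=[1.0, 0.0],   # toy 2-dim embeddings
    route_vec=[1.0, 0.0],
    keywords=["refund", "dispute"],
    rule_hit=False,
)
print(round(score, 2))
```

The point of the sketch is that a query can still score well when one signal is weak, for example when the embedding match is mediocre but an exact keyword is present.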

Can I use it with my LLM provider?

Yes. StrataRouter works with any embedding source and any LLM provider:

  • OpenAI (GPT-4, GPT-5, text-embedding-3)
  • Anthropic (Claude 4.5, Claude 3 series)
  • Google (Gemini 3.1, text-embedding-004)
  • Local models (Ollama, vLLM, sentence-transformers)
  • Custom (any REST API)

Do I need to know Rust?

No. StrataRouter provides Python bindings that are just as fast as the Rust API. You only need Python.

What do I need to get started?

  • Python 3.8+ (3.9+ recommended)
  • 5 minutes
  • pip install stratarouter

Installation & Setup

Which Python version do I need?

Python 3.8 or higher. We recommend 3.9+.

python --version

Should I use a virtual environment?

Yes — it prevents dependency conflicts:

python -m venv venv
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate      # Windows
pip install stratarouter

What if pip install fails?

Try in order:

  1. Upgrade pip: pip install --upgrade pip
  2. Clear cache: pip install --no-cache-dir stratarouter
  3. Linux system packages: sudo apt-get install python3-dev build-essential
  4. Open an issue on GitHub

Can I build from source?

Yes, if you need the latest features:

git clone https://github.com/agentdyne9/stratarouter
cd stratarouter
pip install -e ".[dev]"

Building from source requires Rust. See the Installation Guide for full instructions.

How do I verify the installation?

python -c "import stratarouter; print(stratarouter.__version__)"

Usage & Configuration

How many routes can I have?

Technically unlimited, but practically:

  • Best (3–50): Optimal accuracy and speed
  • Good (50–200): Still very fast
  • Fine (200+): Latency increases gradually

For 1000+ routes, use hierarchical routing (broad category → specific sub-route).
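
A rough sketch of the two-stage idea, assuming a hypothetical `ROUTE_TREE` layout and a toy keyword-overlap score in place of real router scores. In practice each stage would be its own router call; none of the names below come from the StrataRouter API.

```python
def overlap(query, keywords):
    """Count how many keywords appear in the query (toy scoring function)."""
    tokens = set(query.lower().split())
    return sum(1 for k in keywords if k in tokens)

# Hypothetical two-level route tree: category -> sub-route -> keywords.
ROUTE_TREE = {
    "billing": {
        "keywords": ["invoice", "refund", "charge", "payment"],
        "routes": {
            "billing_refunds": ["refund", "chargeback", "dispute"],
            "billing_invoices": ["invoice", "receipt"],
        },
    },
    "support": {
        "keywords": ["error", "crash", "login", "password"],
        "routes": {
            "support_login": ["login", "password"],
            "support_crash": ["crash", "error"],
        },
    },
}

def route_hierarchical(query):
    """Pick the best category first, then score only that category's sub-routes."""
    category = max(ROUTE_TREE, key=lambda c: overlap(query, ROUTE_TREE[c]["keywords"]))
    sub_routes = ROUTE_TREE[category]["routes"]
    best = max(sub_routes, key=lambda r: overlap(query, sub_routes[r]))
    return category, best

print(route_hierarchical("please refund this charge"))
```

Because the second stage only compares routes inside one category, the number of candidates per query stays small even when the total route count is large.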

How do I write good route descriptions?

Be specific and include related concepts:

# Good — specific, detailed
Route(
    id="billing_refunds",
    description="Handle refund requests, payment disputes, and chargebacks",
    keywords=["refund", "dispute", "chargeback", "return"],
    examples=["I want a refund", "Dispute this charge", "Cancel and refund"]
)

# Bad — vague, no examples
Route(id="stuff", description="Various things", keywords=["help"])

What threshold should I use?

Threshold   Behavior       Use Case
0.3–0.5     Permissive     Exploratory, high recall
0.5–0.7     Balanced       Production default
0.7–0.9     Conservative   High precision required
0.9+        Strict         Critical decisions only

Start at 0.7 and tune based on your results.
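
One way to tune is to sweep a few thresholds over a small labeled validation set and compare precision and recall. The sketch below is plain Python; the sample data and the `precision_recall` helper are made up for illustration, and "recall" here means the share of correct predictions that survive the threshold.

```python
def precision_recall(samples, threshold):
    """samples: list of (confidence, is_correct) pairs from a labeled validation set."""
    accepted = [ok for conf, ok in samples if conf >= threshold]
    true_pos = sum(accepted)
    total_correct = sum(ok for _, ok in samples)
    precision = true_pos / len(accepted) if accepted else 1.0
    recall = true_pos / total_correct if total_correct else 0.0
    return precision, recall

# Hypothetical validation data: (router confidence, was the top route correct?)
samples = [(0.92, True), (0.81, True), (0.74, True), (0.66, False),
           (0.58, True), (0.45, False)]

for t in (0.5, 0.7, 0.9):
    p, r = precision_recall(samples, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold trades recall for precision, so pick the lowest value that keeps wrong matches at a level you can tolerate.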

Can I use custom embedding models?

Yes — provide embeddings from any model:

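from sentence_transformers import SentenceTransformer
from stratarouter import Router

model = SentenceTransformer("all-mpnet-base-v2")  # 768-dim, higher accuracy
router = Router(dimension=768)  # Must match the model's output dimension
descriptions = ["Handle refund requests", "Answer shipping questions"]  # your route descriptions
embeddings = model.encode(descriptions)  # shape: (len(descriptions), 768)
router.build_index(embeddings)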

How do I handle low confidence scores?

  1. Improve descriptions — make them more specific and distinct
  2. Lower threshold — Router(threshold=0.5) instead of 0.7
  3. Use a better model — all-mpnet-base-v2 vs all-MiniLM-L6-v2
  4. Add more examples to routes

If a query consistently gets low confidence, it likely doesn't belong to any defined route — use a fallback handler.
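
A fallback handler can be as simple as a threshold check before dispatch. This is a minimal sketch: `RouteResult` is a stand-in with only the `route_id` and `confidence` fields, and `dispatch`, `handlers`, and `fallback` are hypothetical names, not part of the library.

```python
from dataclasses import dataclass

@dataclass
class RouteResult:
    """Stand-in for the router's result object (only the fields used here)."""
    route_id: str
    confidence: float

def dispatch(result, handlers, fallback, threshold=0.7):
    """Call the matched handler, or the fallback when confidence is too low."""
    handler = handlers.get(result.route_id)
    if handler is None or result.confidence < threshold:
        return fallback(result)
    return handler(result)

handlers = {"billing_refunds": lambda r: "refund flow"}
fallback = lambda r: "handing off to a human agent"

print(dispatch(RouteResult("billing_refunds", 0.91), handlers, fallback))
print(dispatch(RouteResult("billing_refunds", 0.42), handlers, fallback))
```

Logging the queries that reach the fallback is a cheap way to discover routes you have not defined yet.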

What about multilingual queries?

Use a multilingual embedding model:

from sentence_transformers import SentenceTransformer
from stratarouter import Router

model = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")  # 768-dim output
router = Router(dimension=768)

This model supports more than 50 languages, including English, Spanish, French, German, Chinese, and Japanese.


Performance & Optimization

Why is my router slow?

StrataRouter routes in under 10ms. If it's slower:

  1. Did you call router.build_index()? (required before routing)
  2. Reduce the number of routes
  3. Use smaller embedding dimension (384 vs 768)
  4. Enable SIMD: Router(enable_simd=True)

Debug with:

result = router.route(query)
print(f"Latency: {result.latency_ms}ms")
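
A single reading can be noisy, so it may help to time many calls and look at percentiles. The sketch below uses only the standard library; `measure_latency_ms` is a hypothetical helper, and the lambda is a placeholder for `router.route`.

```python
import statistics
import time

def measure_latency_ms(fn, queries, warmup=5):
    """Time fn(query) for each query and return (p50, p99) in milliseconds."""
    for q in queries[:warmup]:
        fn(q)  # warm caches before timing
    timings = []
    for q in queries:
        start = time.perf_counter()
        fn(q)
        timings.append((time.perf_counter() - start) * 1000.0)
    cuts = statistics.quantiles(timings, n=100)  # 99 cut points
    return cuts[49], cuts[98]  # 50th and 99th percentiles

# Replace the lambda with router.route once your router is built.
p50, p99 = measure_latency_ms(lambda q: sorted(q), ["where is my invoice?"] * 200)
print(f"p50={p50:.3f}ms p99={p99:.3f}ms")
```

If the end-to-end number is high but `result.latency_ms` is low, the time is probably going into embedding computation rather than routing.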

Can I use a GPU?

Yes, for embedding computation:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
model = model.to("cuda")  # Use GPU for embeddings

Routing itself is CPU-optimized with SIMD vectorization.

How do I improve accuracy?

  1. More specific route descriptions
  2. Add 5–10 examples per route
  3. Use a higher-quality embedding model
  4. Lower threshold to reduce false negatives
  5. Add keywords for exact-term matching

What's the typical accuracy?

95.4% on standard benchmarks. Your actual accuracy depends on route description quality, query clarity, and threshold setting.


Integration & Deployment

How do I use it with LangChain?

See the LangChain Integration Guide. Quick example:

from stratarouter.integrations.langchain import StrataRouterChain

router_chain = StrataRouterChain(router)
result = router_chain.route("Where's my invoice?")

Can I deploy to production?

Yes. StrataRouter is production-ready. See Production Deployment for Docker, Kubernetes, AWS ECS, and bare-metal configurations.

How do I monitor routing decisions?

result = router.route(query)
print(f"Route: {result.route_id}")
print(f"Score: {result.confidence:.3f}")
print(f"Latency: {result.latency_ms:.2f}ms")
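
For anything beyond ad-hoc debugging, structured log lines are easier to search than print output. A minimal sketch using the standard `logging` and `json` modules follows; `decision_record` is a hypothetical helper, and the `SimpleNamespace` stands in for the object returned by `router.route(query)`.

```python
import json
import logging
from types import SimpleNamespace

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("routing")

def decision_record(query, result):
    """Build a JSON-serializable record of one routing decision."""
    return {
        "query": query,
        "route_id": result.route_id,
        "confidence": round(result.confidence, 3),
        "latency_ms": round(result.latency_ms, 2),
    }

# Stand-in for the object returned by router.route(query).
result = SimpleNamespace(route_id="billing_invoices", confidence=0.8731, latency_ms=3.456)
record = decision_record("Where's my invoice?", result)
log.info(json.dumps(record))  # one JSON object per line
```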

For production, enable Prometheus metrics and OpenTelemetry tracing. See Monitoring Guide.

Is it cost-effective?

Yes. Key cost benefits:

  • Semantic Caching — 70–80% LLM cost reduction through cache hits
  • Model Selection — Route simple queries to cheaper models
  • Batch Processing — 3–5× throughput improvement reduces per-query cost

Typical ROI: pays for itself within weeks.
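
To estimate the caching effect for your own traffic, the arithmetic is simple: only cache misses reach the LLM. The query volume and per-call price below are made-up numbers, and `monthly_llm_cost` is a hypothetical helper.

```python
def monthly_llm_cost(queries, cost_per_call, cache_hit_rate):
    """Only cache misses reach the LLM, so cost scales with (1 - hit rate)."""
    return queries * (1.0 - cache_hit_rate) * cost_per_call

# Hypothetical workload: 1M queries per month at $0.002 per LLM call.
baseline = monthly_llm_cost(1_000_000, 0.002, cache_hit_rate=0.0)
cached = monthly_llm_cost(1_000_000, 0.002, cache_hit_rate=0.75)
print(f"baseline=${baseline:,.0f} cached=${cached:,.0f} saved={1 - cached / baseline:.0%}")
```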


Features

Does it support semantic caching?

Yes:

config = RuntimeConfig(cache_enabled=True, cache_backend="redis")
bridge = CoreRuntimeBridge(config)

Typical production hit rate: 85%+.
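
For intuition about what a semantic cache does, here is a toy in-memory version in plain Python. It is a sketch of the general technique, not StrataRouter's implementation: the `SemanticCache` class, its linear scan, and the 0.95 similarity cutoff are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class SemanticCache:
    """Toy cache keyed by embedding similarity instead of exact text."""

    def __init__(self, min_similarity=0.95):
        self.min_similarity = min_similarity
        self.entries = []  # list of (embedding, response)

    def get(self, embedding):
        """Return the closest cached response, or None if nothing is similar enough."""
        best = max(self.entries, key=lambda e: cosine(embedding, e[0]), default=None)
        if best is not None and cosine(embedding, best[0]) >= self.min_similarity:
            return best[1]
        return None

    def put(self, embedding, response):
        """Store a response under the embedding of the query that produced it."""
        self.entries.append((embedding, response))

cache = SemanticCache()
cache.put([0.9, 0.1, 0.0], "Your invoice is under Billing > History.")
print(cache.get([0.88, 0.12, 0.01]))  # near-duplicate query: cache hit
print(cache.get([0.0, 0.1, 0.9]))     # unrelated query: cache miss (None)
```

A stricter cutoff lowers the hit rate but reduces the risk of returning an answer to a subtly different question.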

What LLM providers are supported?

  • OpenAI (GPT-4, GPT-5, all series)
  • Anthropic (Claude 4.5, Claude 3 series)
  • Google (Gemini 3.1, all series)
  • Local models (Ollama, vLLM, LM Studio)
  • Custom providers via REST API

Can I use my own embeddings?

Yes:

embeddings = your_encoder.encode(route_descriptions)
router.build_index(embeddings)

Enterprise

Is there enterprise support?

Yes. Enterprise customers receive:

  • 24/7 priority support with SLA guarantees
  • Dedicated Customer Success Manager
  • HIPAA/GDPR/SOC 2 compliance pack
  • Multi-tenancy and SSO/SAML
  • Custom integration development

Contact: enterprise@stratarouter.dev

Is it HIPAA / GDPR compliant?

Yes. StrataRouter Enterprise supports HIPAA, GDPR, SOC 2 Type II, and ISO 27001. See Compliance.

What's the pricing?

  • Open Source: Free (MIT license) — full Core and Runtime
  • Enterprise: Custom pricing based on routing volume, compliance needs, and support tier

See Pricing for the full feature comparison.


Migration

How do I migrate from semantic-router?

See Migration Guide for step-by-step instructions.

Can I import existing routes?

Yes:

from stratarouter import Router
router = Router.from_semantic_router("routes.json")

Troubleshooting

ImportError: No module named 'stratarouter'

pip install stratarouter
python -c "import stratarouter"  # Verify

If still failing, check you're in the correct virtual environment:

which python
pip list | grep stratarouter
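
The shell commands above are macOS/Linux-oriented. The same check can be done from Python on any platform using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util
import sys

def is_installed(package):
    """True if the package can be imported from the current interpreter."""
    return importlib.util.find_spec(package) is not None

print("Interpreter:", sys.executable)
print("In a virtual environment:", sys.prefix != sys.base_prefix)
print("stratarouter importable:", is_installed("stratarouter"))
```

If the interpreter path is not inside your venv directory, activate the environment and reinstall.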

ValueError: dimension mismatch

Your embedding dimension doesn't match the router configuration:

# Correct
router = Router(dimension=384)
embeddings = model.encode(texts)  # Must return 384-dim vectors
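
Checking vector lengths before building the index gives a clearer error message than a failure deep inside the router. A small sketch, with `check_dimension` as a hypothetical helper; it works on lists of lists and on 2-D NumPy arrays alike because it only calls `len` on each row.

```python
def check_dimension(embeddings, expected_dim):
    """Raise a descriptive error if any vector has the wrong length."""
    for i, vec in enumerate(embeddings):
        if len(vec) != expected_dim:
            raise ValueError(
                f"embedding {i} has dimension {len(vec)}, expected {expected_dim}"
            )
    return True

check_dimension([[0.1] * 384, [0.2] * 384], expected_dim=384)  # passes silently
```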

Routes aren't matching correctly

  1. Better descriptions — make them specific and distinct
  2. Lower threshold — try 0.5 instead of 0.7
  3. Print all scores for debugging:
results = router.route(query, return_all_scores=True)
for route_id, score in results:
    print(f"{route_id}: {score:.3f}")

Memory usage is high

  1. Reduce embedding dimension — use 384 instead of 768
  2. Delete unused routes
  3. Use all-MiniLM-L6-v2 instead of all-mpnet-base-v2
  4. Reduce HNSW M parameter
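
To see how much the embedding dimension matters, you can estimate the memory taken by the raw vectors. This back-of-the-envelope sketch assumes float32 storage (4 bytes per value) and ignores index overhead such as HNSW graph links; `vector_memory_mb` is a hypothetical helper.

```python
def vector_memory_mb(num_vectors, dimension, bytes_per_value=4):
    """Memory for the raw vectors alone; float32 uses 4 bytes per value."""
    return num_vectors * dimension * bytes_per_value / (1024 ** 2)

for dim in (384, 768):
    print(f"{dim}-dim, 10,000 vectors: {vector_memory_mb(10_000, dim):.1f} MB")
```

Halving the dimension halves this figure, which is why switching to a 384-dim model is listed first.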

Licensing

What license is StrataRouter under?

MIT License. It is free for commercial use; the only condition is that copies keep the copyright and license notice.

Can I use it in my SaaS product?

Yes. The MIT license does not require you to open-source your application code.


Community

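How do I get help?

Ask usage questions on GitHub Discussions or Discord, and open a GitHub issue for bugs or installation failures.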

Can I contribute?

Yes! See the Contributing Guide.


Still have questions? Ask on GitHub Discussions or join our Discord.