Routing Mechanics Guide
Overview
This guide explains how the Haya Routing Service determines the best route for a user query. It covers:
- Synonyms: How query expansion works
- Keywords: How keyword-based routing works
- Query Intent: How intent detection and boosting works
- Context: How context fields impact matching
- Weights: How field weights affect embeddings
- Confidence Scores: How confidence is calculated
Routing Flow
User Query
↓
1. Generate Query Embedding (with synonyms)
↓
2. Match Root Nodes (Level 0)
├─ Semantic Similarity
├─ Keyword Boosts/Penalties
├─ Query Intent Boosts
└─ Synonym Re-embedding
↓
3. Select Best Root Node
↓
4. Traverse to Children (Level 1)
├─ Semantic Matching
└─ Access Control
↓
5. Traverse to Collections (Level 2)
├─ Semantic Matching
├─ Access Control
└─ Final Route Selection
↓
6. Return Route with Confidence Score
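The flow above amounts to a greedy walk down the hierarchy: pick the best root, then the best permitted child at each level. The sketch below illustrates this under simplifying assumptions. `RouteNode`, `pickBest`, and `traverse` are hypothetical names, `similarity` stands in for the score the matcher would already have computed (after boosts), and `allowed` stands in for the access-control decision.

```typescript
// Hypothetical node shape, not the service's real data model.
interface RouteNode {
  id: string;
  similarity: number; // precomputed score for the current query
  allowed?: boolean;  // access-control result; treated as true when omitted
  children?: RouteNode[];
}

// Return the highest-scoring candidate, optionally skipping denied nodes.
function pickBest(candidates: RouteNode[], enforceAccess: boolean): RouteNode | undefined {
  let best: RouteNode | undefined;
  for (const node of candidates) {
    if (enforceAccess && node.allowed === false) continue;
    if (best === undefined || node.similarity > best.similarity) best = node;
  }
  return best;
}

// Walk Level 0 → Level 1 → Level 2, keeping the best node at each level.
function traverse(roots: RouteNode[]): string[] {
  const path: string[] = [];
  let candidates = roots;
  for (let level = 0; level <= 2 && candidates.length > 0; level++) {
    // The flow lists access control only for Levels 1 and 2.
    const best = pickBest(candidates, level > 0);
    if (best === undefined) break;
    path.push(best.id);
    candidates = best.children ?? [];
  }
  return path;
}
```

A real implementation would also carry the confidence score along the path and stop when a level falls below its threshold; this sketch only shows the selection order.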
Synonyms
What Are Synonyms?
Synonyms are word/phrase mappings that expand queries to improve semantic matching. They're stored at the node level (not in context) and are level-based.
Synonym Structure
{
query_synonyms: {
"show me": "find search locate get retrieve display list",
"how do i": "how to how can i how should i",
"troubleshoot": "troubleshooting fix resolve debug diagnose repair",
"documentation": "docs documents files records"
}
}
How Synonyms Work
- Extraction: Synonyms are extracted from nodes in the hierarchy path
- Merging: Synonyms from all nodes in the path are merged
- Query Expansion: Query is expanded with synonyms before embedding
- Re-embedding: Query is re-embedded with service-specific synonyms for better matching
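The merging step can be sketched as follows. The guide does not say how conflicts are resolved when two nodes in the path define the same phrase, so this sketch assumes the expansions are concatenated in path order and that phrase keys are compared case-insensitively. `PathNode` is a hypothetical shape; only `query_synonyms` comes from the guide.

```typescript
type SynonymMap = Record<string, string>;

// Hypothetical node shape for illustration.
interface PathNode {
  name: string;
  query_synonyms?: SynonymMap;
}

// Merge synonyms from every node in the path, root first.
// Assumption: when a phrase repeats, later expansions are appended
// unless the existing string already contains them.
function mergeSynonymsFromPath(path: PathNode[]): SynonymMap {
  const merged: SynonymMap = {};
  for (const node of path) {
    const own = node.query_synonyms ?? {};
    for (const phrase of Object.keys(own)) {
      const key = phrase.toLowerCase();
      const incoming = (own[phrase] ?? "").trim();
      const existing = merged[key];
      if (existing === undefined) {
        merged[key] = incoming;
      } else if (incoming.length > 0 && existing.indexOf(incoming) === -1) {
        merged[key] = `${existing} ${incoming}`;
      }
    }
  }
  return merged;
}
```

If the real service lets deeper nodes override their ancestors instead, the `else if` branch would simply assign `incoming`.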
Synonym Flow
// Step 1: Extract synonyms from hierarchy path
const synonyms = mergeSynonymsFromPath(hierarchyPath);
// Step 2: Expand query
const expandedQuery = expandQuery(query, synonyms);
// "show me servers" → "show me servers find search locate get retrieve display list servers"
// Step 3: Generate embedding with expanded query
const embedding = await embeddingService.embed({
text: expandedQuery,
text_type: 'query',
synonyms: synonyms
});
// Step 4: Re-embed with service-specific synonyms for better matching
const serviceSynonyms = extractSynonymsFromNode(rootNode);
const serviceEmbedding = await embeddingService.embed({
text: query,
text_type: 'query',
synonyms: serviceSynonyms
});
Synonym Impact
- Improves Matching: Expands query to match more variations
- Context-Aware: Uses synonyms from the hierarchy path
- Service-Specific: Re-embeds with service synonyms for better root node matching
- Confidence Boost: Can increase confidence by 15-25%
Example
// Query: "show me servers"
// Synonyms: { "show me": "find search locate get retrieve display list" }
// Expanded: "show me servers find search locate get retrieve display list servers"
// Result: Better match with "List all servers" in node description
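The expansion shown above is consistent with appending a rewritten copy of the query in which each known phrase is replaced by its synonym string. The sketch below reproduces that behaviour. It is one plausible reading, not the service's actual `expandQuery`, and it assumes the synonym keys are lowercase. `escapeRegExp` is a helper introduced here.

```typescript
type SynonymMap = Record<string, string>;

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Append a copy of the query with each matched phrase swapped for its synonyms.
// A single pass over the query avoids re-expanding words that were just inserted.
function expandQuery(query: string, synonyms: SynonymMap): string {
  const phrases = Object.keys(synonyms).sort((a, b) => b.length - a.length);
  if (phrases.length === 0) return query;
  const pattern = new RegExp(phrases.map(escapeRegExp).join("|"), "gi");
  const rewritten = query.replace(pattern, (hit) => synonyms[hit.toLowerCase()] ?? hit);
  // No phrase matched: leave the query untouched.
  return rewritten === query ? query : `${query} ${rewritten}`;
}
```

Matching is by substring, so a key such as "troubleshoot" would also fire inside "troubleshooting". Adding word boundaries to the pattern is one way to tighten that if it causes noisy expansions.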
Keywords
What Are Keywords?
Keywords are boost/penalty terms that adjust similarity scores based on query content. They're stored in service_context.routing_keywords (Level 0 only).
Keyword Structure
{
routing_keywords: {
boost_keywords: ["search", "find", "documentation", "docs"],
penalty_keywords: ["api", "endpoint", "sql", "database"],
boost_value: 0.2, // Default: 0.3
penalty_value: -0.15 // Default: -0.2
}
}
How Keywords Work
- Keyword Detection: Check if query contains boost/penalty keywords
- Boost Application: Add boost_value to similarity if boost keywords are found
- Penalty Application: Add penalty_value (a negative number) to similarity if penalty keywords are found
- Cross-Service Penalty: Apply the penalty if other services' boost keywords are present
Keyword Flow
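// Step 1: Get routing keywords from service context
const routingKeywords = serviceContext?.routing_keywords;
const boostKeywords = routingKeywords?.boost_keywords ?? [];
const penaltyKeywords = routingKeywords?.penalty_keywords ?? [];
const boostValue = routingKeywords?.boost_value ?? 0.3;      // default boost
const penaltyValue = routingKeywords?.penalty_value ?? -0.2; // default penalty (negative)
const queryLower = query.toLowerCase();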
// Step 2: Check for boost keywords
const hasBoostKeywords = boostKeywords.some(keyword =>
queryLower.includes(keyword.toLowerCase())
);
// Step 3: Apply boost
if (hasBoostKeywords) {
similarity = Math.min(1.0, similarity + boostValue);
}
// Step 4: Check for penalty keywords
const hasPenaltyKeywords = penaltyKeywords.some(keyword =>
queryLower.includes(keyword.toLowerCase())
);
// Step 5: Apply penalty
if (hasPenaltyKeywords) {
similarity = Math.max(0, similarity + penaltyValue);
}
// Step 6: Cross-service penalty
// If another service's boost keyword is present in the query and is not
// also a boost keyword of the current service, apply the penalty
for (const otherService of otherServices) {
  const hasForeignKeyword = otherService.boostKeywords.some(k =>
    queryLower.includes(k.toLowerCase()) &&
    !currentService.boostKeywords.includes(k)
  );
  if (hasForeignKeyword) {
    similarity = Math.max(0, similarity + penaltyValue);
  }
}
Keyword Impact
- Boosts Matching: Increases similarity when relevant keywords are found
- Prevents Mismatches: Decreases similarity when irrelevant keywords are found
- Cross-Service Logic: Prevents SQL from matching when REST/MCP keywords are present
- Configurable: Boost/penalty values are configurable per service
Example
// Query: "search for documentation"
// Service: Document Search Service
// Boost Keywords: ["search", "documentation"]
// Result: similarity += 0.2 (boost applied)
// Query: "query database using sql"
// Service: Document Search Service
// Penalty Keywords: ["sql", "database", "query"]
// Result: similarity -= 0.15 (penalty applied)
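For experimenting with keyword settings, the steps above can be collected into one pure function. This is a sketch rather than the service's code: `applyKeywordAdjustments` and `containsAny` are hypothetical names, and the defaults (0.3 and -0.2) are taken from the comments in the keyword structure.

```typescript
interface RoutingKeywords {
  boost_keywords: string[];
  penalty_keywords: string[];
  boost_value?: number;   // assumed default: 0.3
  penalty_value?: number; // assumed default: -0.2 (negative)
}

// Case-insensitive substring check, as in the flow above.
function containsAny(queryLower: string, keywords: string[]): boolean {
  return keywords.some((keyword) => queryLower.indexOf(keyword.toLowerCase()) !== -1);
}

function applyKeywordAdjustments(
  similarity: number,
  query: string,
  own: RoutingKeywords,
  otherServices: RoutingKeywords[] = []
): number {
  const queryLower = query.toLowerCase();
  const boost = own.boost_value ?? 0.3;
  const penalty = own.penalty_value ?? -0.2;
  let score = similarity;
  if (containsAny(queryLower, own.boost_keywords)) score = Math.min(1.0, score + boost);
  if (containsAny(queryLower, own.penalty_keywords)) score = Math.max(0, score + penalty);
  // Cross-service penalty: once per other service whose distinct boost keywords appear.
  for (const other of otherServices) {
    const foreign = other.boost_keywords.filter((k) => own.boost_keywords.indexOf(k) === -1);
    if (containsAny(queryLower, foreign)) score = Math.max(0, score + penalty);
  }
  return score;
}
```

Because matching is by substring, a short keyword such as "api" also matches inside words like "rapid" or "capital". Longer or more specific keywords reduce accidental hits.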
Query Intent
What Is Query Intent?
Query intent is the detected purpose of a query. The router uses it to boost services that suit that purpose and penalize services that do not.
Query Intent Types
- documentation: User wants documentation/search results
- data_query: User wants to query a database
- api_call: User wants to call a REST API
- mcp_config: User wants MCP protocol configuration
How Query Intent Works
- Intent Detection: Analyze query to detect intent
- Intent Boosts: Apply boosts/penalties based on intent and service type
- Hierarchy Override: Check for per-service overrides in hierarchy data
- Config Fallback: Fall back to config file if no hierarchy override
Query Intent Detection
// Intent detection logic
function detectQueryIntent(query: string): QueryIntent {
const queryLower = query.toLowerCase();
// Documentation intent
if (queryLower.includes('documentation') ||
queryLower.includes('docs') ||
queryLower.includes('guide') ||
queryLower.includes('how to')) {
return 'documentation';
}
// Data query intent
if (queryLower.includes('list') ||
queryLower.includes('count') ||
queryLower.includes('select') ||
queryLower.includes('query database')) {
return 'data_query';
}
// API call intent
if (queryLower.includes('api') ||
queryLower.includes('endpoint') ||
queryLower.includes('call')) {
return 'api_call';
}
// MCP config intent
if (queryLower.includes('mcp') ||
queryLower.includes('model context') ||
queryLower.includes('workflow')) {
return 'mcp_config';
}
return 'documentation'; // Default
}
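Because the checks run in a fixed order, a query that contains phrases from several intents resolves to the first intent that matches. A table-driven variant makes that precedence explicit and easier to adjust. The sketch below is a restructuring for illustration, assuming the same phrases and order as the function above; `INTENT_RULES` is a name introduced here.

```typescript
type QueryIntent = "documentation" | "data_query" | "api_call" | "mcp_config";

// Rules are evaluated top to bottom; the first match wins.
const INTENT_RULES: Array<{ intent: QueryIntent; phrases: string[] }> = [
  { intent: "documentation", phrases: ["documentation", "docs", "guide", "how to"] },
  { intent: "data_query", phrases: ["list", "count", "select", "query database"] },
  { intent: "api_call", phrases: ["api", "endpoint", "call"] },
  { intent: "mcp_config", phrases: ["mcp", "model context", "workflow"] },
];

function detectQueryIntent(query: string): QueryIntent {
  const queryLower = query.toLowerCase();
  for (const rule of INTENT_RULES) {
    if (rule.phrases.some((phrase) => queryLower.indexOf(phrase) !== -1)) {
      return rule.intent;
    }
  }
  return "documentation"; // default when nothing matches
}
```

Two consequences are worth keeping in mind when tuning phrases: "list api endpoints" is classified as `data_query` because "list" is checked before "api", and substring matching means "account" triggers the "count" rule.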
Query Intent Boosts
Query intent boosts are applied after keyword boosts:
// Step 1: Try to get from hierarchy data (per-service override)
const serviceContext = getFieldAtLevel(rootNode, 'service_context');
const queryIntentBoosts = serviceContext?.query_intent_boosts;
if (queryIntentBoosts && queryIntent in queryIntentBoosts) {
intentBoostValue = queryIntentBoosts[queryIntent];
boostSource = 'hierarchy_data';
}
// Step 2: Fallback to config file
if (intentBoostValue === undefined) {
const boostConfig = configLoader.getBoostPenaltyConfig();
intentBoostValue = boostConfig.service_type_boosts.service_level[serviceType]?.[queryIntent];
boostSource = 'config_file';
}
// Step 3: Apply boost
if (intentBoostValue !== undefined) {
similarity = Math.min(1.0, similarity + intentBoostValue);
}
Query Intent Structure
// In hierarchy data (Level 0)
{
service_context: {
query_intent_boosts: {
documentation: 0.25, // Boost for documentation queries
data_query: -0.1, // Penalty for data queries
api_call: -0.15, // Penalty for API calls
mcp_config: 0.0 // Neutral for MCP config
}
}
}
Query Intent Impact
- Improves Accuracy: Boosts relevant services, penalizes irrelevant ones
- Service-Specific: Can override defaults per service
- Configurable: Can be set in hierarchy data or config file
- Applied After Keywords: Intent boosts are applied after keyword boosts
Example
// Query: "How do I troubleshoot network issues?"
// Intent: documentation
// Service: Document Search Service (SEARCH)
// Intent Boost: +0.25 (from hierarchy data)
// Result: similarity += 0.25
// Query: "List all servers"
// Intent: data_query
// Service: Document Search Service (SEARCH)
// Intent Boost: -0.1 (from hierarchy data)
// Result: similarity -= 0.1 (penalty)
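The override-then-fallback lookup can be sketched as a small pure function. Note that an explicit `0.0` in hierarchy data still counts as an override, which is why the check is against `undefined` rather than truthiness. `resolveIntentBoost` and `applyIntentBoost` are hypothetical names, and flooring the result at 0 is an assumption: the snippet above only caps at 1.0.

```typescript
type QueryIntent = "documentation" | "data_query" | "api_call" | "mcp_config";
type IntentBoosts = Partial<Record<QueryIntent, number>>;

interface ResolvedBoost {
  value: number;
  source: "hierarchy_data" | "config_file" | "none";
}

// Prefer the per-service override; fall back to the config file defaults.
function resolveIntentBoost(
  intent: QueryIntent,
  hierarchyBoosts: IntentBoosts | undefined,
  configBoosts: IntentBoosts | undefined
): ResolvedBoost {
  const fromHierarchy = hierarchyBoosts?.[intent];
  if (fromHierarchy !== undefined) return { value: fromHierarchy, source: "hierarchy_data" };
  const fromConfig = configBoosts?.[intent];
  if (fromConfig !== undefined) return { value: fromConfig, source: "config_file" };
  return { value: 0, source: "none" };
}

// Assumption: keep the adjusted score within [0, 1].
function applyIntentBoost(similarity: number, boost: ResolvedBoost): number {
  return Math.max(0, Math.min(1.0, similarity + boost.value));
}
```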
Context
What Is Context?
Context is additional information about a node that helps with semantic matching. It's stored in level-specific context fields:
- Level 0: service_context
- Level 1: category_context
- Level 2: collection_context
Context Fields
Service Context (Level 0)
{
service_context: {
detailed_description: string;
use_cases: Array<{
scenario: string;
example_queries: string[];
success_criteria?: string;
}>;
capabilities: {
what_it_can_do: string[];
what_it_cannot_do: string[];
strengths: string[];
limitations: string[];
};
domain_context: {
primary_domains: string[];
categories: string[];
subcategories: string[];
business_context: string;
};
query_patterns: {
good_matches: Array<{
pattern: string;
examples: string[];
confidence_boost?: number;
}>;
poor_matches: Array<{
pattern: string;
examples: string[];
reason: string;
}>;
};
routing_keywords: { ... };
query_intent_boosts: { ... };
}
}
Category Context (Level 1)
{
category_context: {
purpose: string;
description: string;
typical_queries: string[];
keywords: string[];
}
}
Collection Context (Level 2)
{
collection_context: {
purpose: string;
description: string;
document_types: string[];
content_characteristics: {
topics_covered: string[];
content_language: string[];
content_format: string[];
update_frequency: string;
coverage: string;
};
}
}
How Context Works
Context fields are included in embedding generation with specific weights:
// Level 0 embedding includes:
// - service_context.detailed_description (weight: 0.06)
// - service_context.use_cases (weight: 0.28) ← Highest weight
// - service_context.capabilities (weight: 0.18)
// - service_context.domain_context (weight: 0.20)
// - service_context.query_patterns (weight: varies)
// Level 1 embedding includes:
// - category_context.purpose (weight: 0.10)
// - category_context.typical_queries (weight: 0.25) ← Highest weight
// - category_context.keywords (weight: 0.22)
// Level 2 embedding includes:
// - collection_context.purpose (weight: 0.12)
// - collection_context.document_types (weight: 0.18)
// - collection_context.content_characteristics (weight: 0.20) ← Highest of these three (topics_covered has its own weight of 0.25)
Context Impact
- Improves Matching: Rich context improves semantic matching
- Weighted: Different context fields have different weights
- Level-Specific: Context structure varies by level
- Configurable: Can exclude context fields via ignore_fields
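One way to picture how context fields, weights, and `ignore_fields` interact is a step that turns a context object into weighted text segments ready for embedding. The sketch below is illustrative only: `buildSegments` and `WeightedSegment` are hypothetical, the field-to-weight mapping is passed in directly, and skipping fields that have no weight is an assumption.

```typescript
type FieldValue = string | string[];

interface WeightedSegment {
  field: string;
  text: string;
  weight: number;
}

// Collect one text segment per context field, honouring ignore_fields.
function buildSegments(
  fields: Record<string, FieldValue>,
  weights: Record<string, number>,
  ignoreFields: string[] = []
): WeightedSegment[] {
  const segments: WeightedSegment[] = [];
  for (const field of Object.keys(fields)) {
    if (ignoreFields.indexOf(field) !== -1) continue; // excluded by configuration
    const weight = weights[field];
    if (weight === undefined) continue; // assumption: unweighted fields do not contribute
    const value = fields[field];
    if (value === undefined) continue;
    const text = (typeof value === "string" ? value : value.join(" ")).trim();
    if (text.length > 0) segments.push({ field, text, weight });
  }
  return segments;
}
```

With the Level 1 weights listed above, a category whose `keywords` field is ignored would contribute only its `purpose` and `typical_queries` segments.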
Weights
What Are Weights?
Weights determine how much each field contributes to the embedding. Higher weights mean the field has more influence on matching.
Weight Structure
Weights are configured in Qdrant (tenant/app/level-specific) with fallback to config file:
// Level 0 (Service) weights
{
"Service": {
"example_queries_weight": 0.28, // Highest weight
"domain_match_weight": 0.20,
"capabilities_positive_weight": 0.18,
"service_type_keywords_weight": 0.15,
"description_weight": 0.10,
"context_weight": 0.06,
"capabilities_negative_weight": -0.18, // Negative weight
"name_weight": 0.03
}
}
// Level 1 (Category) weights
{
"Category": {
"typical_queries_weight": 0.25, // Highest weight
"keywords_weight": 0.22,
"domain_match_weight": 0.18,
"example_queries_weight": 0.15,
"purpose_weight": 0.10,
"description_weight": 0.08,
"context_weight": 0.04,
"name_weight": 0.02
}
}
// Level 2 (Collection) weights
{
"Collection": {
"topics_covered_weight": 0.25, // Highest weight
"content_characteristics_weight": 0.20,
"document_types_weight": 0.18,
"description_weight": 0.15,
"purpose_weight": 0.12,
"domain_match_weight": 0.08,
"context_weight": 0.05,
"name_weight": 0.02
}
}
How Weights Work
- Weight Resolution: Get weights for tenant/app/level (from Qdrant or config)
- Text Extraction: Extract text from each field
- Weight Application: Apply weights to each text segment
- Embedding Generation: Generate embeddings with weighted texts
- Combined Embedding: Create weighted combination of embeddings
Weight Flow
// Step 1: Resolve weights
const weightConfig = weightResolver.resolveWeights(node, parent, tenantId, appId, level);
// Step 2: Calculate weights for each field
const result = embeddingGenerator.calculateWeights(node, weightConfig);
// Returns: { texts: string[], weights: number[] }
// Step 3: Generate embeddings for each text
const embeddings = await embeddingService.batchEmbed(result.texts, 'document');
// Step 4: Create weighted combination
const combinedEmbedding = embeddingGenerator.createWeightedEmbedding(
embeddings,
result.weights
);
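The guide does not spell out how `createWeightedEmbedding` combines the vectors. A common approach, assumed here, is a weighted sum followed by L2 normalization, which also shows how a negative weight pushes the result away from a field's direction.

```typescript
// Sketch of a weighted combination: sum(weight_i * embedding_i), then normalize.
// This is an assumed implementation, not the service's actual code.
function createWeightedEmbedding(embeddings: number[][], weights: number[]): number[] {
  if (embeddings.length === 0 || embeddings.length !== weights.length) {
    throw new Error("embeddings and weights must be non-empty and of equal length");
  }
  const dimension = (embeddings[0] ?? []).length;
  const combined: number[] = [];
  for (let d = 0; d < dimension; d++) combined.push(0);

  embeddings.forEach((vector, i) => {
    if (vector.length !== dimension) {
      throw new Error("all embeddings must share one dimension");
    }
    const weight = weights[i] ?? 0;
    vector.forEach((component, d) => {
      combined[d] = (combined[d] ?? 0) + weight * component;
    });
  });

  // Normalize so cosine similarity against the query behaves consistently.
  const norm = Math.sqrt(combined.reduce((sum, c) => sum + c * c, 0));
  return norm === 0 ? combined : combined.map((c) => c / norm);
}
```

For example, combining `[1, 0]` and `[0, 1]` with weights `0.28` and `-0.18` yields a vector whose second component is negative, so nodes become less similar to queries that resemble the negatively weighted field.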
Weight Impact
- Field Importance: Higher weights = more influence on matching
- Negative Weights: Can reduce similarity (e.g., capabilities_negative_weight)
- Configurable: Weights can be overridden per tenant/app/level
- Dynamic: Weights are resolved at runtime
Example
// Node: Document Search Service
// Fields and weights:
// - use_cases: 0.28 (highest - most important)
// - domain_context: 0.20
// - capabilities: 0.18
// - description: 0.10
// - name: 0.03 (lowest - least important)
// Query: "How do I troubleshoot network issues?"
// Matching:
// - use_cases: High match (0.28 weight) → Strong contribution
// - domain_context: Medium match (0.20 weight) → Medium contribution
// - description: Low match (0.10 weight) → Weak contribution
// Result: High overall similarity due to use_cases match
Confidence Scores
What Is Confidence?
Confidence is a score (0.0 to 1.0) that indicates how confident the system is that a route is correct.
How Confidence Is Calculated
Confidence is calculated based on:
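- Similarity Score: Base confidence is the top match's weighted similarity score
- High Confidence Threshold: Bonuses apply only when the top score exceeds this threshold
- Clear Gap: Difference between the top and second match scores
- Clear Winner Bonus: Added when there is a single candidate or the gap exceeds the clear gap threshold
- Candidate Count: Fewer candidates earn a small boost (+0.05 for up to 3, +0.02 for up to 5)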
Confidence Calculation
function calculateConfidence(
results: SemanticMatchResult[],
level?: number,
tenantId?: string,
appId?: string
): number {
if (results.length === 0) return 0;
const topScore = results[0].weighted_score;
const confidenceConfig = configLoader.getConfidenceConfig(tenantId, appId, level);
const highConfidenceThreshold = confidenceConfig.confidence_calculation.high_confidence_threshold;
const clearGapThreshold = confidenceConfig.confidence_calculation.clear_gap_threshold;
const clearWinnerBonus = confidenceConfig.confidence_calculation.clear_winner_bonus;
// Base confidence is the similarity score
let confidence = topScore;
// Apply confidence calculation if similarity exceeds threshold
if (topScore > highConfidenceThreshold) {
// Boost if fewer candidates (more focused matching)
if (results.length <= 3) {
confidence = Math.min(1.0, confidence + 0.05);
} else if (results.length <= 5) {
confidence = Math.min(1.0, confidence + 0.02);
}
// Apply clear winner bonus if single candidate
if (results.length === 1) {
confidence = Math.min(1.0, confidence + clearWinnerBonus);
}
// Check gap between top and second
if (results.length > 1) {
const secondScore = results[1].weighted_score;
const gap = topScore - secondScore;
if (gap > clearGapThreshold) {
confidence = Math.min(1.0, confidence + clearWinnerBonus);
}
}
}
return confidence;
}
Confidence Configuration
Confidence thresholds are configurable per tenant/app/level:
{
confidence_calculation: {
high_confidence_threshold: 0.7, // Minimum for high confidence
clear_gap_threshold: 0.15, // Gap between top and second
clear_winner_bonus: 0.1 // Bonus for clear winner
},
confidence_thresholds: {
level_0: 0.2, // Minimum confidence for Level 0
level_1: 0.3, // Minimum confidence for Level 1
level_2: 0.4 // Minimum confidence for Level 2
}
}
Confidence Impact
- Routing Decision: Routes below threshold are rejected
- Level-Specific: Different thresholds for different levels
- Configurable: Thresholds can be overridden per tenant/app/level
- Dynamic: Confidence can degrade over time (cache)
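The routing decision reduces to comparing the confidence with the minimum for the current level. A minimal sketch, assuming the `confidence_thresholds` shape shown above and treating a score equal to the threshold as accepted (only routes below it are rejected); `meetsThreshold` is a hypothetical helper name.

```typescript
interface ConfidenceThresholds {
  level_0: number;
  level_1: number;
  level_2: number;
}

type Level = 0 | 1 | 2;

// Accept the route unless its confidence is below the level's minimum.
function meetsThreshold(confidence: number, level: Level, thresholds: ConfidenceThresholds): boolean {
  const minimum =
    level === 0 ? thresholds.level_0 : level === 1 ? thresholds.level_1 : thresholds.level_2;
  return confidence >= minimum;
}
```

With the example values, a confidence of 0.35 passes at Levels 0 and 1 but is rejected at Level 2, so deeper levels demand a stronger match.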
Example
// Query: "How do I troubleshoot network issues?"
// Matches:
// 1. Network Troubleshooting: 0.90 (top)
// 2. Login Troubleshooting: 0.60 (second)
// Gap: 0.30 (> 0.15 threshold)
// High confidence threshold: 0.70 (< 0.90)
// Candidate count: 2 (<= 3) → +0.05
// Clear winner bonus: +0.1
// Result: confidence = min(1.0, 0.90 + 0.05 + 0.1) = 1.0 (capped at 1.0)
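To check such examples by hand, the calculation can be reproduced as a standalone function over plain scores. This sketch mirrors the logic of `calculateConfidence` above but takes the configuration as an argument instead of calling `configLoader`; `confidenceFromScores` is a name introduced here, and the scores are assumed to be sorted in descending order.

```typescript
interface ConfidenceCalculation {
  high_confidence_threshold: number;
  clear_gap_threshold: number;
  clear_winner_bonus: number;
}

// scores: weighted scores of the candidates, best first.
function confidenceFromScores(scores: number[], config: ConfidenceCalculation): number {
  const topScore = scores[0];
  if (topScore === undefined) return 0;

  let confidence = topScore;
  if (topScore > config.high_confidence_threshold) {
    // Small boost when few candidates compete.
    if (scores.length <= 3) {
      confidence = Math.min(1.0, confidence + 0.05);
    } else if (scores.length <= 5) {
      confidence = Math.min(1.0, confidence + 0.02);
    }
    // A single candidate is a clear winner by definition.
    if (scores.length === 1) {
      confidence = Math.min(1.0, confidence + config.clear_winner_bonus);
    }
    // Otherwise the bonus requires a clear gap to the runner-up.
    const secondScore = scores[1];
    if (secondScore !== undefined && topScore - secondScore > config.clear_gap_threshold) {
      confidence = Math.min(1.0, confidence + config.clear_winner_bonus);
    }
  }
  return confidence;
}
```

Scores at or below the high confidence threshold are returned unchanged, so a top score of 0.65 stays at 0.65 regardless of the gap or candidate count.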
Complete Routing Example
Query: "How do I troubleshoot network issues?"
Step 1: Query Embedding
// Original query
query = "How do I troubleshoot network issues?"
// Extract synonyms from hierarchy
synonyms = {
"how do i": "how to how can i how should i",
"troubleshoot": "troubleshooting fix resolve debug diagnose repair"
}
// Expand query
expandedQuery = "How do I troubleshoot network issues? how to how can i how should i troubleshooting fix resolve debug diagnose repair network issues?"
// Generate embedding
queryEmbedding = await embeddingService.embed({
text: expandedQuery,
text_type: 'query',
synonyms: synonyms
});
Step 2: Root Node Matching
// Match against root nodes
matches = [
{ node: documentSearchService, similarity: 0.75 },
{ node: sqlDatabaseService, similarity: 0.30 },
{ node: restApiService, similarity: 0.25 }
];
// Apply keyword boost
// Query contains "troubleshoot" → boost keyword for document-search-service
documentSearchService.similarity += 0.2; // 0.75 → 0.95
// Apply query intent boost
// Intent: documentation
// document-search-service: +0.25
documentSearchService.similarity += 0.25; // 0.95 → 1.0 (capped)
// Re-embed with service synonyms
serviceSynonyms = extractSynonymsFromNode(documentSearchService);
serviceEmbedding = await embeddingService.embed({
text: query,
synonyms: serviceSynonyms
});
// Re-calculate similarity: 0.98
// Select best match
selectedRoot = documentSearchService; // similarity: 0.98
Step 3: Category Matching
// Get children of document-search-service
categories = [
troubleshootingCategory,
policyDocumentsCategory
];
// Match against categories
categoryMatches = [
{ node: troubleshootingCategory, similarity: 0.85 },
{ node: policyDocumentsCategory, similarity: 0.40 }
];
// Select best match
selectedCategory = troubleshootingCategory; // similarity: 0.85
Step 4: Collection Matching
// Get children of troubleshooting-category
collections = [
networkTroubleshooting,
loginTroubleshooting
];
// Match against collections
collectionMatches = [
{ node: networkTroubleshooting, similarity: 0.90 },
{ node: loginTroubleshooting, similarity: 0.60 }
];
// Calculate confidence
confidence = calculateConfidence(collectionMatches);
// topScore: 0.90 (> 0.70 high confidence threshold)
// candidate count: 2 (<= 3) → +0.05
// gap: 0.30 (> 0.15 threshold)
// clearWinnerBonus: +0.1
// confidence: min(1.0, 0.90 + 0.05 + 0.1) = 1.0 (capped)
// Select best match
selectedCollection = networkTroubleshooting; // similarity: 0.90, confidence: 1.0
Step 5: Return Route
return {
routes: [{
source_id: "network-troubleshooting",
source_type: "vector_db",
service_type: "SEARCH",
source_name: "Network Troubleshooting",
connection_info: { ... },
confidence_score: 1.0,
ranking_score: 0.90
}],
primary_route: { ... },
confidence: 1.0,
decision_path: "HIERARCHY_TRAVERSAL"
};
Summary
Synonyms
- Purpose: Expand queries to improve matching
- Location: Node level (query_synonyms)
- Impact: 15-25% confidence boost
- Flow: Extract → Merge → Expand → Re-embed
Keywords
- Purpose: Boost/penalty based on query content
- Location: service_context.routing_keywords (Level 0)
- Impact: ±0.2 to ±0.3 similarity adjustment
- Flow: Detect → Boost/Penalty → Cross-service penalty
Query Intent
- Purpose: Boost/penalty based on query intent
- Location: service_context.query_intent_boosts (Level 0)
- Impact: ±0.1 to ±0.25 similarity adjustment
- Flow: Detect → Override → Apply boost
Context
- Purpose: Rich information for semantic matching
- Location: Level-specific context fields
- Impact: Improves embedding quality
- Flow: Extract → Weight → Embed
Weights
- Purpose: Control field importance in embeddings
- Location: Qdrant config (tenant/app/level)
- Impact: Determines which fields matter most
- Flow: Resolve → Apply → Combine
Confidence Scores
- Purpose: Indicate route correctness
- Location: Calculated from similarity
- Impact: Determines if route is accepted
- Flow: Calculate → Threshold → Decision
Best Practices
- Synonyms: Add comprehensive synonyms for common query variations
- Keywords: Use boost keywords for service-specific terms, penalty keywords for irrelevant terms
- See Configuration Guide - Routing Keywords for detailed instructions
- Query Intent: Set appropriate boosts/penalties for each intent type
- Context: Provide rich context in descriptions and use cases
- Weights: Adjust weights based on which fields are most important
- See Configuration Guide - Embedding Weights for detailed instructions
Related Documentation
- Configuration Guide - How to update keywords, weights, and confidence settings
- Test Script Guide - How to test routing changes
- Hierarchy Data Guide - Hierarchy structure details