Routing Mechanics Guide

Overview

This guide explains how the Haya Routing Service determines the best route for a user query. It covers:

Synonyms: How query expansion works
Keywords: How keyword-based routing works
Query Intent: How intent detection and boosting works
Context: How context fields impact matching
Weights: How field weights affect embeddings
Confidence Scores: How confidence is calculated

Routing Flow

User Query
    ↓
1. Generate Query Embedding (with synonyms)
    ↓
2. Match Root Nodes (Level 0)
    ├─ Semantic Similarity
    ├─ Keyword Boosts/Penalties
    ├─ Query Intent Boosts
    └─ Synonym Re-embedding
    ↓
3. Select Best Root Node
    ↓
4. Traverse to Children (Level 1)
    ├─ Semantic Matching
    └─ Access Control
    ↓
5. Traverse to Collections (Level 2)
    ├─ Semantic Matching
    ├─ Access Control
    └─ Final Route Selection
    ↓
6. Return Route with Confidence Score
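
The flow above amounts to a greedy top-down walk: score the candidates at one level, keep the best one, then repeat on its children. Below is a minimal sketch of that walk. The RouteNode shape, the traverseHierarchy name, and the score and canAccess callbacks are hypothetical and not taken from the service; keyword and intent boosts are assumed to be folded into score.

```typescript
// Hypothetical node shape, for illustration only.
interface RouteNode {
  id: string;
  children: RouteNode[];
}

interface TraversalStep {
  nodeId: string;
  level: number;
  similarity: number;
}

// Walk the hierarchy level by level, keeping the best-scoring node each time.
function traverseHierarchy(
  roots: RouteNode[],
  score: (node: RouteNode, level: number) => number, // assumed: similarity after boosts
  canAccess: (node: RouteNode) => boolean             // assumed: access-control check
): TraversalStep[] {
  const path: TraversalStep[] = [];
  let candidates = roots;
  let level = 0;

  while (candidates.length > 0) {
    let best: RouteNode | undefined;
    let bestScore = -Infinity;
    for (const node of candidates) {
      // The flow lists access control only for Levels 1 and 2, so roots are not filtered.
      if (level > 0 && !canAccess(node)) continue;
      const similarity = score(node, level);
      if (similarity > bestScore) {
        best = node;
        bestScore = similarity;
      }
    }
    if (best === undefined) break; // nothing accessible at this level
    path.push({ nodeId: best.id, level, similarity: bestScore });
    candidates = best.children;
    level += 1;
  }
  return path;
}
```

The walk never backtracks: if the best root has no accessible children, the path simply ends early. Whether the real service retries with the second-best root is not stated in this guide.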

Synonyms

What Are Synonyms?

Synonyms are word/phrase mappings that expand queries to improve semantic matching. They're stored at the node level (not in context) and are level-based.

Synonym Structure

{
  query_synonyms: {
    "show me": "find search locate get retrieve display list",
    "how do i": "how to how can i how should i",
    "troubleshoot": "troubleshooting fix resolve debug diagnose repair",
    "documentation": "docs documents files records"
  }
}

How Synonyms Work

Extraction: Synonyms are extracted from nodes in the hierarchy path
Merging: Synonyms from all nodes in the path are merged
Query Expansion: Query is expanded with synonyms before embedding
Re-embedding: Query is re-embedded with service-specific synonyms for better matching

Synonym Flow

// Step 1: Extract synonyms from hierarchy path
const synonyms = mergeSynonymsFromPath(hierarchyPath);

// Step 2: Expand query
const expandedQuery = expandQuery(query, synonyms);
// "show me servers" → "show me servers find search locate get retrieve display list servers"

// Step 3: Generate embedding with expanded query
const embedding = await embeddingService.embed({
  text: expandedQuery,
  text_type: 'query',
  synonyms: synonyms
});

// Step 4: Re-embed with service-specific synonyms for better matching
const serviceSynonyms = extractSynonymsFromNode(rootNode);
const serviceEmbedding = await embeddingService.embed({
  text: query,
  text_type: 'query',
  synonyms: serviceSynonyms
});
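
The flow above calls mergeSynonymsFromPath and expandQuery without showing them. The sketch below is one possible implementation that reproduces the expansions shown in this guide; it is not the service's actual code. The SynonymNode shape, the escapeRegExp helper, and the rule for merging duplicate phrases are assumptions.

```typescript
type SynonymMap = Record<string, string>;

// Hypothetical node shape: only the query_synonyms field shown in this guide.
interface SynonymNode {
  query_synonyms?: SynonymMap;
}

// Escape regex metacharacters so a phrase can be matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Merge synonyms from every node on the hierarchy path.
function mergeSynonymsFromPath(path: SynonymNode[]): SynonymMap {
  const merged: SynonymMap = {};
  for (const node of path) {
    const synonyms = node.query_synonyms ?? {};
    for (const phrase of Object.keys(synonyms)) {
      const key = phrase.toLowerCase();
      const expansion = synonyms[phrase] ?? "";
      const existing = merged[key];
      // Assumption: when two nodes define the same phrase, keep both expansions.
      merged[key] =
        existing === undefined || existing === expansion
          ? expansion
          : `${existing} ${expansion}`;
    }
  }
  return merged;
}

// Append the synonyms of every matched phrase, then the unmatched remainder of the query.
function expandQuery(query: string, synonyms: SynonymMap): string {
  const lower = query.toLowerCase();
  const expansions: string[] = [];
  let remainder = query;
  for (const phrase of Object.keys(synonyms)) {
    if (lower.includes(phrase.toLowerCase())) {
      expansions.push(synonyms[phrase] ?? "");
      remainder = remainder.replace(new RegExp(escapeRegExp(phrase), "ig"), " ");
    }
  }
  if (expansions.length === 0) return query; // nothing matched, leave the query alone
  const rest = remainder.replace(/\s+/g, " ").trim();
  return [query, ...expansions, rest].filter((part) => part.length > 0).join(" ");
}

const exampleSynonyms = mergeSynonymsFromPath([
  { query_synonyms: { "show me": "find search locate get retrieve display list" } }
]);
console.log(expandQuery("show me servers", exampleSynonyms));
// "show me servers find search locate get retrieve display list servers"
```

Matching is by case-insensitive substring, so a short phrase can also match inside a longer word; a word-boundary match would be stricter.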

Synonym Impact

Improves Matching: Expands query to match more variations
Context-Aware: Uses synonyms from the hierarchy path
Service-Specific: Re-embeds with service synonyms for better root node matching
Confidence Boost: Can increase confidence by 15-25%

Example

// Query: "show me servers"
// Synonyms: { "show me": "find search locate get retrieve display list" }
// Expanded: "show me servers find search locate get retrieve display list servers"
// Result: Better match with "List all servers" in node description

Keywords

What Are Keywords?

Keywords are boost/penalty terms that adjust similarity scores based on query content. They're stored in service_context.routing_keywords (Level 0 only).

Keyword Structure

{
  routing_keywords: {
    boost_keywords: ["search", "find", "documentation", "docs"],
    penalty_keywords: ["api", "endpoint", "sql", "database"],
    boost_value: 0.2,      // Default: 0.3
    penalty_value: -0.15    // Default: -0.2
  }
}

How Keywords Work

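Keyword Detection: Check whether the query contains any boost or penalty keywords (case-insensitive substring match)
Boost Application: Add boost_value to similarity if a boost keyword is found (capped at 1.0)
Penalty Application: Add penalty_value, which is negative, to similarity if a penalty keyword is found (floored at 0)
Cross-Service Penalty: Apply penalty_value if the query contains another service's boost keyword that is not also a boost keyword of the current service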

Keyword Flow

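// Step 1: Get routing keywords from service context
const routingKeywords = serviceContext?.routing_keywords;
const boostKeywords: string[] = routingKeywords?.boost_keywords ?? [];
const penaltyKeywords: string[] = routingKeywords?.penalty_keywords ?? [];
const boostValue: number = routingKeywords?.boost_value ?? 0.3;      // Default: 0.3
const penaltyValue: number = routingKeywords?.penalty_value ?? -0.2; // Default: -0.2 (negative)
const queryLower = query.toLowerCase();

// Step 2: Check for boost keywords
const hasBoostKeywords = boostKeywords.some(keyword =>
  queryLower.includes(keyword.toLowerCase())
);

// Step 3: Apply boost (capped at 1.0)
if (hasBoostKeywords) {
  similarity = Math.min(1.0, similarity + boostValue);
}

// Step 4: Check for penalty keywords
const hasPenaltyKeywords = penaltyKeywords.some(keyword =>
  queryLower.includes(keyword.toLowerCase())
);

// Step 5: Apply penalty (penaltyValue is negative, so adding it lowers the score)
if (hasPenaltyKeywords) {
  similarity = Math.max(0, similarity + penaltyValue);
}

// Step 6: Cross-service penalty
// Penalize once per other service whose boost keyword appears in the query,
// unless that keyword is also a boost keyword of the current service
for (const otherService of otherServices) {
  const hasForeignKeyword = otherService.boostKeywords.some(k =>
    queryLower.includes(k.toLowerCase()) &&
    !currentService.boostKeywords.includes(k)
  );
  if (hasForeignKeyword) {
    similarity = Math.max(0, similarity + penaltyValue);
  }
}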

Keyword Impact

Boosts Matching: Increases similarity when relevant keywords are found
Prevents Mismatches: Decreases similarity when irrelevant keywords are found
Cross-Service Logic: Prevents SQL from matching when REST/MCP keywords are present
Configurable: Boost/penalty values are configurable per service

Example

// Query: "search for documentation"
// Service: Document Search Service
// Boost Keywords: ["search", "documentation"]
// Result: similarity += 0.2 (boost applied)

// Query: "query database using sql"
// Service: Document Search Service
// Penalty Keywords: ["sql", "database", "query"]
// Result: similarity -= 0.15 (penalty applied)
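
The two examples above can be checked with a small pure function. This is a sketch under assumptions: applyKeywordAdjustments and the RoutingKeywords interface are illustrative names, the starting similarity of 0.5 is made up, and the cross-service penalty is left out.

```typescript
interface RoutingKeywords {
  boost_keywords: string[];
  penalty_keywords: string[];
  boost_value?: number;   // default 0.3 per this guide
  penalty_value?: number; // default -0.2 per this guide (negative)
}

// Apply the boost and penalty steps to a base similarity score.
function applyKeywordAdjustments(
  query: string,
  similarity: number,
  keywords: RoutingKeywords
): number {
  const queryLower = query.toLowerCase();
  const boostValue = keywords.boost_value ?? 0.3;
  const penaltyValue = keywords.penalty_value ?? -0.2;
  const containsAny = (words: string[]): boolean =>
    words.some((word) => queryLower.includes(word.toLowerCase()));

  let adjusted = similarity;
  if (containsAny(keywords.boost_keywords)) {
    adjusted = Math.min(1.0, adjusted + boostValue); // cap at 1.0
  }
  if (containsAny(keywords.penalty_keywords)) {
    adjusted = Math.max(0, adjusted + penaltyValue); // penaltyValue is negative; floor at 0
  }
  return adjusted;
}

// Values from the Keyword Structure example in this guide.
const docSearchKeywords: RoutingKeywords = {
  boost_keywords: ["search", "find", "documentation", "docs"],
  penalty_keywords: ["api", "endpoint", "sql", "database"],
  boost_value: 0.2,
  penalty_value: -0.15
};

// Boost only: 0.5 + 0.2
console.log(applyKeywordAdjustments("search for documentation", 0.5, docSearchKeywords));
// Penalty only: 0.5 - 0.15
console.log(applyKeywordAdjustments("query database using sql", 0.5, docSearchKeywords));
```

A query that contains both kinds of keyword receives both adjustments, boost first, which matches the step order in the flow above.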

Query Intent

What Is Query Intent?

Query intent is the detected purpose of a query. The router uses it to boost services that suit that purpose and to penalize services that do not.

Query Intent Types

documentation: User wants documentation/search results
data_query: User wants to query a database
api_call: User wants to call a REST API
mcp_config: User wants MCP protocol configuration

How Query Intent Works

Intent Detection: Analyze query to detect intent
Intent Boosts: Apply boosts/penalties based on intent and service type
Hierarchy Override: Check for per-service overrides in hierarchy data
Config Fallback: Fall back to config file if no hierarchy override

Query Intent Detection

// Intent detection logic
function detectQueryIntent(query: string): QueryIntent {
  const queryLower = query.toLowerCase();
  
  // Documentation intent
  if (queryLower.includes('documentation') || 
      queryLower.includes('docs') ||
      queryLower.includes('guide') ||
      queryLower.includes('how to')) {
    return 'documentation';
  }
  
  // Data query intent
  if (queryLower.includes('list') ||
      queryLower.includes('count') ||
      queryLower.includes('select') ||
      queryLower.includes('query database')) {
    return 'data_query';
  }
  
  // API call intent
  if (queryLower.includes('api') ||
      queryLower.includes('endpoint') ||
      queryLower.includes('call')) {
    return 'api_call';
  }
  
  // MCP config intent
  if (queryLower.includes('mcp') ||
      queryLower.includes('model context') ||
      queryLower.includes('workflow')) {
    return 'mcp_config';
  }
  
  return 'documentation'; // Default
}

Query Intent Boosts

Query intent boosts are applied after keyword boosts:

// Resolved boost value and where it came from
let intentBoostValue: number | undefined;
let boostSource: 'hierarchy_data' | 'config_file' | undefined;

// Step 1: Try to get from hierarchy data (per-service override)
const serviceContext = getFieldAtLevel(rootNode, 'service_context');
const queryIntentBoosts = serviceContext?.query_intent_boosts;

if (queryIntentBoosts && queryIntent in queryIntentBoosts) {
  intentBoostValue = queryIntentBoosts[queryIntent];
  boostSource = 'hierarchy_data';
}

// Step 2: Fallback to config file
if (intentBoostValue === undefined) {
  const boostConfig = configLoader.getBoostPenaltyConfig();
  intentBoostValue = boostConfig.service_type_boosts.service_level[serviceType]?.[queryIntent];
  boostSource = 'config_file';
}

// Step 3: Apply boost (a negative value acts as a penalty)
if (intentBoostValue !== undefined) {
  similarity = Math.min(1.0, similarity + intentBoostValue);
}

Query Intent Structure

// In hierarchy data (Level 0)
{
  service_context: {
    query_intent_boosts: {
      documentation: 0.25,    // Boost for documentation queries
      data_query: -0.1,       // Penalty for data queries
      api_call: -0.15,        // Penalty for API calls
      mcp_config: 0.0         // Neutral for MCP config
    }
  }
}

Query Intent Impact

Improves Accuracy: Boosts relevant services, penalizes irrelevant ones
Service-Specific: Can override defaults per service
Configurable: Can be set in hierarchy data or config file
Applied After Keywords: Intent boosts are applied after keyword boosts

Example

// Query: "How do I troubleshoot network issues?"
// Intent: documentation
// Service: Document Search Service (SEARCH)
// Intent Boost: +0.25 (from hierarchy data)
// Result: similarity += 0.25

// Query: "List all servers"
// Intent: data_query
// Service: Document Search Service (SEARCH)
// Intent Boost: -0.1 (from hierarchy data)
// Result: similarity -= 0.1 (penalty)
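
The override-then-fallback lookup can be isolated into a small function, which makes the precedence easy to test. This is a sketch: resolveIntentBoost, applyIntentBoost, and the flat configBoosts argument are illustrative stand-ins for the hierarchy and config lookups, not the service's API.

```typescript
type QueryIntent = "documentation" | "data_query" | "api_call" | "mcp_config";
type IntentBoosts = Partial<Record<QueryIntent, number>>;

interface ResolvedBoost {
  value: number | undefined;
  source: "hierarchy_data" | "config_file" | "none";
}

// Prefer the per-service override; fall back to the config file value.
function resolveIntentBoost(
  intent: QueryIntent,
  hierarchyBoosts: IntentBoosts | undefined,
  configBoosts: IntentBoosts | undefined
): ResolvedBoost {
  const fromHierarchy = hierarchyBoosts?.[intent];
  // An explicit 0.0 is a valid override, so compare against undefined, not truthiness.
  if (fromHierarchy !== undefined) {
    return { value: fromHierarchy, source: "hierarchy_data" };
  }
  const fromConfig = configBoosts?.[intent];
  if (fromConfig !== undefined) {
    return { value: fromConfig, source: "config_file" };
  }
  return { value: undefined, source: "none" };
}

// Same arithmetic as Step 3 above: add the boost and cap at 1.0.
function applyIntentBoost(similarity: number, boost: number | undefined): number {
  return boost === undefined ? similarity : Math.min(1.0, similarity + boost);
}

// Values from the Query Intent Structure example in this guide.
const docSearchIntentBoosts: IntentBoosts = {
  documentation: 0.25,
  data_query: -0.1,
  api_call: -0.15,
  mcp_config: 0.0
};

const resolved = resolveIntentBoost("documentation", docSearchIntentBoosts, { documentation: 0.1 });
console.log(resolved.source); // "hierarchy_data"
```

Checking for undefined rather than truthiness matters here: the neutral mcp_config value of 0.0 must still count as an override and stop the fallback.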

Context

What Is Context?

Context is additional information about a node that helps with semantic matching. It's stored in level-specific context fields:

Level 0: service_context
Level 1: category_context
Level 2: collection_context

Context Fields

Service Context (Level 0)

{
  service_context: {
    detailed_description: string;
    use_cases: Array<{
      scenario: string;
      example_queries: string[];
      success_criteria?: string;
    }>;
    capabilities: {
      what_it_can_do: string[];
      what_it_cannot_do: string[];
      strengths: string[];
      limitations: string[];
    };
    domain_context: {
      primary_domains: string[];
      categories: string[];
      subcategories: string[];
      business_context: string;
    };
    query_patterns: {
      good_matches: Array<{
        pattern: string;
        examples: string[];
        confidence_boost?: number;
      }>;
      poor_matches: Array<{
        pattern: string;
        examples: string[];
        reason: string;
      }>;
    };
    routing_keywords: { ... };
    query_intent_boosts: { ... };
  }
}

Category Context (Level 1)

{
  category_context: {
    purpose: string;
    description: string;
    typical_queries: string[];
    keywords: string[];
  }
}

Collection Context (Level 2)

{
  collection_context: {
    purpose: string;
    description: string;
    document_types: string[];
    content_characteristics: {
      topics_covered: string[];
      content_language: string[];
      content_format: string[];
      update_frequency: string;
      coverage: string;
    };
  }
}

How Context Works

Context fields are included in embedding generation with specific weights:

// Level 0 embedding includes:
// - service_context.detailed_description (weight: 0.06)
// - service_context.use_cases (weight: 0.28) ← Highest weight
// - service_context.capabilities (weight: 0.18)
// - service_context.domain_context (weight: 0.20)
// - service_context.query_patterns (weight: varies)

// Level 1 embedding includes:
// - category_context.purpose (weight: 0.10)
// - category_context.typical_queries (weight: 0.25) ← Highest weight
// - category_context.keywords (weight: 0.22)

// Level 2 embedding includes:
// - collection_context.purpose (weight: 0.12)
// - collection_context.document_types (weight: 0.18)
// - collection_context.content_characteristics (weight: 0.20)
// - collection_context.content_characteristics.topics_covered (weight: 0.25) ← Highest weight

Context Impact

Improves Matching: Rich context improves semantic matching
Weighted: Different context fields have different weights
Level-Specific: Context structure varies by level
Configurable: Can exclude context fields via ignore_fields

Weights

What Are Weights?

Weights determine how much each field contributes to the embedding. Higher weights mean the field has more influence on matching.

Weight Structure

Weights are configured in Qdrant (tenant/app/level-specific) with fallback to config file:

// Level 0 (Service) weights
{
  "Service": {
    "example_queries_weight": 0.28,      // Highest weight
    "domain_match_weight": 0.20,
    "capabilities_positive_weight": 0.18,
    "service_type_keywords_weight": 0.15,
    "description_weight": 0.10,
    "context_weight": 0.06,
    "capabilities_negative_weight": -0.18,  // Negative weight
    "name_weight": 0.03
  }
}

// Level 1 (Category) weights
{
  "Category": {
    "typical_queries_weight": 0.25,      // Highest weight
    "keywords_weight": 0.22,
    "domain_match_weight": 0.18,
    "example_queries_weight": 0.15,
    "purpose_weight": 0.10,
    "description_weight": 0.08,
    "context_weight": 0.04,
    "name_weight": 0.02
  }
}

// Level 2 (Collection) weights
{
  "Collection": {
    "topics_covered_weight": 0.25,       // Highest weight
    "content_characteristics_weight": 0.20,
    "document_types_weight": 0.18,
    "description_weight": 0.15,
    "purpose_weight": 0.12,
    "domain_match_weight": 0.08,
    "context_weight": 0.05,
    "name_weight": 0.02
  }
}

How Weights Work

Weight Resolution: Get weights for tenant/app/level (from Qdrant or config)
Text Extraction: Extract text from each field
Weight Application: Apply weights to each text segment
Embedding Generation: Generate embeddings with weighted texts
Combined Embedding: Create weighted combination of embeddings

Weight Flow

// Step 1: Resolve weights
const weightConfig = weightResolver.resolveWeights(node, parent, tenantId, appId, level);

// Step 2: Calculate weights for each field
const result = embeddingGenerator.calculateWeights(node, weightConfig);
// Returns: { texts: string[], weights: number[] }

// Step 3: Generate embeddings for each text
const embeddings = await embeddingService.batchEmbed(result.texts, 'document');

// Step 4: Create weighted combination
const combinedEmbedding = embeddingGenerator.createWeightedEmbedding(
  embeddings,
  result.weights
);
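
Step 4 does not show how the per-field embeddings are combined. One common approach, assumed here and not confirmed by this guide, is a weighted sum of the vectors followed by L2 normalization. The function below reuses the createWeightedEmbedding name for readability only.

```typescript
// Combine per-field embeddings into one vector: weighted sum, then L2-normalize.
// Assumption: the real service may combine or normalize differently.
function createWeightedEmbedding(embeddings: number[][], weights: number[]): number[] {
  const first = embeddings[0];
  if (first === undefined || embeddings.length !== weights.length) {
    throw new Error("embeddings and weights must be non-empty and the same length");
  }
  const dimension = first.length;
  const combined: number[] = new Array<number>(dimension).fill(0);

  embeddings.forEach((vector, index) => {
    if (vector.length !== dimension) {
      throw new Error("all embeddings must have the same dimension");
    }
    const weight = weights[index] ?? 0;
    vector.forEach((value, d) => {
      // A negative weight subtracts that field's direction from the result.
      combined[d] = (combined[d] ?? 0) + weight * value;
    });
  });

  const norm = Math.sqrt(combined.reduce((sum, value) => sum + value * value, 0));
  return norm === 0 ? combined : combined.map((value) => value / norm);
}

// Two toy 2-dimensional "embeddings" with weights 0.75 and 0.25.
const toyCombined = createWeightedEmbedding([[1, 0], [0, 1]], [0.75, 0.25]);
console.log(toyCombined); // unit-length vector that leans toward the first field
```

Under this scheme a negative weight such as capabilities_negative_weight pushes the combined vector away from that field's text, which is one way the "can reduce similarity" behavior could arise.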

Weight Impact

Field Importance: Higher weights = more influence on matching
Negative Weights: Can reduce similarity (e.g., capabilities_negative_weight)
Configurable: Weights can be overridden per tenant/app/level
Dynamic: Weights are resolved at runtime

Example

// Node: Document Search Service
// Fields and weights:
// - use_cases: 0.28 (highest - most important)
// - domain_context: 0.20
// - capabilities: 0.18
// - description: 0.10
// - name: 0.03 (lowest - least important)

// Query: "How do I troubleshoot network issues?"
// Matching:
// - use_cases: High match (0.28 weight) → Strong contribution
// - domain_context: Medium match (0.20 weight) → Medium contribution
// - description: Low match (0.10 weight) → Weak contribution
// Result: High overall similarity due to use_cases match

Confidence Scores

What Is Confidence?

Confidence is a score (0.0 to 1.0) that indicates how confident the system is that a route is correct.

How Confidence Is Calculated

Confidence is calculated based on:

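Similarity Score: Base confidence is the top match's weighted similarity score
High Confidence Threshold: The bonuses below apply only when the top score exceeds this threshold
Clear Gap: Difference between the top and second match, compared against the clear gap threshold
Clear Winner Bonus: Added when there is a single candidate or when the gap exceeds the clear gap threshold
Candidate Count: Fewer candidates = higher confidence (+0.05 for up to 3 candidates, +0.02 for up to 5)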

Confidence Calculation

function calculateConfidence(
  results: SemanticMatchResult[],
  level?: number,
  tenantId?: string,
  appId?: string
): number {
  if (results.length === 0) return 0;
  
  const topScore = results[0].weighted_score;
  const confidenceConfig = configLoader.getConfidenceConfig(tenantId, appId, level);
  
  const highConfidenceThreshold = confidenceConfig.confidence_calculation.high_confidence_threshold;
  const clearGapThreshold = confidenceConfig.confidence_calculation.clear_gap_threshold;
  const clearWinnerBonus = confidenceConfig.confidence_calculation.clear_winner_bonus;
  
  // Base confidence is the similarity score
  let confidence = topScore;
  
  // Apply confidence calculation if similarity exceeds threshold
  if (topScore > highConfidenceThreshold) {
    // Boost if fewer candidates (more focused matching)
    if (results.length <= 3) {
      confidence = Math.min(1.0, confidence + 0.05);
    } else if (results.length <= 5) {
      confidence = Math.min(1.0, confidence + 0.02);
    }
    
    // Apply clear winner bonus if single candidate
    if (results.length === 1) {
      confidence = Math.min(1.0, confidence + clearWinnerBonus);
    }
    
    // Check gap between top and second
    if (results.length > 1) {
      const secondScore = results[1].weighted_score;
      const gap = topScore - secondScore;
      
      if (gap > clearGapThreshold) {
        confidence = Math.min(1.0, confidence + clearWinnerBonus);
      }
    }
  }
  
  return confidence;
}

Confidence Configuration

Confidence thresholds are configurable per tenant/app/level:

{
  confidence_calculation: {
    high_confidence_threshold: 0.7,    // Minimum for high confidence
    clear_gap_threshold: 0.15,          // Gap between top and second
    clear_winner_bonus: 0.1            // Bonus for clear winner
  },
  confidence_thresholds: {
    level_0: 0.2,  // Minimum confidence for Level 0
    level_1: 0.3,  // Minimum confidence for Level 1
    level_2: 0.4   // Minimum confidence for Level 2
  }
}

Confidence Impact

Routing Decision: Routes below threshold are rejected
Level-Specific: Different thresholds for different levels
Configurable: Thresholds can be overridden per tenant/app/level
Dynamic: Confidence can degrade over time (cache)
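
The rejection rule can be sketched as a lookup against the per-level minimums. The isRouteAccepted name and the behavior for an unconfigured level are assumptions; the threshold values are copied from the configuration example in this guide.

```typescript
// Per-level minimum confidence, from the configuration example in this guide.
const confidenceThresholds: Record<string, number> = {
  level_0: 0.2,
  level_1: 0.3,
  level_2: 0.4
};

// A route is kept only if its confidence reaches the minimum for its level.
function isRouteAccepted(
  confidence: number,
  level: number,
  thresholds: Record<string, number> = confidenceThresholds
): boolean {
  const minimum = thresholds[`level_${level}`];
  // Assumption: a level with no configured threshold accepts any route.
  if (minimum === undefined) return true;
  return confidence >= minimum;
}

console.log(isRouteAccepted(0.35, 1)); // true: 0.35 is at least 0.3
console.log(isRouteAccepted(0.35, 2)); // false: 0.35 is below 0.4
```

Because the minimums rise with depth, a score that passes at the category level can still be rejected at the collection level.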

Example

// Query: "How do I troubleshoot network issues?"
// Matches:
// 1. Network Troubleshooting: 0.90 (top)
// 2. Login Troubleshooting: 0.60 (second)
// High confidence threshold: 0.70 (< 0.90), so bonuses apply
// Candidate count: 2 (≤ 3) → +0.05
// Gap: 0.30 (> 0.15 threshold) → clear winner bonus: +0.1
// Result: confidence = min(1.0, 0.90 + 0.05 + 0.1) = 1.0 (capped at 1.0)
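
The same rules can be run without configLoader by passing the configuration in directly. This sketch mirrors calculateConfidence above; confidenceFromScores and its plain array of scores (assumed to be sorted in descending order) are illustrative simplifications.

```typescript
interface ConfidenceCalculation {
  high_confidence_threshold: number;
  clear_gap_threshold: number;
  clear_winner_bonus: number;
}

// Mirror of calculateConfidence, taking sorted scores and config as plain arguments.
function confidenceFromScores(scores: number[], config: ConfidenceCalculation): number {
  const top = scores[0];
  if (top === undefined) return 0; // no candidates

  let confidence = top;
  if (top > config.high_confidence_threshold) {
    // Fewer candidates means more focused matching.
    if (scores.length <= 3) {
      confidence = Math.min(1.0, confidence + 0.05);
    } else if (scores.length <= 5) {
      confidence = Math.min(1.0, confidence + 0.02);
    }
    // Clear winner: a single candidate, or a large enough gap to the runner-up.
    const second = scores[1];
    if (second === undefined || top - second > config.clear_gap_threshold) {
      confidence = Math.min(1.0, confidence + config.clear_winner_bonus);
    }
  }
  return confidence;
}

// Values from the Confidence Configuration example in this guide.
const exampleConfidenceConfig: ConfidenceCalculation = {
  high_confidence_threshold: 0.7,
  clear_gap_threshold: 0.15,
  clear_winner_bonus: 0.1
};

console.log(confidenceFromScores([0.9, 0.6], exampleConfidenceConfig)); // 1 (capped)
console.log(confidenceFromScores([0.5, 0.1], exampleConfidenceConfig)); // 0.5 (below threshold, no bonuses)
```

Note that a top score at or below the high confidence threshold receives no bonus at all, however large the gap to the second match.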

Complete Routing Example

Query: "How do I troubleshoot network issues?"

Step 1: Query Embedding

// Original query
query = "How do I troubleshoot network issues?"

// Extract synonyms from hierarchy
synonyms = {
  "how do i": "how to how can i how should i",
  "troubleshoot": "troubleshooting fix resolve debug diagnose repair"
}

// Expand query
expandedQuery = "How do I troubleshoot network issues? how to how can i how should i troubleshooting fix resolve debug diagnose repair network issues?"

// Generate embedding
queryEmbedding = await embeddingService.embed({
  text: expandedQuery,
  text_type: 'query',
  synonyms: synonyms
});

Step 2: Root Node Matching

// Match against root nodes
matches = [
  { node: documentSearchService, similarity: 0.75 },
  { node: sqlDatabaseService, similarity: 0.30 },
  { node: restApiService, similarity: 0.25 }
];

// Apply keyword boost (capped at 1.0)
// Query contains "troubleshoot" → boost keyword for document-search-service
matches[0].similarity = Math.min(1.0, matches[0].similarity + 0.2); // 0.75 → 0.95

// Apply query intent boost (capped at 1.0)
// Intent: documentation
// document-search-service: +0.25
matches[0].similarity = Math.min(1.0, matches[0].similarity + 0.25); // 0.95 → 1.0 (capped)

// Re-embed with service synonyms
serviceSynonyms = extractSynonymsFromNode(documentSearchService);
serviceEmbedding = await embeddingService.embed({
  text: query,
  synonyms: serviceSynonyms
});
// Re-calculate similarity: 0.98

// Select best match
selectedRoot = documentSearchService; // similarity: 0.98

Step 3: Category Matching

// Get children of document-search-service
categories = [
  troubleshootingCategory,
  policyDocumentsCategory
];

// Match against categories
categoryMatches = [
  { node: troubleshootingCategory, similarity: 0.85 },
  { node: policyDocumentsCategory, similarity: 0.40 }
];

// Select best match
selectedCategory = troubleshootingCategory; // similarity: 0.85

Step 4: Collection Matching

// Get children of troubleshooting-category
collections = [
  networkTroubleshooting,
  loginTroubleshooting
];

// Match against collections
collectionMatches = [
  { node: networkTroubleshooting, similarity: 0.90 },
  { node: loginTroubleshooting, similarity: 0.60 }
];

// Calculate confidence
confidence = calculateConfidence(collectionMatches);
// topScore: 0.90 (> 0.70 high confidence threshold)
// candidate count: 2 (≤ 3) → +0.05
// gap: 0.30 (> 0.15 threshold) → clearWinnerBonus: +0.1
// confidence: min(1.0, 0.90 + 0.05 + 0.1) = 1.0 (capped)

// Select best match
selectedCollection = networkTroubleshooting; // similarity: 0.90, confidence: 1.0

Step 5: Return Route

return {
  routes: [{
    source_id: "network-troubleshooting",
    source_type: "vector_db",
    service_type: "SEARCH",
    source_name: "Network Troubleshooting",
    connection_info: { ... },
    confidence_score: 1.0,
    ranking_score: 0.90
  }],
  primary_route: { ... },
  confidence: 1.0,
  decision_path: "HIERARCHY_TRAVERSAL"
};

Summary

Synonyms

Purpose: Expand queries to improve matching
Location: Node level (query_synonyms)
Impact: 15-25% confidence boost
Flow: Extract → Merge → Expand → Re-embed

Keywords

Purpose: Boost/penalty based on query content
Location: service_context.routing_keywords (Level 0)
Impact: Adds boost_value (default +0.3) or penalty_value (default -0.2) to similarity, clamped to the 0 to 1 range
Flow: Detect → Boost/Penalty → Cross-service penalty

Query Intent

Purpose: Boost/penalty based on query intent
Location: service_context.query_intent_boosts (Level 0)
Impact: ±0.1 to ±0.25 similarity adjustment
Flow: Detect → Override → Apply boost

Context

Purpose: Rich information for semantic matching
Location: Level-specific context fields
Impact: Improves embedding quality
Flow: Extract → Weight → Embed

Weights

Purpose: Control field importance in embeddings
Location: Qdrant config (tenant/app/level), with fallback to the config file
Impact: Determines which fields matter most
Flow: Resolve → Apply → Combine

Confidence Scores

Purpose: Indicate route correctness
Location: Calculated from similarity
Impact: Determines if route is accepted
Flow: Calculate → Threshold → Decision

Best Practices

Synonyms: Add comprehensive synonyms for common query variations
Keywords: Use boost keywords for service-specific terms, penalty keywords for irrelevant terms
- See Configuration Guide - Routing Keywords for detailed instructions
Query Intent: Set appropriate boosts/penalties for each intent type
Context: Provide rich context in descriptions and use cases
Weights: Adjust weights based on which fields are most important
- See Configuration Guide - Embedding Weights for detailed instructions

Related Documentation

Configuration Guide - How to update keywords, weights, and confidence settings
Test Script Guide - How to test routing changes
Hierarchy Data Guide - Hierarchy structure details
