Routing Mechanics Guide

Overview

This guide explains how the Haya Routing Service determines the best route for a user query. It covers:

  • Synonyms: How query expansion works
  • Keywords: How keyword-based routing works
  • Query Intent: How intent detection and boosting works
  • Context: How context fields impact matching
  • Weights: How field weights affect embeddings
  • Confidence Scores: How confidence is calculated

Routing Flow

User Query
1. Generate Query Embedding (with synonyms)
2. Match Root Nodes (Level 0)
    ├─ Semantic Similarity
    ├─ Keyword Boosts/Penalties
    ├─ Query Intent Boosts
    └─ Synonym Re-embedding
3. Select Best Root Node
4. Traverse to Children (Level 1)
    ├─ Semantic Matching
    └─ Access Control
5. Traverse to Collections (Level 2)
    ├─ Semantic Matching
    ├─ Access Control
    └─ Final Route Selection
6. Return Route with Confidence Score
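
The flow above amounts to a greedy top-down traversal: pick the best node at each level, then descend into its children. The sketch below illustrates that shape only. The RouteNode type, the traverse and pickBest helpers, and the demo scores are hypothetical; the scorer stands in for the full similarity pipeline (semantic similarity plus boosts), and access control is modeled as a simple predicate applied at Levels 1 and 2.

```typescript
// Hypothetical node shape; the service's real types are not shown in this guide.
interface RouteNode {
  id: string;
  children: RouteNode[];
}

type Scorer = (query: string, node: RouteNode) => number;
type AccessCheck = (node: RouteNode) => boolean;

interface Scored {
  node: RouteNode;
  score: number;
}

// Return the highest-scoring node, or undefined when there are no candidates.
function pickBest(query: string, nodes: RouteNode[], score: Scorer): Scored | undefined {
  let best: Scored | undefined;
  for (const node of nodes) {
    const s = score(query, node);
    if (best === undefined || s > best.score) {
      best = { node, score: s };
    }
  }
  return best;
}

// Greedy traversal: Level 0 (root), Level 1 (category), Level 2 (collection).
function traverse(
  query: string,
  roots: RouteNode[],
  score: Scorer,
  canAccess: AccessCheck
): { path: string[]; score: number } | undefined {
  const path: string[] = [];
  let candidates = roots;
  let lastScore = 0;
  for (let level = 0; level <= 2; level++) {
    // The flow above lists access control at Levels 1 and 2 only.
    const allowed = level === 0 ? candidates : candidates.filter(canAccess);
    const best = pickBest(query, allowed, score);
    if (best === undefined) return undefined; // no reachable route
    path.push(best.node.id);
    lastScore = best.score;
    candidates = best.node.children;
  }
  return { path, score: lastScore };
}

// Demo data with made-up scores standing in for adjusted similarities.
const demoScores: Record<string, number> = {
  docs: 0.75, sql: 0.3, troubleshooting: 0.85, policies: 0.4, network: 0.9, login: 0.6,
};
const leaf = (id: string): RouteNode => ({ id, children: [] });
const demoTree: RouteNode[] = [
  {
    id: "docs",
    children: [
      { id: "troubleshooting", children: [leaf("network"), leaf("login")] },
      leaf("policies"),
    ],
  },
  leaf("sql"),
];
const demoScorer: Scorer = (_query, node) => demoScores[node.id] ?? 0;
const demoRoute = traverse("How do I troubleshoot network issues?", demoTree, demoScorer, () => true);
console.log(demoRoute?.path); // [ 'docs', 'troubleshooting', 'network' ]
```

Because the traversal is greedy, a wrong choice at Level 0 cannot be corrected further down, which is why the root-level boosts described in the following sections matter so much.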

Synonyms

What Are Synonyms?

Synonyms are word/phrase mappings that expand queries to improve semantic matching. They are stored on the node itself (in query_synonyms, not inside a context field), and a node at any level of the hierarchy can define its own set.

Synonym Structure

{
  query_synonyms: {
    "show me": "find search locate get retrieve display list",
    "how do i": "how to how can i how should i",
    "troubleshoot": "troubleshooting fix resolve debug diagnose repair",
    "documentation": "docs documents files records"
  }
}

How Synonyms Work

  1. Extraction: Synonyms are extracted from nodes in the hierarchy path
  2. Merging: Synonyms from all nodes in the path are merged
  3. Query Expansion: Query is expanded with synonyms before embedding
  4. Re-embedding: Query is re-embedded with service-specific synonyms for better matching

Synonym Flow

// Step 1: Extract synonyms from hierarchy path
const synonyms = mergeSynonymsFromPath(hierarchyPath);

// Step 2: Expand query
const expandedQuery = expandQuery(query, synonyms);
// "show me servers" → "show me servers find search locate get retrieve display list servers"

// Step 3: Generate embedding with expanded query
const embedding = await embeddingService.embed({
  text: expandedQuery,
  text_type: 'query',
  synonyms: synonyms
});

// Step 4: Re-embed with service-specific synonyms for better matching
const serviceSynonyms = extractSynonymsFromNode(rootNode);
const serviceEmbedding = await embeddingService.embed({
  text: query,
  text_type: 'query',
  synonyms: serviceSynonyms
});

Synonym Impact

  • Improves Matching: Expands query to match more variations
  • Context-Aware: Uses synonyms from the hierarchy path
  • Service-Specific: Re-embeds with service synonyms for better root node matching
  • Confidence Boost: Can increase confidence by 15-25%

Example

// Query: "show me servers"
// Synonyms: { "show me": "find search locate get retrieve display list" }
// Expanded: "show me servers find search locate get retrieve display list servers"
// Result: Better match with "List all servers" in node description
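
The guide does not show how expandQuery is implemented. The sketch below is one possible reading that reproduces the expanded strings shown in this guide: it appends a lowercased copy of the query in which every matched phrase is replaced by its synonym list. The mergeSynonyms helper, the "later nodes override earlier ones" merge rule, and the whole-word matching are assumptions made for illustration.

```typescript
type SynonymMap = Record<string, string>;

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Merge synonym maps along a hierarchy path.
// Assumption: nodes later in the path override earlier ones on key clashes.
function mergeSynonyms(path: Array<{ query_synonyms?: SynonymMap }>): SynonymMap {
  const merged: SynonymMap = {};
  for (const node of path) {
    Object.assign(merged, node.query_synonyms ?? {});
  }
  return merged;
}

// Append a copy of the query with each known phrase swapped for its synonyms.
function expandQuery(query: string, synonyms: SynonymMap): string {
  const lookup = new Map<string, string>();
  for (const [phrase, expansion] of Object.entries(synonyms)) {
    lookup.set(phrase.toLowerCase(), expansion);
  }
  if (lookup.size === 0) return query;

  // Longest phrases first so multi-word phrases win over shorter overlaps.
  const alternatives = Array.from(lookup.keys())
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  const pattern = new RegExp(`\\b(?:${alternatives})\\b`, "g");

  const lowered = query.toLowerCase();
  // A single pass avoids re-expanding words that appear inside a synonym list.
  const substituted = lowered.replace(pattern, match => lookup.get(match) ?? match);
  return substituted === lowered ? query : `${query} ${substituted}`;
}

console.log(expandQuery("show me servers", {
  "show me": "find search locate get retrieve display list",
}));
// show me servers find search locate get retrieve display list servers
```

Replacing in a single regular-expression pass matters here: "troubleshoot" expands to a list containing "troubleshooting", and a naive sequence of replacements could expand that word a second time.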

Keywords

What Are Keywords?

Keywords are boost/penalty terms that adjust similarity scores based on query content. They're stored in service_context.routing_keywords (Level 0 only).

Keyword Structure

{
  routing_keywords: {
    boost_keywords: ["search", "find", "documentation", "docs"],
    penalty_keywords: ["api", "endpoint", "sql", "database"],
    boost_value: 0.2,      // Default: 0.3
    penalty_value: -0.15    // Default: -0.2
  }
}

How Keywords Work

  1. Keyword Detection: Check if query contains boost/penalty keywords
  2. Boost Application: Add boost_value to similarity if boost keywords found
  3. Penalty Application: Add penalty_value (a negative number) to similarity if penalty keywords found
  4. Cross-Service Penalty: Apply penalty if other services' boost keywords are present

Keyword Flow

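// Step 1: Get routing keywords from service context
const routingKeywords = serviceContext?.routing_keywords;
const boostKeywords: string[] = routingKeywords?.boost_keywords ?? [];
const penaltyKeywords: string[] = routingKeywords?.penalty_keywords ?? [];
const boostValue: number = routingKeywords?.boost_value ?? 0.3;      // default boost
const penaltyValue: number = routingKeywords?.penalty_value ?? -0.2; // default penalty (negative)
const queryLower = query.toLowerCase();

// Step 2: Check for boost keywords
const hasBoostKeywords = boostKeywords.some(keyword =>
  queryLower.includes(keyword.toLowerCase())
);

// Step 3: Apply boost (capped at 1.0)
if (hasBoostKeywords) {
  similarity = Math.min(1.0, similarity + boostValue);
}

// Step 4: Check for penalty keywords
const hasPenaltyKeywords = penaltyKeywords.some(keyword =>
  queryLower.includes(keyword.toLowerCase())
);

// Step 5: Apply penalty (penaltyValue is negative, so it is added; floored at 0)
if (hasPenaltyKeywords) {
  similarity = Math.max(0, similarity + penaltyValue);
}

// Step 6: Cross-service penalty
// If another service's boost keyword appears in the query and is not also
// a boost keyword of the current service, apply the penalty
for (const otherService of otherServices) {
  const hasForeignKeyword = otherService.boostKeywords.some(k =>
    queryLower.includes(k.toLowerCase()) && !currentService.boostKeywords.includes(k)
  );
  if (hasForeignKeyword) {
    similarity = Math.max(0, similarity + penaltyValue);
  }
}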

Keyword Impact

  • Boosts Matching: Increases similarity when relevant keywords are found
  • Prevents Mismatches: Decreases similarity when irrelevant keywords are found
  • Cross-Service Logic: Prevents SQL from matching when REST/MCP keywords are present
  • Configurable: Boost/penalty values are configurable per service

Example

// Query: "search for documentation"
// Service: Document Search Service
// Boost Keywords: ["search", "documentation"]
// Result: similarity += 0.2 (boost applied)

// Query: "query database using sql"
// Service: Document Search Service
// Penalty Keywords: ["sql", "database", "query"]
// Result: similarity -= 0.15 (penalty applied)
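
To check the arithmetic in the two examples above, the keyword rules can be wrapped in a small self-contained function. This is a sketch under assumptions: adjustForKeywords is a hypothetical name, the starting similarity of 0.5 is made up, and the cross-service rule is read as "one penalty per other service that has a boost keyword in the query which this service does not share".

```typescript
// Shape follows the routing_keywords structure shown earlier in this guide.
interface RoutingKeywords {
  boost_keywords: string[];
  penalty_keywords: string[];
  boost_value?: number;   // guide lists a default of 0.3
  penalty_value?: number; // guide lists a default of -0.2 (already negative)
}

function adjustForKeywords(
  query: string,
  similarity: number,
  own: RoutingKeywords,
  otherServices: RoutingKeywords[] = []
): number {
  const queryLower = query.toLowerCase();
  const inQuery = (keyword: string): boolean => queryLower.includes(keyword.toLowerCase());
  const boostValue = own.boost_value ?? 0.3;
  const penaltyValue = own.penalty_value ?? -0.2;
  const ownBoosts = new Set(own.boost_keywords.map(k => k.toLowerCase()));

  let adjusted = similarity;
  if (own.boost_keywords.some(k => inQuery(k))) {
    adjusted = Math.min(1.0, adjusted + boostValue);
  }
  if (own.penalty_keywords.some(k => inQuery(k))) {
    adjusted = Math.max(0, adjusted + penaltyValue); // penaltyValue is negative
  }
  // Cross-service penalty (assumed reading, see the lead-in).
  for (const other of otherServices) {
    const foreignHit = other.boost_keywords.some(
      k => inQuery(k) && !ownBoosts.has(k.toLowerCase())
    );
    if (foreignHit) {
      adjusted = Math.max(0, adjusted + penaltyValue);
    }
  }
  return adjusted;
}

const documentSearchKeywords: RoutingKeywords = {
  boost_keywords: ["search", "documentation"],
  penalty_keywords: ["sql", "database", "query"],
  boost_value: 0.2,
  penalty_value: -0.15,
};

// Starting from an assumed base similarity of 0.5:
const boosted = adjustForKeywords("search for documentation", 0.5, documentSearchKeywords);
const penalized = adjustForKeywords("query database using sql", 0.5, documentSearchKeywords);
console.log(boosted.toFixed(2), penalized.toFixed(2)); // 0.70 0.35
```

Note that includes() matches substrings, so a keyword such as "api" would also fire on a word like "rapid"; keyword lists should be chosen with that in mind.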

Query Intent

What Is Query Intent?

Query intent is the detected purpose of a query. The router uses it to boost services that suit that purpose and penalize those that do not.

Query Intent Types

  • documentation: User wants documentation/search results
  • data_query: User wants to query a database
  • api_call: User wants to call a REST API
  • mcp_config: User wants MCP protocol configuration

How Query Intent Works

  1. Intent Detection: Analyze query to detect intent
  2. Intent Boosts: Apply boosts/penalties based on intent and service type
  3. Hierarchy Override: Check for per-service overrides in hierarchy data
  4. Config Fallback: Fall back to config file if no hierarchy override

Query Intent Detection

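// Intent detection logic (checks run in order; the first match wins)
type QueryIntent = 'documentation' | 'data_query' | 'api_call' | 'mcp_config';

function detectQueryIntent(query: string): QueryIntent {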
  const queryLower = query.toLowerCase();
  
  // Documentation intent
  if (queryLower.includes('documentation') || 
      queryLower.includes('docs') ||
      queryLower.includes('guide') ||
      queryLower.includes('how to')) {
    return 'documentation';
  }
  
  // Data query intent
  if (queryLower.includes('list') ||
      queryLower.includes('count') ||
      queryLower.includes('select') ||
      queryLower.includes('query database')) {
    return 'data_query';
  }
  
  // API call intent
  if (queryLower.includes('api') ||
      queryLower.includes('endpoint') ||
      queryLower.includes('call')) {
    return 'api_call';
  }
  
  // MCP config intent
  if (queryLower.includes('mcp') ||
      queryLower.includes('model context') ||
      queryLower.includes('workflow')) {
    return 'mcp_config';
  }
  
  return 'documentation'; // Default
}

Query Intent Boosts

Query intent boosts are applied after keyword boosts:

// Step 1: Try to get from hierarchy data (per-service override)
const serviceContext = getFieldAtLevel(rootNode, 'service_context');
const queryIntentBoosts = serviceContext?.query_intent_boosts;
let intentBoostValue: number | undefined;
let boostSource: 'hierarchy_data' | 'config_file' | undefined;

if (queryIntentBoosts && queryIntent in queryIntentBoosts) {
  intentBoostValue = queryIntentBoosts[queryIntent];
  boostSource = 'hierarchy_data';
}

// Step 2: Fall back to config file
if (intentBoostValue === undefined) {
  const boostConfig = configLoader.getBoostPenaltyConfig();
  intentBoostValue = boostConfig.service_type_boosts.service_level[serviceType]?.[queryIntent];
  boostSource = 'config_file';
}

// Step 3: Apply boost
if (intentBoostValue !== undefined) {
  similarity = Math.min(1.0, similarity + intentBoostValue);
}

Query Intent Structure

// In hierarchy data (Level 0)
{
  service_context: {
    query_intent_boosts: {
      documentation: 0.25,    // Boost for documentation queries
      data_query: -0.1,       // Penalty for data queries
      api_call: -0.15,        // Penalty for API calls
      mcp_config: 0.0         // Neutral for MCP config
    }
  }
}

Query Intent Impact

  • Improves Accuracy: Boosts relevant services, penalizes irrelevant ones
  • Service-Specific: Can override defaults per service
  • Configurable: Can be set in hierarchy data or config file
  • Applied After Keywords: Intent boosts are applied after keyword boosts

Example

// Query: "How do I troubleshoot network issues?"
// Intent: documentation
// Service: Document Search Service (SEARCH)
// Intent Boost: +0.25 (from hierarchy data)
// Result: similarity += 0.25

// Query: "List all servers"
// Intent: data_query
// Service: Document Search Service (SEARCH)
// Intent Boost: -0.1 (from hierarchy data)
// Result: similarity -= 0.1 (penalty)
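
The override-then-fallback lookup behind these examples can be sketched as a pure function. The names resolveIntentBoost and applyIntentBoost are hypothetical, the config fallback is passed in as a plain object rather than loaded from a file, and flooring the result at 0 is an added assumption (the snippet earlier in this section only caps at 1.0).

```typescript
type QueryIntent = "documentation" | "data_query" | "api_call" | "mcp_config";
type IntentBoosts = Partial<Record<QueryIntent, number>>;

interface BoostResolution {
  value: number | undefined;
  source: "hierarchy_data" | "config_file" | "none";
}

// Prefer the per-service override from hierarchy data, then the config default.
function resolveIntentBoost(
  intent: QueryIntent,
  hierarchyBoosts: IntentBoosts | undefined,
  configBoosts: IntentBoosts | undefined
): BoostResolution {
  const override = hierarchyBoosts?.[intent];
  if (override !== undefined) return { value: override, source: "hierarchy_data" };
  const fallback = configBoosts?.[intent];
  if (fallback !== undefined) return { value: fallback, source: "config_file" };
  return { value: undefined, source: "none" };
}

// Apply the boost (or penalty) and keep the result within [0, 1].
function applyIntentBoost(similarity: number, boost: number | undefined): number {
  if (boost === undefined) return similarity;
  return Math.max(0, Math.min(1.0, similarity + boost));
}

const documentSearchBoosts: IntentBoosts = {
  documentation: 0.25,
  data_query: -0.1,
  api_call: -0.15,
  mcp_config: 0.0,
};

const docBoost = resolveIntentBoost("documentation", documentSearchBoosts, undefined);
console.log(docBoost.source, applyIntentBoost(0.5, docBoost.value)); // hierarchy_data 0.75
```

Returning the source alongside the value makes it easy to log whether a given adjustment came from the hierarchy override or the config default.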

Context

What Is Context?

Context is additional information about a node that helps with semantic matching. It's stored in level-specific context fields:

  • Level 0: service_context
  • Level 1: category_context
  • Level 2: collection_context

Context Fields

Service Context (Level 0)

{
  service_context: {
    detailed_description: string;
    use_cases: Array<{
      scenario: string;
      example_queries: string[];
      success_criteria?: string;
    }>;
    capabilities: {
      what_it_can_do: string[];
      what_it_cannot_do: string[];
      strengths: string[];
      limitations: string[];
    };
    domain_context: {
      primary_domains: string[];
      categories: string[];
      subcategories: string[];
      business_context: string;
    };
    query_patterns: {
      good_matches: Array<{
        pattern: string;
        examples: string[];
        confidence_boost?: number;
      }>;
      poor_matches: Array<{
        pattern: string;
        examples: string[];
        reason: string;
      }>;
    };
    routing_keywords: { ... };
    query_intent_boosts: { ... };
  }
}

Category Context (Level 1)

{
  category_context: {
    purpose: string;
    description: string;
    typical_queries: string[];
    keywords: string[];
  }
}

Collection Context (Level 2)

{
  collection_context: {
    purpose: string;
    description: string;
    document_types: string[];
    content_characteristics: {
      topics_covered: string[];
      content_language: string[];
      content_format: string[];
      update_frequency: string;
      coverage: string;
    };
  }
}

How Context Works

Context fields are included in embedding generation with specific weights:

// Level 0 embedding includes:
// - service_context.detailed_description (weight: 0.06)
// - service_context.use_cases (weight: 0.28) ← Highest weight
// - service_context.capabilities (weight: 0.18)
// - service_context.domain_context (weight: 0.20)
// - service_context.query_patterns (weight: varies)

// Level 1 embedding includes:
// - category_context.purpose (weight: 0.10)
// - category_context.typical_queries (weight: 0.25) ← Highest weight
// - category_context.keywords (weight: 0.22)

// Level 2 embedding includes:
// - collection_context.purpose (weight: 0.12)
// - collection_context.document_types (weight: 0.18)
// - collection_context.content_characteristics (weight: 0.20) ← Highest of the three listed; topics_covered has its own weight (0.25)
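
As a rough illustration of how Level 0 context fields could be turned into weighted text segments before embedding, the sketch below flattens a trimmed-down service_context into parallel arrays of texts and weights. The ServiceContextLite type and buildWeightedTexts function are hypothetical, the way each field is joined into a string is an assumption, and the weights are the Level 0 values listed above.

```typescript
// Trimmed-down, hypothetical view of service_context with only the fields used here.
interface ServiceContextLite {
  detailed_description?: string;
  use_cases?: Array<{ scenario: string; example_queries: string[] }>;
  capabilities?: { what_it_can_do: string[] };
  domain_context?: { primary_domains: string[] };
}

interface WeightedTexts {
  texts: string[];
  weights: number[];
}

function buildWeightedTexts(ctx: ServiceContextLite): WeightedTexts {
  // Collect scenarios and example queries into a single use_cases segment.
  const useCaseParts: string[] = [];
  for (const useCase of ctx.use_cases ?? []) {
    useCaseParts.push(useCase.scenario, ...useCase.example_queries);
  }

  const segments: Array<{ text: string; weight: number }> = [
    { text: ctx.detailed_description ?? "", weight: 0.06 },
    { text: useCaseParts.join(". "), weight: 0.28 },
    { text: (ctx.capabilities?.what_it_can_do ?? []).join(". "), weight: 0.18 },
    { text: (ctx.domain_context?.primary_domains ?? []).join(", "), weight: 0.2 },
  ];

  const result: WeightedTexts = { texts: [], weights: [] };
  for (const segment of segments) {
    if (segment.text.trim().length === 0) continue; // skip fields with no content
    result.texts.push(segment.text);
    result.weights.push(segment.weight);
  }
  return result;
}

const weighted = buildWeightedTexts({
  detailed_description: "Searches internal documentation",
  use_cases: [{ scenario: "Troubleshooting", example_queries: ["How do I fix VPN?"] }],
});
console.log(weighted.texts);   // [ 'Searches internal documentation', 'Troubleshooting. How do I fix VPN?' ]
console.log(weighted.weights); // [ 0.06, 0.28 ]
```

Skipping empty fields means a sparsely filled node contributes fewer segments, which is one reason rich context tends to produce better matches.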

Context Impact

  • Improves Matching: Rich context improves semantic matching
  • Weighted: Different context fields have different weights
  • Level-Specific: Context structure varies by level
  • Configurable: Can exclude context fields via ignore_fields

Weights

What Are Weights?

Weights determine how much each field contributes to the embedding. Higher weights mean the field has more influence on matching.

Weight Structure

Weights are configured in Qdrant (tenant/app/level-specific) with fallback to config file:

// Level 0 (Service) weights
{
  "Service": {
    "example_queries_weight": 0.28,      // Highest weight
    "domain_match_weight": 0.20,
    "capabilities_positive_weight": 0.18,
    "service_type_keywords_weight": 0.15,
    "description_weight": 0.10,
    "context_weight": 0.06,
    "capabilities_negative_weight": -0.18,  // Negative weight
    "name_weight": 0.03
  }
}

// Level 1 (Category) weights
{
  "Category": {
    "typical_queries_weight": 0.25,      // Highest weight
    "keywords_weight": 0.22,
    "domain_match_weight": 0.18,
    "example_queries_weight": 0.15,
    "purpose_weight": 0.10,
    "description_weight": 0.08,
    "context_weight": 0.04,
    "name_weight": 0.02
  }
}

// Level 2 (Collection) weights
{
  "Collection": {
    "topics_covered_weight": 0.25,       // Highest weight
    "content_characteristics_weight": 0.20,
    "document_types_weight": 0.18,
    "description_weight": 0.15,
    "purpose_weight": 0.12,
    "domain_match_weight": 0.08,
    "context_weight": 0.05,
    "name_weight": 0.02
  }
}

How Weights Work

  1. Weight Resolution: Get weights for tenant/app/level (from Qdrant or config)
  2. Text Extraction: Extract text from each field
  3. Weight Application: Apply weights to each text segment
  4. Embedding Generation: Generate embeddings with weighted texts
  5. Combined Embedding: Create weighted combination of embeddings

Weight Flow

// Step 1: Resolve weights
const weightConfig = weightResolver.resolveWeights(node, parent, tenantId, appId, level);

// Step 2: Calculate weights for each field
const result = embeddingGenerator.calculateWeights(node, weightConfig);
// Returns: { texts: string[], weights: number[] }

// Step 3: Generate embeddings for each text
const embeddings = await embeddingService.batchEmbed(result.texts, 'document');

// Step 4: Create weighted combination
const combinedEmbedding = embeddingGenerator.createWeightedEmbedding(
  embeddings,
  result.weights
);
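
The guide does not spell out how the weighted combination in Step 4 is computed. A common approach, shown here purely as an assumption, is a weighted sum of the per-field vectors followed by L2 normalization; combineWeighted and cosine are hypothetical helpers, not the service's createWeightedEmbedding.

```typescript
// Weighted sum of per-field embeddings, normalized to unit length.
function combineWeighted(embeddings: number[][], weights: number[]): number[] {
  if (embeddings.length === 0 || embeddings.length !== weights.length) {
    throw new Error("embeddings and weights must be non-empty and the same length");
  }
  const dim = embeddings[0].length;
  const combined: number[] = new Array<number>(dim).fill(0);
  for (let i = 0; i < embeddings.length; i++) {
    if (embeddings[i].length !== dim) {
      throw new Error("all embeddings must have the same dimension");
    }
    for (let d = 0; d < dim; d++) {
      combined[d] += weights[i] * embeddings[i][d]; // negative weights subtract
    }
  }
  const norm = Math.sqrt(combined.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? combined : combined.map(x => x / norm);
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Toy 2-D vectors: axis 0 stands for the use_cases text, axis 1 for the description.
const useCasesVec = [1, 0];
const descriptionVec = [0, 1];
const nodeVec = combineWeighted([useCasesVec, descriptionVec], [0.28, 0.1]);

// A query resembling the use cases scores higher than one resembling the description.
console.log(cosine(nodeVec, useCasesVec) > cosine(nodeVec, descriptionVec)); // true
```

Under this scheme a negative weight such as capabilities_negative_weight pushes the node vector away from the corresponding text, which lowers similarity for queries that resemble it.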

Weight Impact

  • Field Importance: Higher weights = more influence on matching
  • Negative Weights: Can reduce similarity (e.g., capabilities_negative_weight)
  • Configurable: Weights can be overridden per tenant/app/level
  • Dynamic: Weights are resolved at runtime

Example

// Node: Document Search Service
// Fields and weights:
// - use_cases: 0.28 (highest - most important)
// - domain_context: 0.20
// - capabilities: 0.18
// - description: 0.10
// - name: 0.03 (lowest - least important)

// Query: "How do I troubleshoot network issues?"
// Matching:
// - use_cases: High match (0.28 weight) → Strong contribution
// - domain_context: Medium match (0.20 weight) → Medium contribution
// - description: Low match (0.10 weight) → Weak contribution
// Result: High overall similarity due to use_cases match

Confidence Scores

What Is Confidence?

Confidence is a score (0.0 to 1.0) that indicates how confident the system is that a route is correct.

How Confidence Is Calculated

Confidence is calculated based on:

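  1. Similarity Score: Base similarity of the top match from semantic matching
  2. High Confidence Threshold: Score the top match must exceed before any bonus is applied
  3. Clear Gap: Difference between top and second match
  4. Clear Winner Bonus: Added when there is a single candidate or the gap exceeds the clear gap threshold
  5. Candidate Count: Fewer candidates = higher confidence (+0.05 for up to 3 candidates, +0.02 for 4 or 5)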

Confidence Calculation

function calculateConfidence(
  results: SemanticMatchResult[],
  level?: number,
  tenantId?: string,
  appId?: string
): number {
  if (results.length === 0) return 0;
  
  const topScore = results[0].weighted_score;
  const confidenceConfig = configLoader.getConfidenceConfig(tenantId, appId, level);
  
  const highConfidenceThreshold = confidenceConfig.confidence_calculation.high_confidence_threshold;
  const clearGapThreshold = confidenceConfig.confidence_calculation.clear_gap_threshold;
  const clearWinnerBonus = confidenceConfig.confidence_calculation.clear_winner_bonus;
  
  // Base confidence is the similarity score
  let confidence = topScore;
  
  // Apply confidence calculation if similarity exceeds threshold
  if (topScore > highConfidenceThreshold) {
    // Boost if fewer candidates (more focused matching)
    if (results.length <= 3) {
      confidence = Math.min(1.0, confidence + 0.05);
    } else if (results.length <= 5) {
      confidence = Math.min(1.0, confidence + 0.02);
    }
    
    // Apply clear winner bonus if single candidate
    if (results.length === 1) {
      confidence = Math.min(1.0, confidence + clearWinnerBonus);
    }
    
    // Check gap between top and second
    if (results.length > 1) {
      const secondScore = results[1].weighted_score;
      const gap = topScore - secondScore;
      
      if (gap > clearGapThreshold) {
        confidence = Math.min(1.0, confidence + clearWinnerBonus);
      }
    }
  }
  
  return confidence;
}

Confidence Configuration

Confidence thresholds are configurable per tenant/app/level:

{
  confidence_calculation: {
    high_confidence_threshold: 0.7,    // Minimum for high confidence
    clear_gap_threshold: 0.15,          // Gap between top and second
    clear_winner_bonus: 0.1            // Bonus for clear winner
  },
  confidence_thresholds: {
    level_0: 0.2,  // Minimum confidence for Level 0
    level_1: 0.3,  // Minimum confidence for Level 1
    level_2: 0.4   // Minimum confidence for Level 2
  }
}

Confidence Impact

  • Routing Decision: Routes below threshold are rejected
  • Level-Specific: Different thresholds for different levels
  • Configurable: Thresholds can be overridden per tenant/app/level
  • Dynamic: Confidence can degrade over time (cache)
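
The per-level rejection rule can be sketched as a one-line check. isRouteAccepted is a hypothetical name, and treating a confidence exactly equal to the threshold as accepted is an assumption; the threshold values are the ones from the configuration above.

```typescript
interface ConfidenceThresholds {
  level_0: number;
  level_1: number;
  level_2: number;
}

// A route is kept only if its confidence reaches the minimum for its level.
function isRouteAccepted(
  confidence: number,
  level: 0 | 1 | 2,
  thresholds: ConfidenceThresholds
): boolean {
  const key = `level_${level}` as keyof ConfidenceThresholds;
  return confidence >= thresholds[key];
}

const exampleThresholds: ConfidenceThresholds = { level_0: 0.2, level_1: 0.3, level_2: 0.4 };

console.log(isRouteAccepted(0.35, 1, exampleThresholds)); // true
console.log(isRouteAccepted(0.35, 2, exampleThresholds)); // false
```

With these values the bar rises as the traversal goes deeper, so a match that is good enough to pick a category may still be rejected as a final collection.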

Example

// Query: "How do I troubleshoot network issues?"
// Matches:
// 1. Network Troubleshooting: 0.90 (top)
// 2. Login Troubleshooting: 0.60 (second)
// High confidence threshold: 0.70 (< 0.90), so bonuses apply
// Candidate count: 2 (≤ 3) → +0.05
// Gap: 0.30 (> 0.15 threshold) → clear winner bonus +0.1
// Result: confidence = min(1.0, 0.90 + 0.05 + 0.1) = 1.0 (capped at 1.0)
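
To trace this example step by step, the calculation can be restated over plain scores. confidenceFromScores is a hypothetical, config-free variant of calculateConfidence that takes the thresholds as an argument; it follows the same rules, including the candidate-count bonus that applies whenever there are three or fewer results.

```typescript
interface ConfidenceCalculation {
  high_confidence_threshold: number;
  clear_gap_threshold: number;
  clear_winner_bonus: number;
}

// scores must be sorted in descending order, best match first.
function confidenceFromScores(scores: number[], cfg: ConfidenceCalculation): number {
  if (scores.length === 0) return 0;
  const top = scores[0];
  let confidence = top;

  // Bonuses apply only when the top score clears the high confidence threshold.
  if (top > cfg.high_confidence_threshold) {
    if (scores.length <= 3) {
      confidence = Math.min(1.0, confidence + 0.05);
    } else if (scores.length <= 5) {
      confidence = Math.min(1.0, confidence + 0.02);
    }
    const singleCandidate = scores.length === 1;
    const clearGap = scores.length > 1 && top - scores[1] > cfg.clear_gap_threshold;
    if (singleCandidate || clearGap) {
      confidence = Math.min(1.0, confidence + cfg.clear_winner_bonus);
    }
  }
  return confidence;
}

const exampleConfidenceConfig: ConfidenceCalculation = {
  high_confidence_threshold: 0.7,
  clear_gap_threshold: 0.15,
  clear_winner_bonus: 0.1,
};

// Network Troubleshooting (0.90) vs. Login Troubleshooting (0.60)
console.log(confidenceFromScores([0.9, 0.6], exampleConfidenceConfig)); // 1
// Top score below the threshold: no bonus, confidence equals the raw score
console.log(confidenceFromScores([0.65, 0.6], exampleConfidenceConfig)); // 0.65
```

In the first call the raw sum is 0.90 + 0.05 + 0.1 = 1.05, which the cap reduces to 1.0.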

Complete Routing Example

Query: "How do I troubleshoot network issues?"

Step 1: Query Embedding

// Original query
query = "How do I troubleshoot network issues?"

// Extract synonyms from hierarchy
synonyms = {
  "how do i": "how to how can i how should i",
  "troubleshoot": "troubleshooting fix resolve debug diagnose repair"
}

// Expand query
expandedQuery = "How do I troubleshoot network issues? how to how can i how should i troubleshooting fix resolve debug diagnose repair network issues?"

// Generate embedding
queryEmbedding = await embeddingService.embed({
  text: expandedQuery,
  text_type: 'query',
  synonyms: synonyms
});

Step 2: Root Node Matching

// Match against root nodes
matches = [
  { node: documentSearchService, similarity: 0.75 },
  { node: sqlDatabaseService, similarity: 0.30 },
  { node: restApiService, similarity: 0.25 }
];

// Apply keyword boost (capped at 1.0)
// Query contains "troubleshoot" → boost keyword for document-search-service
matches[0].similarity = Math.min(1.0, matches[0].similarity + 0.2); // 0.75 → 0.95

// Apply query intent boost (capped at 1.0)
// Intent: documentation
// document-search-service: +0.25
matches[0].similarity = Math.min(1.0, matches[0].similarity + 0.25); // 0.95 → 1.0 (capped)

// Re-embed with service synonyms
serviceSynonyms = extractSynonymsFromNode(documentSearchService);
serviceEmbedding = await embeddingService.embed({
  text: query,
  text_type: 'query',
  synonyms: serviceSynonyms
});
// Re-calculate similarity: 0.98

// Select best match
selectedRoot = documentSearchService; // similarity: 0.98

Step 3: Category Matching

// Get children of document-search-service
categories = [
  troubleshootingCategory,
  policyDocumentsCategory
];

// Match against categories
categoryMatches = [
  { node: troubleshootingCategory, similarity: 0.85 },
  { node: policyDocumentsCategory, similarity: 0.40 }
];

// Select best match
selectedCategory = troubleshootingCategory; // similarity: 0.85

Step 4: Collection Matching

// Get children of troubleshooting-category
collections = [
  networkTroubleshooting,
  loginTroubleshooting
];

// Match against collections
collectionMatches = [
  { node: networkTroubleshooting, similarity: 0.90 },
  { node: loginTroubleshooting, similarity: 0.60 }
];

// Calculate confidence
confidence = calculateConfidence(collectionMatches);
// topScore: 0.90 (> 0.70 high confidence threshold)
// candidate count: 2 (≤ 3) → +0.05
// gap: 0.30 (> 0.15 threshold) → clearWinnerBonus +0.1
// confidence: min(1.0, 0.90 + 0.05 + 0.1) = 1.0

// Select best match
selectedCollection = networkTroubleshooting; // similarity: 0.90, confidence: 1.0

Step 5: Return Route

return {
  routes: [{
    source_id: "network-troubleshooting",
    source_type: "vector_db",
    service_type: "SEARCH",
    source_name: "Network Troubleshooting",
    connection_info: { ... },
    confidence_score: 1.0,
    ranking_score: 0.90
  }],
  primary_route: { ... },
  confidence: 1.0,
  decision_path: "HIERARCHY_TRAVERSAL"
};

Summary

Synonyms

  • Purpose: Expand queries to improve matching
  • Location: Node level (query_synonyms)
  • Impact: 15-25% confidence boost
  • Flow: Extract → Merge → Expand → Re-embed

Keywords

  • Purpose: Boost/penalty based on query content
  • Location: service_context.routing_keywords (Level 0)
  • Impact: Similarity adjustment of +0.3 (boost) and -0.2 (penalty) by default, configurable per service
  • Flow: Detect → Boost/Penalty → Cross-service penalty

Query Intent

  • Purpose: Boost/penalty based on query intent
  • Location: service_context.query_intent_boosts (Level 0)
  • Impact: ±0.1 to ±0.25 similarity adjustment
  • Flow: Detect → Override → Apply boost

Context

  • Purpose: Rich information for semantic matching
  • Location: Level-specific context fields
  • Impact: Improves embedding quality
  • Flow: Extract → Weight → Embed

Weights

  • Purpose: Control field importance in embeddings
  • Location: Qdrant config (tenant/app/level), with fallback to config file
  • Impact: Determines which fields matter most
  • Flow: Resolve → Apply → Combine

Confidence Scores

  • Purpose: Indicate route correctness
  • Location: Calculated from similarity
  • Impact: Determines if route is accepted
  • Flow: Calculate → Threshold → Decision

Best Practices

  1. Synonyms: Add comprehensive synonyms for common query variations
  2. Keywords: Use boost keywords for service-specific terms, penalty keywords for irrelevant terms
  3. Query Intent: Set appropriate boosts/penalties for each intent type
  4. Context: Provide rich context in descriptions and use cases
  5. Weights: Adjust weights based on which fields are most important

© 2025 All rights reserved. Built with DataHub Cloud
