
Multi-LLM Provider Setup Guide

Overview

The AI routing service now supports three LLM providers:

  1. Anthropic Claude (default)
  2. OpenAI (GPT-4, GPT-3.5)
  3. Ollama (local, free)

Quick Setup

Option 1: Anthropic Claude (Default)

# .env
AI_ROUTING_ENABLED=true
AI_ROUTING_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
AI_ROUTING_MODEL=claude-3-5-sonnet-20241022

Option 2: OpenAI

# .env
AI_ROUTING_ENABLED=true
AI_ROUTING_PROVIDER=openai
OPENAI_API_KEY=sk-...
AI_ROUTING_MODEL=gpt-4-turbo-preview

Option 3: Ollama (Local, Free)

# .env
AI_ROUTING_ENABLED=true
AI_ROUTING_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434  # Optional
AI_ROUTING_MODEL=llama2

First, install Ollama:

# macOS/Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Or download from https://ollama.ai

# Pull a model
ollama pull llama2

Provider Comparison

| Provider  | Cost     | Speed  | Setup  | Privacy | Best For             |
|-----------|----------|--------|--------|---------|----------------------|
| Anthropic | $$       | Fast   | Easy   | Cloud   | Production           |
| OpenAI    | $$-$$$   | Fast   | Easy   | Cloud   | Production           |
| Ollama    | Free     | Medium | Medium | Local   | Development, Privacy |

Environment Variables

Required (Provider-Specific)

Anthropic:

ANTHROPIC_API_KEY=sk-ant-...

OpenAI:

OPENAI_API_KEY=sk-...

Ollama:

# No API key needed!
# Just install Ollama and pull a model

Optional

# Provider selection (default: anthropic)
AI_ROUTING_PROVIDER=anthropic  # or 'openai' or 'ollama'

# Model selection (provider-specific defaults)
AI_ROUTING_MODEL=claude-3-5-sonnet-20241022  # Anthropic
AI_ROUTING_MODEL=gpt-4-turbo-preview         # OpenAI
AI_ROUTING_MODEL=llama2                      # Ollama

# Ollama base URL (default: http://localhost:11434)
OLLAMA_BASE_URL=http://localhost:11434

# Timeout (default: 10000ms)
AI_ROUTING_TIMEOUT_MS=10000
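As a sketch, the variables above could be folded into one config object with the documented defaults. The type and function names here are illustrative, not the service's actual internals:

```typescript
// Illustrative config loader for the env vars documented above.
type Provider = "anthropic" | "openai" | "ollama";

interface RoutingConfig {
  enabled: boolean;
  provider: Provider;
  model: string;
  ollamaBaseUrl: string;
  timeoutMs: number;
}

// Provider-specific model defaults, as listed in this guide.
const MODEL_DEFAULTS: Record<Provider, string> = {
  anthropic: "claude-3-5-sonnet-20241022",
  openai: "gpt-4-turbo-preview",
  ollama: "llama2",
};

function loadRoutingConfig(env: Record<string, string | undefined>): RoutingConfig {
  const provider = (env.AI_ROUTING_PROVIDER ?? "anthropic") as Provider;
  return {
    enabled: env.AI_ROUTING_ENABLED === "true",
    provider,
    model: env.AI_ROUTING_MODEL ?? MODEL_DEFAULTS[provider],
    ollamaBaseUrl: env.OLLAMA_BASE_URL ?? "http://localhost:11434",
    timeoutMs: Number(env.AI_ROUTING_TIMEOUT_MS ?? 10000),
  };
}
```

Calling `loadRoutingConfig(process.env)` at startup keeps all defaults in one place.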

Model Selection Guide

Anthropic Models

| Model                      | Cost | Speed     | Use Case                   |
|----------------------------|------|-----------|----------------------------|
| claude-3-5-sonnet-20241022 | $$   | Fast      | Recommended - Best balance |
| claude-3-opus-20240229     | $$$  | Medium    | Highest quality            |
| claude-3-haiku-20240307    | $    | Very Fast | Fast, cheap                |

OpenAI Models

| Model               | Cost | Speed     | Use Case                   |
|---------------------|------|-----------|----------------------------|
| gpt-4-turbo-preview | $$$  | Fast      | Recommended - Best quality |
| gpt-4               | $$$  | Fast      | High quality               |
| gpt-3.5-turbo       | $    | Very Fast | Fast, cheap                |

Ollama Models

| Model      | Size | Speed  | Use Case                   |
|------------|------|--------|----------------------------|
| llama2     | 7B   | Medium | Recommended - Good balance |
| llama2:13b | 13B  | Slow   | Better accuracy            |
| mistral    | 7B   | Fast   | Fast, efficient            |
| codellama  | 7B   | Medium | Code-focused               |

Pull models:

ollama pull llama2
ollama pull mistral
ollama pull codellama

Auto-Provider Selection

The system automatically selects an available provider if the requested one is not configured:

Priority order:
1. Requested provider (AI_ROUTING_PROVIDER)
2. Anthropic (if ANTHROPIC_API_KEY set)
3. OpenAI (if OPENAI_API_KEY set)
4. Ollama (always available, but may fail if not running)
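The fallback order above can be sketched in a few lines of TypeScript. This is illustrative, not the service's actual code; "available" here simply means an API key is set, and Ollama is always selectable (calls fail later if it is not running):

```typescript
type Provider = "anthropic" | "openai" | "ollama";

// Sketch of the documented priority order: requested provider first,
// then Anthropic, then OpenAI, then Ollama as the final fallback.
function selectProvider(
  requested: Provider | undefined,
  env: Record<string, string | undefined>,
): Provider {
  const available = (p: Provider): boolean =>
    p === "ollama" || // always selectable; may still fail if not running
    (p === "anthropic" && !!env.ANTHROPIC_API_KEY) ||
    (p === "openai" && !!env.OPENAI_API_KEY);

  if (requested && available(requested)) return requested;
  if (env.ANTHROPIC_API_KEY) return "anthropic";
  if (env.OPENAI_API_KEY) return "openai";
  return "ollama";
}
```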

Testing Providers

Test Anthropic

curl -X POST http://localhost:3000/api/v1/route/ai \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "tenant-001",
    "app_id": "app-001",
    "roles": ["developer"],
    "query": "Show network device inventory"
  }'

Test OpenAI

# Set provider
export AI_ROUTING_PROVIDER=openai
export OPENAI_API_KEY=sk-...

# Restart service
npm run dev

# Same API call
curl -X POST http://localhost:3000/api/v1/route/ai ...

Test Ollama

# Start Ollama (if not running)
ollama serve

# Pull model (if not already pulled)
ollama pull llama2

# Set provider
export AI_ROUTING_PROVIDER=ollama

# Restart service
npm run dev

# Same API call
curl -X POST http://localhost:3000/api/v1/route/ai ...

Troubleshooting

Issue: "No LLM providers are configured"

Solution: Configure at least one provider:

  • Set ANTHROPIC_API_KEY OR
  • Set OPENAI_API_KEY OR
  • Install and run Ollama

Issue: "Ollama API error: Connection refused"

Solution:

  1. Make sure Ollama is running: ollama serve
  2. Check OLLAMA_BASE_URL is correct
  3. Verify Ollama is accessible: curl http://localhost:11434/api/tags
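If you prefer scripting that last check, a minimal Node sketch (assumes Node 18+ with global fetch; the function name is illustrative):

```typescript
// Returns true if Ollama answers on its model-listing endpoint (/api/tags).
async function ollamaReachable(baseUrl = "http://localhost:11434"): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, etc.
  }
}
```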

Issue: "Ollama model not found"

Solution: Pull the model first:

ollama pull llama2

Issue: Slow responses with Ollama

Solutions:

  • Use a smaller model (llama2 at 7B instead of llama2:13b)
  • Use GPU-accelerated Ollama
  • Increase timeout: AI_ROUTING_TIMEOUT_MS=30000
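The timeout itself can be enforced with a simple promise race. This is a sketch; `withTimeout` is a hypothetical helper, not part of the service:

```typescript
// Rejects if the wrapped promise does not settle within ms milliseconds,
// mirroring what AI_ROUTING_TIMEOUT_MS controls for provider calls.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    p,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`LLM call timed out after ${ms}ms`)), ms),
    ),
  ]);
}
```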

Cost Optimization

Development

  • Use Ollama (free, local)
  • No API costs
  • Privacy for sensitive data

Production (Low Volume)

  • Use Claude Haiku or GPT-3.5 (cheaper)
  • ~$0.01 per query

Production (High Volume)

  • Use deterministic routing (/api/v1/route) for clear queries
  • Use AI routing only for ambiguous queries
  • Consider caching LLM responses
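A minimal in-memory cache along those lines might look like this (sketch only; a production deployment would more likely use Redis with a TTL):

```typescript
// TTL-bounded in-memory cache keyed by query, for reusing LLM routing results.
class RoutingCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  get(key: string): V | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (Date.now() > hit.expiresAt) {
      this.store.delete(key); // drop expired entry
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```

Checking the cache before calling the LLM turns repeated identical queries into free lookups.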

Provider-Specific Notes

Anthropic

  • Excellent reasoning capabilities
  • Good for Chain of Thought
  • Fast response times
  • Requires API key

OpenAI

  • Widely available
  • Good performance
  • Multiple model options
  • Requires API key

Ollama

  • Free (runs locally)
  • Privacy (data stays local)
  • No rate limits
  • Requires local setup
  • Performance depends on hardware
  • Best for development/testing

Switching Providers

You can switch providers without code changes:

# Switch to OpenAI
export AI_ROUTING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
npm run dev

# Switch to Ollama
export AI_ROUTING_PROVIDER=ollama
npm run dev

# Switch back to Anthropic
export AI_ROUTING_PROVIDER=anthropic
npm run dev

The system automatically detects available providers and falls back according to the priority order above.

