LLM Providers

Configure any OpenAI-compatible LLM for AI investigations. Cloud or local - your choice.

Supported Providers

| Provider | Type | API Key Required | Notes |
|----------|------|------------------|-------|
| OpenAI | Cloud | Yes | GPT-4o, GPT-4o-mini |
| Azure OpenAI | Cloud | Yes | Enterprise deployments |
| AWS Bedrock | Cloud | Yes | Claude, Llama via AWS |
| Together AI | Cloud | Yes | Open-source models |
| Ollama | Local | No | Run locally, air-gapped |
| vLLM | Local | No | High-performance serving |
| LM Studio | Local | No | Desktop app |
| LocalAI | Local | No | Drop-in OpenAI replacement |
| llama.cpp | Local | No | Server mode |
| TGI (HuggingFace) | Local | No | Text Generation Inference |

Any provider with an OpenAI-compatible /v1/chat/completions endpoint will work.
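
A quick way to test compatibility is to send a minimal chat completions request directly. This uses the standard OpenAI request shape; substitute your own endpoint, key, and model:

# Sanity-check any OpenAI-compatible endpoint (example values)
curl -s "$LLM_ENDPOINT/chat/completions" \
  -H "Authorization: Bearer $LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "ping"}]}'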

Default: AI Relay (No Config)

If you don't set LLM_ENDPOINT, ReductrAI uses the managed AI Relay. This is included in your subscription - no API keys needed.

# Just set your license key - AI Relay is automatic
REDUCTRAI_LICENSE=RF-xxx-xxx

Cloud Providers

OpenAI

LLM_ENDPOINT=https://api.openai.com/v1
LLM_API_KEY=sk-xxx
LLM_MODEL=gpt-4o-mini
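
To confirm the key is valid before starting ReductrAI, you can list available models with the standard OpenAI API:

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $LLM_API_KEY"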

Azure OpenAI

LLM_ENDPOINT=https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT
LLM_API_KEY=xxx
LLM_MODEL=gpt-4
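
Note that Azure's native REST API differs slightly from the standard OpenAI shape: it authenticates with an api-key header rather than a Bearer token, and requires an api-version query parameter. A direct test looks like this (the api-version shown is an example; use one your resource supports):

curl "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2024-02-01" \
  -H "api-key: $LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "ping"}]}'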

Together AI

LLM_ENDPOINT=https://api.together.xyz/v1
LLM_API_KEY=xxx
LLM_MODEL=meta-llama/Llama-3-70b-chat-hf

Local Providers (Air-Gapped)

Ollama

Perfect for air-gapped deployments. Install Ollama, pull a model, and point ReductrAI to it.

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2

# Configure ReductrAI
LLM_ENDPOINT=http://localhost:11434/v1
LLM_MODEL=llama3.2
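
Ollama serves an OpenAI-compatible API under /v1, so you can sanity-check it with a direct request (the model name must match the one you pulled):

# Quick local check
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "ping"}]}'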

vLLM

High-performance model serving for production deployments.

# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3-8b-chat-hf \
  --port 8000

# Configure ReductrAI
LLM_ENDPOINT=http://localhost:8000/v1
LLM_MODEL=meta-llama/Llama-3-8b-chat-hf
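
If the server will be reachable beyond localhost, vLLM's OpenAI-compatible server can require a key via its --api-key flag; set the same value in ReductrAI's config:

# Optional: protect the endpoint with a key
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3-8b-chat-hf \
  --port 8000 \
  --api-key changeme

LLM_API_KEY=changeme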

LM Studio

Desktop app with a built-in server mode.

# Start LM Studio's local server (in the app)
# Default port: 1234

LLM_ENDPOINT=http://localhost:1234/v1
LLM_MODEL=local-model
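
LM Studio serves whichever model is loaded in the app; to find the exact identifier to use for LLM_MODEL, query the server's model list:

curl http://localhost:1234/v1/models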

LocalAI

# Run LocalAI container
docker run -p 8080:8080 localai/localai

LLM_ENDPOINT=http://localhost:8080/v1
LLM_MODEL=gpt-3.5-turbo
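
The plain image starts without models preloaded; LocalAI also publishes all-in-one images that bundle defaults (e.g. the latest-aio-cpu tag). Either way, you can list what the container exposes:

curl http://localhost:8080/v1/models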

Model Recommendations

For incident investigation, we recommend models with strong reasoning capabilities:

| Use Case | Recommended Models |
|----------|--------------------|
| Local / air-gapped (recommended) | Ollama (Llama 3.2, Mistral 7B, Qwen 2.5) |
| Self-hosted high-throughput | vLLM or LocalAI with any open-weight model |
| Managed cloud | GPT-4o, Claude 3.5, Llama via Bedrock |

Verifying Configuration

# Check that ReductrAI can reach your LLM
reductrai status

# Look for "LLM: connected" in output
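
If the status check fails, query the endpoint directly to isolate whether the problem is ReductrAI's config or the provider itself (drop the Authorization header for local providers that don't require a key):

# Manual check against the configured endpoint
curl -s "$LLM_ENDPOINT/models" \
  -H "Authorization: Bearer $LLM_API_KEY"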