LLM Providers

Configure any OpenAI-compatible LLM for AI investigations. Cloud or local - your choice.

Supported Providers

| Provider | Type | API Key Required | Notes |
|----------|------|------------------|-------|
| OpenAI | Cloud | Yes | GPT-4o, GPT-4o-mini |
| Azure OpenAI | Cloud | Yes | Enterprise deployments |
| AWS Bedrock | Cloud | Yes | Claude, Llama via AWS |
| Together AI | Cloud | Yes | Open-source models |
| Ollama | Local | No | Run locally, air-gapped |
| vLLM | Local | No | High-performance serving |
| LM Studio | Local | No | Desktop app |
| LocalAI | Local | No | Drop-in OpenAI replacement |
| llama.cpp | Local | No | Server mode |
| TGI (HuggingFace) | Local | No | Text Generation Inference |

Any provider with an OpenAI-compatible /v1/chat/completions endpoint will work.
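
A quick way to test compatibility is to send a minimal chat completions request directly. This uses the standard OpenAI request shape; substitute your own endpoint, key, and model:

# Sanity-check any OpenAI-compatible endpoint (example values)
curl -s "$LLM_ENDPOINT/chat/completions" \
  -H "Authorization: Bearer $LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "ping"}]}'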

Default: AI Relay (No Config)

If you don't set LLM_ENDPOINT, ReductrAI uses the managed AI Relay. This is included in your subscription - no API keys needed.

# Just set your license key - AI Relay is automatic
REDUCTRAI_LICENSE=RF-xxx-xxx

Cloud Providers

OpenAI

LLM_ENDPOINT=https://api.openai.com/v1
LLM_API_KEY=sk-xxx
LLM_MODEL=gpt-4o-mini
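
To confirm the key is valid before starting ReductrAI, you can list available models with the standard OpenAI API:

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $LLM_API_KEY"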

Azure OpenAI

LLM_ENDPOINT=https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT
LLM_API_KEY=xxx
LLM_MODEL=gpt-4
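
Note that Azure's native REST API differs slightly from the standard OpenAI shape: it authenticates with an api-key header rather than a Bearer token, and requires an api-version query parameter. A direct test looks like this (the api-version shown is an example; use one your resource supports):

curl "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2024-02-01" \
  -H "api-key: $LLM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "ping"}]}'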

Together AI

LLM_ENDPOINT=https://api.together.xyz/v1
LLM_API_KEY=xxx
LLM_MODEL=meta-llama/Llama-3-70b-chat-hf

Local Providers (Air-Gapped)

Ollama

Perfect for air-gapped deployments. Install Ollama, pull a model, and point ReductrAI to it.

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2

# Configure ReductrAI
LLM_ENDPOINT=http://localhost:11434/v1
LLM_MODEL=llama3.2
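
Ollama serves an OpenAI-compatible API under /v1, so you can sanity-check it with a direct request (the model name must match the one you pulled):

# Quick local check
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "ping"}]}'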

vLLM

High-performance model serving for production deployments.

# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3-8b-chat-hf \
  --port 8000

# Configure ReductrAI
LLM_ENDPOINT=http://localhost:8000/v1
LLM_MODEL=meta-llama/Llama-3-8b-chat-hf
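
If the server will be reachable beyond localhost, vLLM's OpenAI-compatible server can require a key via its --api-key flag; set the same value in ReductrAI's config:

# Optional: protect the endpoint with a key
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3-8b-chat-hf \
  --port 8000 \
  --api-key changeme

LLM_API_KEY=changeme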

LM Studio

Desktop app with a built-in server mode.

# Start LM Studio's local server (in the app)
# Default port: 1234

LLM_ENDPOINT=http://localhost:1234/v1
LLM_MODEL=local-model
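
LM Studio serves whichever model is loaded in the app; to find the exact identifier to use for LLM_MODEL, query the server's model list:

curl http://localhost:1234/v1/models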

LocalAI

# Run LocalAI container
docker run -p 8080:8080 localai/localai

LLM_ENDPOINT=http://localhost:8080/v1
LLM_MODEL=gpt-3.5-turbo
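
The plain image starts without models preloaded; LocalAI also publishes all-in-one images that bundle defaults (e.g. the latest-aio-cpu tag). Either way, you can list what the container exposes:

curl http://localhost:8080/v1/models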

Model Recommendations

For incident investigation, we recommend models with strong reasoning capabilities:

| Use Case | Recommended Models |
|----------|--------------------|
| Local / air-gapped (recommended) | Ollama (Llama 3.2, Mistral 7B, Qwen 2.5) |
| Self-hosted high-throughput | vLLM or LocalAI with any open-weight model |
| Managed cloud | GPT-4o, Claude 3.5, Llama via Bedrock |

Verifying Configuration

# Check that ReductrAI can reach your LLM
reductrai status

# Look for "LLM: connected" in output
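
If the status check fails, query the endpoint directly to isolate whether the problem is ReductrAI's config or the provider itself (drop the Authorization header for local providers that don't require a key):

# Manual check against the configured endpoint
curl -s "$LLM_ENDPOINT/models" \
  -H "Authorization: Bearer $LLM_API_KEY"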