# LLM Providers

Configure any OpenAI-compatible LLM for AI investigations. Cloud or local - your choice.

## Supported Providers
| Provider | Type | API Key Required | Notes |
|---|---|---|---|
| OpenAI | Cloud | Yes | GPT-4o, GPT-4o-mini |
| Azure OpenAI | Cloud | Yes | Enterprise deployments |
| AWS Bedrock | Cloud | Yes | Claude, Llama via AWS |
| BitNet | Local | No | Zero-cost 1-bit LLM (recommended) |
| Ollama | Local | No | Run locally, air-gapped |
| vLLM | Local | No | High-performance serving |
| LM Studio | Local | No | Desktop app |
| LocalAI | Local | No | Drop-in OpenAI replacement |
| llama.cpp | Local | No | Server mode |
| TGI (HuggingFace) | Local | No | Text Generation Inference |
Any provider with an OpenAI-compatible `/v1/chat/completions` endpoint will work.
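As a quick compatibility check, the sketch below builds the minimal request body every OpenAI-compatible chat endpoint accepts (model, endpoint, and key are placeholders - substitute your own); the commented `curl` shows how it would be sent:

```bash
# Placeholder - substitute your configured model.
LLM_MODEL=llama3.2

# Minimal chat-completions body: a model ID and a list of messages.
BODY=$(printf '{"model": "%s", "messages": [{"role": "user", "content": "ping"}]}' "$LLM_MODEL")
echo "$BODY"

# POST it to the server to confirm compatibility, e.g.:
# curl -s "$LLM_ENDPOINT/chat/completions" \
#   -H "Content-Type: application/json" \
#   -H "Authorization: Bearer $LLM_API_KEY" \
#   -d "$BODY"
```

A `200` response with a `choices` array indicates the server speaks the expected dialect.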
## Default: Local LLM (Ollama)

ReductrAI uses a local LLM by default (Ollama at `localhost:11434`). No external AI services are required, so your data stays fully private.
```bash
# Start Ollama and pull a model
ollama serve && ollama pull llama3.2

# Or use BitNet for zero-cost inference
make build-bitnet && ./dist/bitnet-server
```
## Cloud Providers

### OpenAI
```bash
LLM_ENDPOINT=https://api.openai.com/v1
LLM_API_KEY=sk-xxx
LLM_MODEL=gpt-4o-mini
```
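Rather than hard-coding `sk-xxx` in config files, the key can be injected at deploy time. A minimal sketch (the secret file path and fallback value are illustrative only):

```bash
# Load the API key from a mounted secret if present, otherwise fall back to a
# placeholder (path and fallback are examples, not ReductrAI conventions).
export LLM_API_KEY="$(cat /run/secrets/openai_key 2>/dev/null || echo sk-placeholder)"
echo "LLM_API_KEY loaded (${#LLM_API_KEY} chars)"
```

This keeps real keys out of version control while leaving the documented `LLM_API_KEY` variable in place.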
### Azure OpenAI
```bash
LLM_ENDPOINT=https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT
LLM_API_KEY=xxx
LLM_MODEL=gpt-4
```
## Local Providers (Recommended)

### Ollama
Perfect for air-gapped deployments. Install Ollama, pull a model, and point ReductrAI to it.
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama3.2

# Configure ReductrAI
LLM_ENDPOINT=http://localhost:11434/v1
LLM_MODEL=llama3.2
```
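The `LLM_*` lines above are environment variables; one way to make them visible to the ReductrAI process (assuming it reads its configuration from the environment, as the examples here suggest) is to export them in the shell that launches it:

```bash
# Export so child processes (e.g. the reductrai binary) inherit the settings.
export LLM_ENDPOINT=http://localhost:11434/v1
export LLM_MODEL=llama3.2
echo "Using $LLM_MODEL at $LLM_ENDPOINT"
```

Placing the same lines in a `.env` file or your process manager's environment config works equally well.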
### vLLM
High-performance model serving for production deployments.
```bash
# Start vLLM server
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Llama-3-8b-chat-hf \
  --port 8000

# Configure ReductrAI
LLM_ENDPOINT=http://localhost:8000/v1
LLM_MODEL=meta-llama/Llama-3-8b-chat-hf
```
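Long HuggingFace model IDs can be aliased on the vLLM side; `--served-model-name` is a vLLM server flag (verify it against your installed vLLM version - it is shown here as a commented sketch so only the ReductrAI config executes):

```bash
# vLLM can serve the model under a shorter alias, e.g.:
#
#   python -m vllm.entrypoints.openai.api_server \
#     --model meta-llama/Llama-3-8b-chat-hf \
#     --served-model-name llama3-8b \
#     --port 8000
#
# ReductrAI then refers to the alias instead of the full HF path:
LLM_ENDPOINT=http://localhost:8000/v1
LLM_MODEL=llama3-8b
echo "configured model: $LLM_MODEL"
```

This keeps the ReductrAI config stable even if you swap the underlying checkpoint later.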
### LM Studio
Desktop app with a built-in server mode.
```bash
# Start LM Studio's local server (in the app)
# Default port: 1234
LLM_ENDPOINT=http://localhost:1234/v1
LLM_MODEL=local-model
```
### LocalAI
```bash
# Run LocalAI container
docker run -p 8080:8080 localai/localai

LLM_ENDPOINT=http://localhost:8080/v1
LLM_MODEL=gpt-3.5-turbo
```
## Model Recommendations
For incident investigation, we recommend models with strong reasoning capabilities:
| Use Case | Recommended Models |
|---|---|
| Zero-cost local (recommended) | BitNet b1.58 |
| Air-gapped / high-security | BitNet b1.58 |
| Local with more options | Ollama (Llama 3.2, Mistral 7B) |
| Cloud (optional) | GPT-4o, Claude, Llama via Bedrock |
## Verifying Configuration
```bash
# Check that ReductrAI can reach your LLM
reductrai status

# Look for "LLM: connected" in output
```
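If `reductrai status` reports a connection problem, probing the endpoint directly helps separate an unreachable LLM server from a ReductrAI configuration issue. A sketch (the endpoint is a placeholder; `/models` is the standard OpenAI-compatible model-listing route, which most of the servers above expose):

```bash
# Probe the OpenAI-compatible /models route; -f makes curl fail on HTTP errors.
LLM_ENDPOINT=http://localhost:11434/v1
STATUS=$(curl -sf "$LLM_ENDPOINT/models" > /dev/null && echo reachable || echo unreachable)
echo "LLM endpoint is $STATUS"
```

If the endpoint is reachable but ReductrAI still shows disconnected, re-check `LLM_MODEL` and `LLM_API_KEY`.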