Storage & Compression
Intelligent tiered storage with 91-95% compression. Your telemetry stays local in DuckDB. Only ~2KB summaries leave your infrastructure for AI analysis.
Zero-Knowledge Architecture: Raw telemetry never leaves your infrastructure. The AI only sees statistical summaries, not your actual logs, traces, or metrics.
How It Works
ReductrAI stores all telemetry locally in DuckDB, an embedded analytical database. Data automatically flows through three storage tiers based on age, with intelligent compression at each stage.
Raw data -> HOT (< 1 hour) -> WARM (1h - 7 days) -> COLD (> 7 days)
|
v
~2KB Summary -> Cloud AI -> Investigation Results
Storage Tiers
HOT Tier (< 1 hour)
Raw, uncompressed data for real-time anomaly detection. Fastest query performance for active incident investigation. Data is kept in native DuckDB format for sub-second queries.
WARM Tier (1 hour - 7 days)
Compressed storage with 91-95% reduction. The V2 Compression Engine applies dictionary encoding, delta compression, and columnar transformation. Data remains queryable, with slightly higher latency than the HOT tier.
COLD Tier (> 7 days)
Maximum compression for long-term retention. Data is preserved for compliance and historical analysis. Query latency is higher but storage cost is minimal. Retention period depends on your license tier.
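The age boundaries above can be expressed as a small classifier. This is a minimal sketch: the boundaries come from the tier descriptions, while the function and constant names are hypothetical and not part of the agent's API.

```python
from datetime import timedelta

# Boundaries taken from the tier descriptions above.
# The names below are illustrative, not the agent's actual API.
HOT_MAX_AGE = timedelta(hours=1)
WARM_MAX_AGE = timedelta(days=7)


def tier_for_age(age: timedelta) -> str:
    """Return the storage tier for a record of the given age."""
    if age < HOT_MAX_AGE:
        return "HOT"    # younger than 1 hour
    if age <= WARM_MAX_AGE:
        return "WARM"   # 1 hour up to and including 7 days
    return "COLD"       # older than 7 days
```

For example, `tier_for_age(timedelta(hours=3))` falls in the WARM range.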
Retention by License
Data retention varies by license tier. After the retention period, data is automatically purged from local storage.
| License Tier | Retention Period | Storage Limit |
|---|---|---|
| FREE | 30 days | 10 GB |
| PRO | 90 days | 50 GB |
| BUSINESS | 180 days | 200 GB |
| ENTERPRISE | 365 days | Unlimited |
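To make the purge rule concrete, here is a minimal sketch of a retention check. The retention periods come from the table above. The function name, the use of UTC timestamps, and the choice to treat data as expired only once it is strictly older than the retention period are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Retention periods in days, copied from the table above.
RETENTION_DAYS = {"FREE": 30, "PRO": 90, "BUSINESS": 180, "ENTERPRISE": 365}


def is_expired(recorded_at, license_tier, now=None):
    """Return True if a record is older than the tier's retention period.

    Hypothetical helper: the agent's real purge logic may differ.
    """
    now = now or datetime.now(timezone.utc)
    return now - recorded_at > timedelta(days=RETENTION_DAYS[license_tier])
```

A record written 31 days ago would be purged on FREE but kept on PRO under this rule.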
V2 Compression Engine
The open-source compression engine achieves 91-95% storage reduction through multiple techniques. Because the engine is open source, you can audit every line of code and verify for yourself that your data stays local.
Compression by Data Type
| Data Type | Technique | Compression |
|---|---|---|
| Spans / Traces | SpanPatternCompressor (delta encoding, dictionary) | 94-95% |
| Logs | ContextualDictionaryCompressor (template extraction) | 91-92% |
| Metrics | TimeSeriesAggregator (series grouping, delta timestamps) | 91% |
| Events / JSON | SemanticCompressor (columnar transform) | 91-93% |
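As a rough planning aid, the percentages in the table translate into an expected on-disk size range. The sketch below uses the table's figures. The dictionary keys and function name are hypothetical, and real results will depend on your data.

```python
# Compression ranges from the table above, as (low, high) percent reduction.
# Keys are illustrative labels, not identifiers used by the agent.
COMPRESSION_PCT = {
    "spans": (94, 95),
    "logs": (91, 92),
    "metrics": (91, 91),
    "events": (91, 93),
}


def compressed_size_range(raw_bytes, data_type):
    """Return (best_case, worst_case) compressed size in bytes."""
    low, high = COMPRESSION_PCT[data_type]
    # A higher reduction leaves less data, so the best case uses `high`.
    best = raw_bytes * (100 - high) // 100
    worst = raw_bytes * (100 - low) // 100
    return best, worst
```

For 10 GB of raw logs (10,000,000,000 bytes), this gives a range of 800,000,000 to 900,000,000 bytes.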
How Compression Works
- Dictionary Encoding - Repeated strings (service names, error messages) replaced with integer indices
- Delta Encoding - Timestamps stored as deltas from a base time, reducing bytes per value
- Columnar Transformation - Row-based data converted to column format for better compression ratios
- Template Extraction - Log patterns extracted and stored once; each log line keeps only a reference to its template plus its variable values
- Gzip Final Pass - Standard compression applied to the transformed data
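The techniques listed above can be illustrated with a toy pipeline. This is a simplified sketch of the general ideas, not the V2 Compression Engine's actual code: the function names, the record fields (`ts`, `service`, `message`), the JSON intermediate format, and the rule that only digit runs count as log variables are all assumptions made for the example.

```python
import gzip
import json
import re


def dictionary_encode(values):
    """Replace repeated strings with integer indices into a lookup table."""
    table, index_of, indices = [], {}, []
    for value in values:
        if value not in index_of:
            index_of[value] = len(table)
            table.append(value)
        indices.append(index_of[value])
    return table, indices


def delta_encode(timestamps):
    """Store the first timestamp as the base and every value as an offset from it."""
    base = timestamps[0]
    return base, [t - base for t in timestamps]


def to_columns(rows):
    """Turn a list of row dicts into one list per field."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def extract_template(message):
    """Split a log line into a template and its variables (digit runs only)."""
    return re.sub(r"\d+", "<*>", message), re.findall(r"\d+", message)


def compress(rows):
    """Apply the transforms, then gzip the JSON-serialized result."""
    cols = to_columns(rows)
    services, service_ids = dictionary_encode(cols["service"])
    parsed = [extract_template(m) for m in cols["message"]]
    templates, template_ids = dictionary_encode([t for t, _ in parsed])
    ts_base, ts_deltas = delta_encode(cols["ts"])
    payload = {
        "services": services, "service_ids": service_ids,
        "templates": templates, "template_ids": template_ids,
        "variables": [v for _, v in parsed],
        "ts_base": ts_base, "ts_deltas": ts_deltas,
    }
    return gzip.compress(json.dumps(payload).encode("utf-8"))


rows = [
    {"ts": 1700000000, "service": "api", "message": "request 17 took 120 ms"},
    {"ts": 1700000002, "service": "api", "message": "request 18 took 95 ms"},
    {"ts": 1700000005, "service": "db", "message": "request 19 took 300 ms"},
]
blob = compress(rows)
decoded = json.loads(gzip.decompress(blob))
```

On a three-row input like this, the gzip output is not meaningfully smaller than the raw data. The benefit of these transforms only shows up on large, repetitive telemetry.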
Open Source: The compression engine is part of the open-source agent. Security teams can audit the code at github.com/reductrai/agent
What Goes to the Cloud?
Only statistical summaries (~2KB per service) are sent to the cloud for AI analysis. The AI sees these aggregates and nothing else.
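The agent's real summary schema is not reproduced here. As a rough illustration of how raw samples can be reduced to aggregates before anything leaves the machine, the sketch below builds a per-service summary from latency samples. Every field name and the function itself are hypothetical.

```python
import json
import statistics


def build_summary(service, latencies_ms, error_count):
    """Reduce raw samples to aggregate statistics only (hypothetical schema).

    Requires at least two latency samples, as statistics.quantiles does.
    """
    cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
    return {
        "service": service,
        "request_count": len(latencies_ms),
        "error_rate": error_count / len(latencies_ms),
        "latency_ms": {
            "mean": statistics.fmean(latencies_ms),
            "p50": cuts[49],
            "p95": cuts[94],
        },
    }


summary = build_summary("checkout", [float(x) for x in range(1, 1001)], 10)
payload = json.dumps(summary).encode("utf-8")
```

The serialized payload contains only counts, rates, and percentiles. None of the individual samples survive the reduction, and the result is far below 2KB.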
What the Cloud NEVER Sees
- Actual log messages or error text
- Request/response payloads
- User IDs, emails, tokens, or PII
- Database queries or results
- Headers, cookies, or authentication data
- Raw trace spans or metric samples
Verify It Yourself: The agent is open source. Run tcpdump or Wireshark to inspect exactly what leaves your network. We prove it, not just claim it.
Local Queries with DuckDB
All your telemetry is stored locally in DuckDB and can be queried directly through the query command that the ReductrAI agent provides.
Useful Queries
Storage Location
By default, ReductrAI stores data in ~/.reductrai/. You can customize this with the REDUCTRAI_DATA_DIR environment variable.
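If you need to locate the data directory from your own tooling, the lookup described above can be sketched as follows. The function name is hypothetical, and treating an empty variable as unset is an assumption about the agent's behavior.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None):
    """Return the ReductrAI data directory, honoring REDUCTRAI_DATA_DIR."""
    env = os.environ if env is None else env
    override = env.get("REDUCTRAI_DATA_DIR")
    if override:  # assumption: an empty value falls back to the default
        return Path(override).expanduser()
    return Path.home() / ".reductrai"
```

Passing a dictionary as `env` makes the function easy to test without touching the real environment.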
Storage Recommendation: Use SSD storage for the data directory. The HOT tier benefits significantly from fast I/O for real-time anomaly detection.