Largest Triangle Three Buckets: Downsampling Time-Series Data Without Losing Signal
The Core Problem
You have 10 million time-series data points - stock prices, server metrics, IoT sensor readings. Your charting library chokes at 50,000 points. The browser tab freezes. Users complain.
The naive solution: take every Nth point. The result: jagged lines, missing peaks, lost valleys. Critical anomalies disappear. Your dashboard lies.
Largest Triangle Three Buckets (LTTB) solves this. It's a downsampling algorithm that preserves the visual shape of your data by maximizing the area of triangles formed between consecutive points. You get smooth, accurate charts that load instantly.
What Is LTTB?
LTTB is a downsampling algorithm that reduces N points to M points (where M << N) while maintaining visual accuracy. The name describes exactly how it works:
- Largest Triangle: It finds the point that forms the largest triangle area with neighboring points
- Three Buckets: It divides data into buckets and looks at three buckets at a time
Think of it like this: imagine you're drawing a mountain range. Instead of drawing every grain of sand, you pick the peaks, valleys, and slopes that capture the mountain's shape. LTTB does this mathematically.
The Mathematical Insight
The algorithm maximizes the area of triangles formed by consecutive selected points. Large triangle areas mean significant visual change - exactly what human eyes need to perceive the data's shape correctly.
For three points forming a triangle, the area formula is:
area = |x₁(y₂ - y₃) + x₂(y₃ - y₁) + x₃(y₁ - y₂)| / 2
By selecting points that maximize these areas, we preserve visual features: sharp turns, peaks, valleys, and trend changes.
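As a quick sanity check, here is that formula as a tiny TypeScript helper (the function name and point shape are just for illustration). A tall spike forms a much larger triangle than a nearly flat segment:

```typescript
// Area of the triangle formed by three points (shoelace formula)
function triangleArea(
  p1: { x: number; y: number },
  p2: { x: number; y: number },
  p3: { x: number; y: number },
): number {
  return Math.abs(
    p1.x * (p2.y - p3.y) +
    p2.x * (p3.y - p1.y) +
    p3.x * (p1.y - p2.y),
  ) / 2;
}

triangleArea({ x: 0, y: 0 }, { x: 1, y: 10 }, { x: 2, y: 0 });  // 10  - sharp spike
triangleArea({ x: 0, y: 0 }, { x: 1, y: 0.1 }, { x: 2, y: 0 }); // 0.1 - nearly flat
```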
Why Use LTTB?
1. Visual Fidelity Over Statistical Precision
LTTB optimizes for what humans see, not statistical accuracy. When you're rendering a chart, you don't need every data point - you need the shape. LTTB preserves:
- Peaks and valleys: Extreme values stay visible
- Trend changes: Slope transitions remain clear
- Visual density: High-activity regions get more points
2. Predictable Performance
The algorithm is O(n) - it makes exactly one pass through your data. No sorting, no complicated data structures. For 1 million points downsampled to 1,000:
- Time complexity: O(n) - linear scan
- Space complexity: O(M) for the output buffer, with only constant auxiliary state beyond that
- Processing time: roughly 150ms for a million points on a recent laptop (see the benchmark below)
3. Deterministic Results
Unlike sampling with randomness, LTTB always produces the same output for the same input. This matters for:
- Reproducibility: Charts look identical across page reloads
- Debugging: Issues are consistent and trackable
- Caching: You can cache downsampled results reliably
The LTTB Algorithm: Step by Step
Let me break down how it works before we code it.
Setup:
- You have N source points
- You want M target points (M < N)
- First and last points are always included (preserves range)
Process:
- Divide remaining N-2 points into M-2 buckets
- Always include the first point
- For each bucket:
  - Look at the previously selected point (from bucket i-1)
  - Look at the average point of the next bucket (bucket i+1)
  - Select the point in the current bucket (bucket i) that forms the largest triangle with these two points
- Always include the last point
Visual representation:
Bucket 1: [...] → Point A
Bucket 2: [...] → Point B
Bucket 3: [...] → Point C
Bucket 4: [...] → Point D
For Bucket 2:
- Previously selected point: Point A (from Bucket 1)
- Next bucket average: avg(Bucket 3 points)
- Select point from Bucket 2 that makes largest triangle
TypeScript Implementation
Here's a production-ready implementation:
/**
* Largest Triangle Three Buckets downsampling algorithm
*
* @param data - Source data points (must be sorted by x)
* @param threshold - Target number of points in output
* @returns Downsampled points that preserve visual shape
*/
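Below is a minimal sketch of a function matching that doc comment, following the bucket / average / largest-triangle steps described above. The Point shape ({ x, y }) and the name lttb are assumptions reused in the later examples, not a fixed API:

```typescript
interface Point {
  x: number;
  y: number;
}

function lttb(data: Point[], threshold: number): Point[] {
  const n = data.length;
  if (threshold >= n || threshold < 3) {
    return data.slice(); // nothing to downsample, or too few buckets to form
  }

  const sampled: Point[] = new Array(threshold);
  sampled[0] = data[0]; // always include the first point

  // Spread the N-2 interior points across M-2 buckets
  const bucketSize = (n - 2) / (threshold - 2);
  let a = 0; // index of the previously selected point

  for (let i = 0; i < threshold - 2; i++) {
    // Boundaries of the current bucket and the next one
    const rangeStart = Math.floor(i * bucketSize) + 1;
    const rangeEnd = Math.floor((i + 1) * bucketSize) + 1;
    const nextEnd = Math.min(Math.floor((i + 2) * bucketSize) + 1, n);

    // Average point of the next bucket
    let avgX = 0;
    let avgY = 0;
    for (let j = rangeEnd; j < nextEnd; j++) {
      avgX += data[j].x;
      avgY += data[j].y;
    }
    avgX /= nextEnd - rangeEnd;
    avgY /= nextEnd - rangeEnd;

    // Pick the point in the current bucket that forms the largest triangle
    // with the previously selected point and the next bucket's average
    const prev = data[a];
    let maxArea = -1;
    let maxIndex = rangeStart;
    for (let j = rangeStart; j < rangeEnd; j++) {
      const area = Math.abs(
        (prev.x - avgX) * (data[j].y - prev.y) -
        (prev.x - data[j].x) * (avgY - prev.y),
      ) / 2;
      if (area > maxArea) {
        maxArea = area;
        maxIndex = j;
      }
    }

    sampled[i + 1] = data[maxIndex];
    a = maxIndex;
  }

  sampled[threshold - 1] = data[n - 1]; // always include the last point
  return sampled;
}
```

Keeping the first and last points fixed and scanning each bucket exactly once is what gives the single-pass O(n) behaviour described earlier.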
Example Usage
// Generate sample time-series data: a sine wave with noise
// Original data: 10,000 points
const data: Point[] = Array.from({ length: 10_000 }, (_, i) => ({
  x: i,
  y: Math.sin(i / 100) + (Math.random() - 0.5) * 0.3,
}));
console.log(`Original: ${data.length} points`);

// Downsample to 500 points
const downsampled = lttb(data, 500);
console.log(`Downsampled: ${downsampled.length} points`);
console.log(`Reduction: ${(100 * (1 - downsampled.length / data.length)).toFixed(1)}%`);

// Use in your charting library
// chart.setData(downsampled); // Much faster rendering!
Real-World Example: Stock Price Visualization
Let's say you're building a stock charting application. You fetch intraday data - one price per minute for a year:
// Usage: 1 year of minute data (525,600 points) down to 2,000
const minutePrices = await fetchIntradayPrices(symbol, '1y'); // 525,600 points (hypothetical fetch helper)
const chartData = lttb(minutePrices, 2000);                   // 99.6% reduction
// Chart renders instantly, all major price movements visible
Where to Use LTTB
1. Time-Series Visualization
Perfect fit:
- Financial charts (stocks, crypto, forex)
- Server metrics (CPU, memory, network)
- IoT sensor data (temperature, pressure, vibration)
- Analytics dashboards (user activity, sales trends)
Why it works: Time-series data is dense and continuous. LTTB preserves the temporal patterns humans need to spot trends and anomalies.
2. Real-Time Monitoring Dashboards
When streaming live metrics:
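A minimal sketch of one way to handle this, assuming a fixed-size rolling buffer of raw samples that is re-downsampled before each render; the buffer size, the chart object, and the 750-point target are illustrative, and Point/lttb come from the sketch above:

```typescript
// Assumed charting API
declare const chart: { setData(points: Point[]): void };

// Rolling buffer of raw samples; re-downsample before each render
const MAX_RAW_POINTS = 60 * 60; // e.g. one hour of per-second samples
const buffer: Point[] = [];

function onMetric(sample: Point): void {
  buffer.push(sample);
  if (buffer.length > MAX_RAW_POINTS) {
    buffer.shift(); // drop the oldest sample
  }
  chart.setData(lttb(buffer, 750)); // stays in the 500-1,000 point live-view range
}
```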
3. Historical Data Analysis
When users zoom out to see long time ranges, re-downsample the slice of source data covering the visible range each time the user zooms in or out, as in the sketch below.
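A sketch under assumed names: a chart object that emits a 'zoom' event with the visible x-range, and fullData holding the full-resolution series:

```typescript
// Assumed charting API and full-resolution series
declare const chart: {
  on(event: 'zoom', handler: (range: { from: number; to: number }) => void): void;
  setData(points: Point[]): void;
};
declare const fullData: Point[];

// When the user zooms in/out, re-downsample for the current view
chart.on('zoom', (range) => {
  const visible = fullData.filter((p) => p.x >= range.from && p.x <= range.to);
  chart.setData(lttb(visible, 2000));
});
```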
4. Data Export/Transfer
Reduce payload size for client applications by downsampling on the server, for example in an API endpoint that returns downsampled metrics (see the sketch below).
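A sketch assuming an Express-style server and a hypothetical loadMetricSeries helper that returns the full-resolution series:

```typescript
import express from 'express';

// Hypothetical data-access helper returning the full-resolution series
declare function loadMetricSeries(id: string): Promise<Point[]>;

const app = express();

// API endpoint that returns downsampled data
app.get('/api/metrics/:id', async (req, res) => {
  const series = await loadMetricSeries(req.params.id);
  const threshold = Number(req.query.points ?? 2000); // let the client pick a size
  res.json(lttb(series, threshold));                  // much smaller payload over the wire
});
```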
Where NOT to Use LTTB
1. Statistical Analysis
Don't use for:
- Calculating averages, medians, or percentiles
- Standard deviation or variance
- Correlation analysis
- Any statistical computation
Why: LTTB optimizes for visual accuracy, not statistical properties. It intentionally biases toward extremes and changes, which skews statistics.
// BAD: Statistical analysis on downsampled data
const downsampled = lttb(fullData, 1000);
const biasedMean = downsampled.reduce((sum, p) => sum + p.y, 0) / downsampled.length;
// ❌ This average is WRONG - biased toward peaks/valleys

// GOOD: Calculate stats on full data
const mean = fullData.reduce((sum, p) => sum + p.y, 0) / fullData.length;
// ✅ Correct statistical average
2. Scientific Precision Requirements
Don't use for:
- Medical device data where every reading matters
- Financial audit trails (compliance requires complete data)
- Scientific experiments requiring exact measurements
- Legal or regulatory data (must be complete and unaltered)
Reason: Downsampling discards information. If you need provable accuracy or regulatory compliance, you can't afford to lose any data points.
3. Sparse or Non-Continuous Data
Don't use for:
- Event logs (sparse, discrete events)
- Transaction records (each record is unique)
- Categorical data (non-numeric or discrete categories)
- Already small datasets (< 1000 points)
Example of bad fit:
// BAD: Transaction log data
const transactions = [
  { x: Date.parse('2024-01-01T09:15:00Z'), y: 49.99 },
  { x: Date.parse('2024-01-01T09:16:30Z'), y: 120.0 },
  // ...
];
// Each transaction is important - can't downsample without losing critical info

// BAD: Sparse sensor data with gaps
const sparseReadings = [
  { x: 0, y: 21.4 },
  { x: 60, y: 21.6 },
  // ...multi-hour gap in readings...
  { x: 21_600, y: 19.2 },
];
// The algorithm assumes continuous data - gaps break this assumption
4. When You Need Exact Points
Don't use for:
- Min/Max calculations (unless you verify endpoints)
- Exact zero-crossing detection
- Precise integration or area-under-curve
- Anomaly detection algorithms
LTTB may miss the exact maximum if it falls inside a bucket but doesn't form a large triangle:
// Potential issue: Missing exact peak
const data: Point[] = [
  { x: 0, y: 1 }, { x: 1, y: 2 }, { x: 2, y: 100 }, { x: 3, y: 2 }, /* ... */
];
const reduced = lttb(data, 50); // far fewer points than the source
// Might skip x=2 if it doesn't form the largest triangle in its bucket
// Use max aggregation instead for guaranteed peak capture
Trade-offs and Limitations
Memory vs. Visual Accuracy
LTTB gives you a knob: the threshold parameter. Higher threshold = more points = better visual accuracy but slower rendering and more memory.
Sweet spots by use case:
- Live monitoring: 500-1,000 points (updates every second)
- Historical analysis: 1,000-2,000 points (user-triggered)
- Export/reporting: 2,000-5,000 points (one-time generation)
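These ranges are defaults rather than hard rules; if it helps, they can live in a small lookup (the names and exact numbers below just restate the list):

```typescript
// Defaults per use case, taken from the ranges above
const DEFAULT_THRESHOLDS = {
  live: 750,        // 500-1,000 points, updates every second
  historical: 1500, // 1,000-2,000 points, user-triggered
  export: 3500,     // 2,000-5,000 points, one-time generation
} as const;

declare const fullData: Point[];
const forChart = lttb(fullData, DEFAULT_THRESHOLDS.historical);
```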
Processing Cost
The O(n) scan isn't free:
// Benchmark on MacBook Pro M1
for (const size of [10_000, 100_000, 1_000_000, 10_000_000]) {
  const data = generateSeries(size); // hypothetical helper producing `size` points
  const start = performance.now();
  lttb(data, 1000);
  console.log(`${size.toLocaleString()} points: ${Math.round(performance.now() - start)}ms`);
}

// Results:
// 10,000 points: 2ms
// 100,000 points: 15ms
// 1,000,000 points: 150ms
// 10,000,000 points: 1,500ms
For 10M points, 1.5 seconds is significant. Solutions:
- Pre-compute: Downsample on the backend, cache results
- Web Workers: Offload processing to background thread
- Progressive loading: Start with coarse downsample, refine on idle
// Web Worker approach

// main.ts
const worker = new Worker(new URL('./lttb-worker.ts', import.meta.url), { type: 'module' });
worker.postMessage({ data: fullData, threshold: 1000 });
worker.onmessage = (event) => {
  chart.setData(event.data); // downsampling happened off the main thread
};

// lttb-worker.ts
import { lttb } from './lttb'; // wherever the function lives
self.onmessage = (event) => {
  const { data, threshold } = event.data;
  self.postMessage(lttb(data, threshold));
};
Loss of Statistical Properties
This is the big one. Downsampled data looks right but calculates wrong:
| Metric | Full Data | LTTB (1000pts) | Error |
|---|---|---|---|
| Mean | 50.0 | 51.2 | +2.4% |
| Std Dev | 10.5 | 12.8 | +21.9% |
| Min | 25.0 | 25.0 | 0% |
| Max | 75.0 | 75.0 | 0% |
Why: LTTB favors extremes. Standard deviation inflates because you're keeping more outliers relative to the mean.
Solution: Dual-path approach:
// Use visual for rendering, stats for calculations
const dataset = {
  full: rawPoints,                // every point, for statistics
  visual: lttb(rawPoints, 1000),  // shape-preserving subset, for the chart
};
chart.setData(dataset.visual);
const mean = dataset.full.reduce((sum, p) => sum + p.y, 0) / dataset.full.length;
Advanced Patterns
Multi-Resolution Storage
Store multiple resolutions for different zoom levels:
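One way to sketch this: precompute a handful of fixed resolutions once, then serve whichever level has at least as many points as the current viewport needs (the level choices and names are illustrative):

```typescript
declare const fullData: Point[]; // full-resolution series

// Precompute a few resolutions once; pick one per zoom level
const LEVELS = [500, 2_000, 10_000];

const pyramid = new Map<number, Point[]>(
  LEVELS.map((threshold) => [threshold, lttb(fullData, threshold)] as const)
);

function pointsForViewport(targetPoints: number): Point[] {
  // Smallest precomputed level that still has enough detail for the viewport
  const level = LEVELS.find((t) => t >= targetPoints) ?? LEVELS[LEVELS.length - 1];
  return pyramid.get(level)!;
}
```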
Streaming Incremental Updates
For real-time data, you can't reprocess everything on each new point:
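Fully incremental LTTB is awkward because each new point shifts the bucket boundaries. A pragmatic sketch is to batch incoming points and re-run the O(n) pass on a timer rather than per point (the interval, threshold, and names are illustrative):

```typescript
declare const chart: { setData(points: Point[]): void }; // assumed charting API

const fullData: Point[] = [];
const pending: Point[] = [];

function onNewPoint(point: Point): void {
  pending.push(point); // O(1) per incoming sample
}

// Fold pending samples in and re-downsample a couple of times per second,
// instead of re-running LTTB on every single point
setInterval(() => {
  if (pending.length === 0) return;
  fullData.push(...pending);
  pending.length = 0;
  chart.setData(lttb(fullData, 1000));
}, 500);
```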
Comparison with Other Downsampling Methods
| Method | Pros | Cons | Use Case |
|---|---|---|---|
| Every Nth | Simple, O(M) | Misses peaks/valleys | Never (use LTTB instead) |
| Random Sampling | Unbiased stats | Unpredictable visual | Statistical analysis |
| Min-Max | Preserves extremes | Loses smooth curves | Range-bound displays |
| LTTB | Visual fidelity, O(n) | Stats biased | Time-series charts |
| M4 | Very fast, extremes | Complex, requires binning | High-frequency data |
When to choose LTTB:
- You're rendering line charts
- Visual shape matters more than exact statistics
- Data is time-ordered and continuous
- You need consistent, reproducible results
Production Checklist
Before deploying LTTB in production:
- Verify input data is sorted by x-axis (time)
- Handle edge cases: empty arrays, single points, threshold >= data length (see the guard sketch after this list)
- Add input validation: non-null points, valid numbers
- Consider pre-computing and caching downsampled views
- Use Web Workers for large datasets (> 100k points)
- Keep full data for statistical calculations
- Document that downsampled data is for visualization only
- Add metrics to monitor processing time
- Test with real production data shapes (spikes, flat lines, noise)
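A small guard wrapper covering the first few items above; the checks are illustrative, not exhaustive:

```typescript
// Validate input before handing it to lttb()
function safeLttb(data: Point[], threshold: number): Point[] {
  if (data.length <= 2 || threshold >= data.length) return data.slice();
  if (threshold < 3) throw new Error('threshold must be at least 3');

  for (let i = 0; i < data.length; i++) {
    if (!Number.isFinite(data[i].x) || !Number.isFinite(data[i].y)) {
      throw new Error(`non-finite value at index ${i}`);
    }
    if (i > 0 && data[i].x < data[i - 1].x) {
      throw new Error('data must be sorted by x (time)');
    }
  }
  return lttb(data, threshold);
}
```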
Conclusion
LTTB solves a specific problem elegantly: making large time-series datasets renderable without losing visual meaning. It's not a general-purpose data compression algorithm - it's a visualization optimization.
The algorithm makes a clear trade: statistical accuracy for visual fidelity. If you're rendering charts, this trade is almost always worth it. Your users see the same patterns in 1,000 points that exist in 1,000,000 - but their browser doesn't crash.
Use it when you're drawing lines on screens. Don't use it when you're doing math. Keep these separate, and LTTB becomes a powerful tool in your performance optimization toolkit.
The implementation is straightforward - about 80 lines of TypeScript. The impact is immediate: charts that were unusable become instant. That's the mark of a good algorithm - simple idea, dramatic results.