Anomaly detection
Anomaly detection is the identification of values in a data series that deviate substantially from expected behaviour. Methods range from simple statistical thresholds (rolling z-score) through time-series-specific techniques (STL decomposition plus a threshold on the residual) to machine learning (isolation forests, autoencoders). The simplest defensible method is the rolling z-score: compute a moving mean and standard deviation, then flag any value whose distance from the mean exceeds a threshold (typically 2.5 or 3 standard deviations).
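The rolling z-score can be sketched in a few lines of plain Python. This is a minimal illustration, not a library API: the function name, the trailing-window convention, and the returned tuples are all choices made here.

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(values, window=30, threshold=2.5):
    """Flag points whose distance from the trailing window's mean
    exceeds `threshold` standard deviations.

    The window excludes the current point, so a spike cannot inflate
    the statistics used to judge it."""
    anomalies = []
    for i in range(window, len(values)):
        w = values[i - window:i]           # trailing window, current point excluded
        mu, sigma = mean(w), stdev(w)
        if sigma == 0:
            continue                       # flat window: z-score is undefined
        z = (values[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((i, values[i], round(z, 2)))
    return anomalies
```

Excluding the current point from its own window is a deliberate choice: including it drags the mean toward the spike and shrinks the very z-score you are trying to measure.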
Statistical thresholds
A z-score threshold of 2.5 corresponds to a ~1.2% false-positive rate per point (assuming approximately normal noise); 3 corresponds to ~0.3%. Use 2.5 for sensitive monitoring and 3 for noisy data where you only want to catch flagrant outliers.
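Those percentages are the two-sided tail probabilities of a standard normal distribution, which the stdlib can confirm directly (the function name here is illustrative):

```python
import math

def two_sided_tail(k):
    """P(|Z| > k) for standard normal Z: the expected per-point
    false-positive rate when flagging |z-score| > k on
    approximately normal data."""
    return math.erfc(k / math.sqrt(2))
```

`two_sided_tail(2.5)` evaluates to roughly 0.0124 and `two_sided_tail(3.0)` to roughly 0.0027, i.e. about 1.2% and 0.3% of points flagged purely by chance. On real data with heavier-than-normal tails the true rates will be somewhat higher.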
Why a fixed threshold fails
An absolute threshold like "flag if value > 1000" stops working the moment your data scales up. Z-score is unitless — it's relative to the recent typical noise level. A 2.5σ spike is just as anomalous on £1k revenue as on £1m.
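The unitlessness is easy to demonstrate: the same relative spike produces the same z-score regardless of the series' absolute scale (the helper below is an illustrative sketch):

```python
def zscore(value, window):
    """Z-score of `value` against a window of recent observations."""
    mu = sum(window) / len(window)
    var = sum((x - mu) ** 2 for x in window) / (len(window) - 1)
    return (value - mu) / var ** 0.5

small = [1000 + d for d in (0, 5, -5, 3, -3, 2, -2, 4, -4, 0)]
large = [x * 1000 for x in small]   # identical series at 1000x the scale

# a 2% spike scores identically at either scale
z_small = zscore(1_020, small)
z_large = zscore(1_020_000, large)
```

Both calls return the same z-score, so one threshold works for the £1k series and the £1m series alike; an absolute rule like "flag if value > 1000" would flag every point of the larger one.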
Seasonality is the silent killer
A naive z-score on data with daily or weekly seasonality flags every Monday as anomalous, because Mondays are predictably different from the rest of the week. Two fixes: per-bucket detection (run the detector separately for each day-of-week bucket) or deseasonalisation (subtract a seasonal baseline before computing the z-score).
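The per-bucket fix can be sketched as follows — each point is scored only against the history of its own position in the seasonal cycle. The function name, the `i % period` bucketing, and the minimum-history cutoff are all assumptions of this sketch:

```python
from collections import defaultdict
from statistics import mean, stdev

def per_bucket_anomalies(values, period=7, threshold=2.5):
    """Z-score each point against prior points in the same seasonal
    bucket (e.g. same day of week for period=7), so a predictably
    high Saturday is never compared against weekdays."""
    buckets = defaultdict(list)
    anomalies = []
    for i, v in enumerate(values):
        hist = buckets[i % period]
        if len(hist) >= 4:                 # need a few cycles for a stable estimate
            mu, sigma = mean(hist), stdev(hist)
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                anomalies.append(i)
        hist.append(v)
    return anomalies
```

On a series where weekends run at triple the weekday level, this flags nothing until a weekday genuinely breaks from other weekdays, whereas a naive z-score would fire on every weekend transition.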
Window size selection
For daily data, around 30 days is a reasonable default. Much shorter windows (e.g. 7 days) give unstable mean and standard-deviation estimates, driving up the false-positive rate; much longer ones (e.g. 90 days) adapt too slowly, so after a legitimate level shift the stale baseline keeps firing and can mask genuinely recent anomalies.
Alternatives
STL (Seasonal-Trend decomposition using LOESS) for series with strong seasonality. Isolation Forest for multi-dimensional anomaly detection. Prophet's prediction intervals (flag points that fall outside the forecast's uncertainty band) for series with a known seasonal calendar. For monitoring single business metrics, the rolling z-score is hard to beat.
