How to do anomaly detection in Excel

If you've ever stared at a column of daily numbers wondering whether yesterday's spike is a problem or noise, this is the method. Rolling z-score with a sensitivity threshold — what almost every commercial "anomaly detection" feature does under the hood, in three Excel formulas.

Dr Waqas Rafique Dr Waqas Rafique · Founder & CTO · PhD, Statistical Machine Learning
· About

TL;DR

The method: compute a rolling mean and rolling standard deviation, then flag values whose z-score exceeds a threshold (typically 2.5 or 3).

Three Excel formulas: AVERAGE over a trailing window for the rolling mean, STDEV.S for the rolling standard deviation, (value - mean) / std for the z-score.

Seasonality matters. Always deseasonalise daily/weekly data first, otherwise you'll flag every Monday morning as anomalous.

1.Lay out the time series

Column A: dates. Column B: values (any metric). Sort ascending by date. At least 60 data points for a 30-day rolling window.

2.Compute the rolling mean

Column C, starting at row 31: =AVERAGE(B2:B31). Drag down. Window size of 30 is a sensible default for daily data; use 7 for sensitive short-term anomalies, 90 for slow drift.

3.Compute the rolling standard deviation

Column D: =STDEV.S(B2:B31). Same window as the mean.

4.Compute the z-score

Column E: =(B31 - C31) / D31. This is the number of standard deviations from the rolling mean.

5.Flag anomalies via threshold

Column F: =IF(ABS(E31) > 2.5, "FLAG", ""). Threshold 2.5 flags about 1 in 80 normal points (~1% false-positive rate). Threshold 3 is stricter (~0.3%).

6.Visualise on a chart

Insert a line chart on column B. Add column C as a second line for the rolling mean. Highlight flagged rows manually or with conditional formatting.

Dealing with seasonality first

If your data has weekly or yearly seasonality (which most real data does), the naive rolling z-score will flag the predictable Monday spike as an anomaly. Two fixes:

Option 1 — per-bucket detection. Run the algorithm separately on Mondays-only, Tuesdays-only, etc. This isolates day-of-week from anomaly.

Option 2 — deseasonalise then detect. Subtract a seasonal baseline (e.g. average value for that day-of-week across the past 12 weeks) from every observation before computing the z-score. Anomalies in the residual are the real anomalies.

Why z-score and not a fixed threshold

A fixed threshold ("flag if value > 100") fails the moment your data scales up. Z-score is unitless — it's relative to the recent typical noise level. A 2.5σ event is just as anomalous on £1k revenue as it is on £1m.

Common mistakes

1. Using STDEV.P (population) instead of STDEV.S (sample). For rolling windows of size <100, STDEV.S gives the right answer. STDEV.P slightly underestimates the standard deviation.

2. Window too short. 7-day windows are reactive but produce many false positives because the std is itself noisy. 30 days is usually the sweet spot.

3. Window too long. 90-day windows miss recent anomalies because the rolling mean takes too long to adjust.

4. Including the current point in its own window. The cell mustn't reference itself or it inflates the std and reduces detection sensitivity. Use trailing window, not centred.

Skip the formulas

The cohort retention tool, forecasting calculator, and anomaly detector all run client-side, no signup. For everything end-to-end, DataHub Pro — £19/user/month, 14-day free trial.

Compare with all 12 tools →

Related

Frequently asked questions

How do I find outliers in an Excel column?
Compute a rolling mean and standard deviation, then z-score. Flag any z-score above 2.5 or 3. Three formulas — see the tutorial.
What's a good z-score threshold?
2.5 for sensitive monitoring (~1% false positive rate). 3 for stricter monitoring (~0.3%). Don't go above 3.5 — you'll miss real anomalies.
Why do my Mondays look like anomalies?
Because they probably are *predictably* different. The naive z-score doesn't know about seasonality. Either deseasonalise before applying the detector, or run the detection separately per day-of-week.