How to do cohort retention analysis in Excel

Cohort retention is the cleanest way to see whether your product is getting better or worse at keeping customers. The numbers your CRM shows you are aggregates and they hide the truth. Here's how to build the cohort table from a single transactions sheet, in pure Excel.

Dr Waqas Rafique Dr Waqas Rafique · Founder & CTO · PhD, Statistical Machine Learning
· About

TL;DR

Cohort retention groups customers by acquisition month and tracks them across subsequent months. Output: a triangular table where rows are cohorts and columns are months-since-acquisition.

Three steps in Excel: (1) tag every order with its customer's first-month cohort, (2) tag every order with its offset from the cohort start, (3) PivotTable with cohort on rows, offset on columns, count of unique customers as values.

Don't have time? The free in-browser cohort tool does steps 1-3 in 10 seconds.

1.Lay out the transactions sheet

Three columns: customer_id (A), order_date (B), and optionally order_value (C). One row per order. Sheet should have at least 6 months of data and 50+ customers for a meaningful cohort table.

2.Add a 'cohort' helper column

Column D: =EOMONTH(MINIFS(B:B,A:A,A2),0)+1. This finds the customer's first order date and rounds to the start of that month. Drag down. Now every row is tagged with the customer's acquisition cohort.

3.Add a 'months since cohort' helper column

Column E: =DATEDIF(D2, EOMONTH(B2,0)+1, "M"). This is the order's offset (in months) from the customer's cohort start. M0 = same month, M1 = next month, etc.

4.PivotTable: cohort on rows, offset on columns, COUNTUNIQUE customers as values

Cohort (D) on Rows, Months-since-cohort (E) on Columns, customer_id (A) on Values with Distinct Count. Result: triangular cohort retention table.

5.Compute retention as a percentage

Divide each cell by the M0 value of its row. M0 is always 100% by definition. M1, M2, M3 are the retention curve.

6.Read the heatmap

Look across rows: are recent cohorts retaining better than older ones (product is improving) or worse (product is degrading)? Look down columns: at month 3 retention, is it stable cohort-over-cohort or trending?

Logo retention vs revenue retention

The PivotTable above counts customers (logo retention). For SaaS, you also want net revenue retention (NRR): replace COUNTUNIQUE with SUM of order_value, and divide by the cohort's M0 revenue. NRR > 100% means the cohort is expanding net of churn — the holy grail.

Common mistakes

1. Using calendar quarters not cohort quarters. A customer acquired in March is in the Q1 cohort if you bucket by calendar quarter, but their offset "month 1" is April. Mixing the two breaks the offset column.

2. Not sorting by date before computing first-month. If your sheet isn't sorted, MINIFS still works (it scans the whole column) but it's slow on large sheets.

3. Comparing cohorts of different sizes without context. A 5-customer cohort with 4 retained looks like 80% retention; a 500-customer cohort with 400 retained also looks like 80%. The first is noise; the second is signal. Show absolute counts alongside percentages.

4. Ignoring seasonality. An e-commerce cohort acquired in November will look like it's retaining badly in February. It's the seasonal trough, not your product getting worse. Compare same-offset across cohorts that experienced the same calendar months.

Skip the formulas

The cohort retention tool, forecasting calculator, and anomaly detector all run client-side, no signup. For everything end-to-end, DataHub Pro — £19/user/month, 14-day free trial.

Compare with all 12 tools →

Related

Frequently asked questions

Do I need SQL for cohort analysis?
No — Excel is enough for up to ~10,000 customers. Use a PivotTable with cohort on rows and offset on columns. Beyond that, scripting is faster.
Can I do cohort analysis without a date column?
No — you need at least one date per row to define cohorts and offsets.
What's the difference between rolling and fixed cohorts?
Fixed cohorts use a calendar period (e.g. January 2026 cohort). Rolling cohorts use the customer's signup as time zero. For most retention analysis, rolling is what you want.