Cohort retention
Cohort retention is a method of measuring customer retention by grouping customers by their acquisition period (typically month) and tracking the proportion still active in each subsequent period. The output is a triangular table where rows are cohorts, columns are months-since-acquisition, and each cell is the percentage of the cohort still active.
Why aggregate retention metrics lie
A 95% monthly retention rate sounds great. But if your most recent cohort retains at 78% while cohorts from 18 months ago retain at 99%, the aggregate hides a deteriorating product — because the high-retention legacy cohorts dominate the denominator. Cohort analysis strips that effect away by forcing like-for-like comparisons: only customers at their 6th month versus other customers at their 6th month.
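The arithmetic behind that claim can be checked with a small sketch. The customer counts below are hypothetical, chosen only so that the two cohorts retain at 99% and 78% as in the example above; the point is how the blended figure lands near 95%.

```python
# Hypothetical customer counts, chosen to reproduce the percentages in the text.
cohorts = {
    "legacy (18 months old)": {"start": 8100, "retained": 8019},  # 99%
    "most recent": {"start": 1900, "retained": 1482},             # 78%
}


def retention(start, retained):
    """Share of customers active at the start of the month who are still active."""
    return retained / start


# Per-cohort view: each cohort is measured against its own starting size.
per_cohort = {
    name: retention(c["start"], c["retained"]) for name, c in cohorts.items()
}

# Aggregate view: one big numerator over one big denominator.
total_start = sum(c["start"] for c in cohorts.values())
total_retained = sum(c["retained"] for c in cohorts.values())
aggregate = retention(total_start, total_retained)

for name, rate in per_cohort.items():
    print(f"{name}: {rate:.1%}")
print(f"aggregate: {aggregate:.1%}")
```

Because the legacy cohort supplies 81% of the denominator in this made-up mix, the 78% cohort barely moves the blended number.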
This is especially dangerous during growth phases. A company adding 20% more customers per month can see its aggregate retention hold steady or even rise while product quality is falling, because large young cohorts (with no time yet to churn) swamp the calculation. A cohort table is one of the few views that exposes this early.
Logo retention vs revenue retention
Logo retention (also called customer retention rate) counts customers or accounts: it answers "what fraction of January's customers are still paying in April?" Net revenue retention (NRR) substitutes revenue for headcount and captures expansion: a cohort that had £10,000 MRR in month 0 and has £11,500 MRR in month 12 has 115% NRR even if two accounts churned, because the surviving accounts expanded. NRR > 100% is the gold standard for SaaS — it means the cohort grows itself without any new sales.
Gross revenue retention (GRR) is NRR with expansion excluded: it counts only the revenue the original base lost to churn and downgrades. It can never exceed 100% and answers one question: what fraction of the original revenue base survived? Investors use GRR to understand the floor of the business.
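As a rough sketch of how the three measures differ, the function below computes logo retention, GRR, and NRR for one cohort. The account names and per-account MRR figures are hypothetical; they are picked so the cohort matches the £10,000 to £11,500 example above, with two churned accounts.

```python
def revenue_retention(start_mrr, current_mrr):
    """Return (logo, gross, net) retention for a single cohort.

    start_mrr and current_mrr map account id -> MRR at month 0 and at the
    measurement month. Accounts missing from current_mrr count as churned.
    """
    base = sum(start_mrr.values())
    surviving = [a for a in start_mrr if current_mrr.get(a, 0) > 0]
    logo = len(surviving) / len(start_mrr)
    # GRR caps each account at its starting MRR, so expansion is ignored.
    gross = sum(min(start_mrr[a], current_mrr.get(a, 0)) for a in start_mrr) / base
    # NRR counts everything the original accounts pay now, expansion included.
    net = sum(current_mrr.get(a, 0) for a in start_mrr) / base
    return logo, gross, net


# Hypothetical cohort: ten accounts at £1,000 each = £10,000 MRR in month 0.
start = {f"acct{i}": 1000 for i in range(10)}
# Month 12: seven accounts expanded to £1,500, one stayed flat, two churned.
now = {f"acct{i}": 1500 for i in range(7)}
now["acct7"] = 1000

logo, gross, net = revenue_retention(start, now)
print(f"logo {logo:.0%}, GRR {gross:.0%}, NRR {net:.0%}")
```

In this made-up cohort the same data yields 80% logo retention, 80% GRR, and 115% NRR, which is why the three numbers are worth reporting side by side.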
How to build a cohort table in Excel
The simplest approach uses a pivot table on a transaction log. You need two columns: customer_id and transaction_date. Steps:
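- Create a column cohort_month = the first transaction_date for each customer_id, rolled back to the first day of that month (use MINIFS, available in Excel 2019 and Microsoft 365).
- Create month_offset = DATEDIF(cohort_month, transaction_date, "M"). DATEDIF counts only completed months, which is why cohort_month should be a first-of-month date.
- Pivot: rows = cohort_month, columns = month_offset, values = COUNT DISTINCT customer_id (in Excel, Distinct Count is offered only when the pivot table is built on the Data Model).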
- Divide each cell by the month-0 count for that row to get a percentage.
- Apply a conditional colour scale: green at 100%, red at 0%. The shape of the gradient across each row is your retention curve.
For revenue retention, substitute SUM(revenue) for COUNT DISTINCT customer_id in step 3.
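For readers who prefer a script to a spreadsheet, the same steps can be sketched in plain Python. This is a minimal illustration, not the method any particular tool uses: the function names and the sample log are made up, and offsets are counted in calendar months, so a first purchase on 31 January followed by one on 1 February lands at offset 1.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Number calendar months consecutively so offsets are a plain subtraction."""
    return d.year * 12 + d.month


def cohort_table(transactions):
    """Build {cohort "YYYY-MM": {month_offset: share still active}}.

    transactions is an iterable of (customer_id, transaction_date) pairs.
    """
    transactions = list(transactions)
    # Step 1: cohort month = month of each customer's first transaction.
    first_month = {}
    for customer, day in transactions:
        m = month_index(day)
        first_month[customer] = min(m, first_month.get(customer, m))
    # Steps 2-3: distinct customers per (cohort, month offset).
    active = defaultdict(set)
    for customer, day in transactions:
        cohort = first_month[customer]
        active[(cohort, month_index(day) - cohort)].add(customer)
    # Step 4: divide by the month-0 count of the same cohort.
    table = {}
    for (cohort, offset), customers in sorted(active.items()):
        size = len(active[(cohort, 0)])
        label = f"{(cohort - 1) // 12}-{(cohort - 1) % 12 + 1:02d}"
        table.setdefault(label, {})[offset] = len(customers) / size
    return table


# Hypothetical transaction log.
log = [
    ("a", date(2024, 1, 5)), ("a", date(2024, 2, 9)), ("a", date(2024, 3, 1)),
    ("b", date(2024, 1, 31)), ("b", date(2024, 2, 1)),
    ("c", date(2024, 1, 20)),
    ("d", date(2024, 2, 14)), ("d", date(2024, 3, 2)),
]
table = cohort_table(log)
for cohort, row in table.items():
    print(cohort, {offset: f"{share:.0%}" for offset, share in row.items()})
```

The colour scale from the last step is left out; the nested dictionary can be written to CSV and formatted in a spreadsheet.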
DataHub Pro's free cohort retention tool automates all five steps: drop in a CSV transaction log and it produces the coloured heatmap instantly, with both logo and revenue views.
Retention benchmarks by industry
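These are approximate median ranges for B2B SaaS, e-commerce, and consumer subscription businesses. M1, M3, M6, and M12 are the share of a cohort still active 1, 3, 6, and 12 months after acquisition. Variance within each category is wide, so use the figures as orientation, not targets.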
| Segment | M1 | M3 | M6 | M12 |
|---|---|---|---|---|
| B2B SaaS (SMB) | 88–94% | 82–90% | 78–87% | 70–82% |
| B2B SaaS (Enterprise) | 95–99% | 93–98% | 91–97% | 88–95% |
| E-commerce (repeat purchase) | 25–40% | 15–28% | 12–22% | 8–18% |
| Consumer subscription (media/fitness) | 55–75% | 40–60% | 30–50% | 22–40% |
How to read a cohort heatmap
Look across each row (horizontally). Each row is one cohort's retention curve: how steep is the early drop, and where does it level off? Then compare rows at the same offset. Are your most recent cohorts (bottom rows) retaining better than older ones? If yes, your product is improving. If no, something recently changed for the worse: a pricing change, a competitor launch, a product regression.
Look down each column (vertically). At month 3 (for example), is retention stable across all cohorts, or declining from older cohorts to newer ones? A downward trend in a column means that life stage is becoming harder to retain with each successive cohort.
Find the asymptote. Most SaaS cohorts flatten after month 6–9, when a stable core of highly engaged customers all but stops churning. The height of this floor predicts your long-run retention and is the most important single number in the whole table.
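One rough way to locate that floor programmatically is to find the offset after which the month-to-month drop stays below a small tolerance, then average the remaining points. This is a heuristic sketch only: the function name, the 1-point tolerance, the three-point minimum, and the sample curve are all assumptions, not an established method.

```python
def retention_floor(curve, tolerance=0.01, min_points=3):
    """Estimate where a retention curve flattens.

    curve lists retention shares by month offset, starting at offset 0.
    Returns (offset, level), or None if the curve never settles.
    """
    # drops[i] is the loss between offset i and offset i + 1.
    drops = [curve[i] - curve[i + 1] for i in range(len(curve) - 1)]
    for start in range(len(drops)):
        if len(curve) - start < min_points:
            break  # too few points left to call it a plateau
        if all(d <= tolerance for d in drops[start:]):
            tail = curve[start:]
            return start, sum(tail) / len(tail)
    return None


# Hypothetical cohort curve for offsets 0..7.
curve = [1.00, 0.85, 0.78, 0.74, 0.72, 0.715, 0.71, 0.71]
floor = retention_floor(curve)
print(floor)
```

Young cohorts with only a few observed months will return None, which is the honest answer: their floor is not visible yet.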
How to improve cohort retention
The standard interventions, roughly in order of ROI:
- Improve onboarding to M1. The sharpest drop is almost always M0→M1. A structured activation sequence (in-product checklist, triggered emails, a single "aha moment" within 7 days) can move M1 retention by 10–20 percentage points.
- Identify and replicate "sticky" behaviour. Cohorts with high retention almost always share a specific behaviour in their first 30 days. Find it with feature-usage cohorts, then build the product and onboarding around triggering that behaviour.
- Add expansion vectors. Seat growth and feature upsells can drive NRR > 100% even if logo retention is flat. This doesn't fix churn, but it changes the economic equation completely.
- Close the save loop. A cancellation flow with a pause option, discount, or plan downgrade saves a meaningful fraction of churning customers. The friction of cancellation is a legitimate retention lever.
Common mistakes
1. Calendar quarters instead of cohort quarters. "Q1 retention" should mean customers who joined in Q1, tracked forward from their join date, but it often gets conflated with all customers active in Q1. Always anchor the quarter to the customer's join date.
2. Comparing cohorts of very different sizes without showing absolute counts. A 95% retention rate on a 20-customer cohort is noise. Show the N alongside the percentage.
3. Ignoring seasonality. A November e-commerce cohort driven by Black Friday acquisition always looks bad in February — those customers were deal-hunters, not loyal customers. Seasonality-adjusted cohort analysis separates acquisition quality from product quality.
4. Not separating new-business from expansion. If you're measuring revenue retention and high-value expansions are masking churn in the SMB tail, the NRR number is misleading. Segment by initial contract size or customer type.
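To make mistake 2 concrete, a confidence interval shows how little a small cohort pins down. The sketch below uses the Wilson score interval, one common choice for proportions; the choice of interval and the 95% level (z = 1.96) are assumptions made here, not something the benchmarks above prescribe.

```python
from math import sqrt


def wilson_interval(retained, cohort_size, z=1.96):
    """Wilson score interval for a retention rate (z = 1.96 is roughly 95%)."""
    p = retained / cohort_size
    denom = 1 + z**2 / cohort_size
    centre = (p + z**2 / (2 * cohort_size)) / denom
    half = z * sqrt(
        p * (1 - p) / cohort_size + z**2 / (4 * cohort_size**2)
    ) / denom
    return centre - half, centre + half


# Both cohorts retain 95%, but the cohort sizes differ by a factor of 100.
low_small, high_small = wilson_interval(19, 20)
low_big, high_big = wilson_interval(1900, 2000)
print(f"N=20:   {low_small:.1%} to {high_small:.1%}")
print(f"N=2000: {low_big:.1%} to {high_big:.1%}")
```

For the 20-customer cohort the interval spans roughly 76% to 99%, so a single extra cancellation would change the headline figure by five points.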
Alternatives to cohort retention analysis
Kaplan-Meier survival curves are the statistical cousin of cohort tables. They handle irregular time intervals and censored observations (customers still active when the data ends) more rigorously. They are the norm in academic research and in analyses modelled on clinical survival studies; cohort tables are more intuitive for business stakeholders.
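A minimal sketch of the Kaplan-Meier estimator, applied to churn, looks like this. The input format and the sample data are hypothetical; in practice a statistics library would also supply confidence bands.

```python
def kaplan_meier(durations, churned):
    """Kaplan-Meier survival estimate.

    durations: months each customer was observed.
    churned:   True if the customer churned at that duration, False if the
               customer was still active when observation ended (censored).
    Returns a list of (month, survival probability) at each churn time.
    """
    event_times = sorted({t for t, e in zip(durations, churned) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers observed for at least t months are still "at risk" at t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, churned) if e and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: eight customers, four of them censored (still active).
durations = [1, 2, 2, 3, 5, 5, 6, 6]
churned = [True, True, False, True, False, True, False, False]
km_curve = kaplan_meier(durations, churned)
for month, s in km_curve:
    print(f"month {month}: {s:.1%}")
```

Censored customers stay in the at-risk count until their observation ends and then drop out without counting as churn, which is the main thing a plain cohort table cannot do.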
Engagement-based cohorts group users by the first feature they used or the acquisition channel, rather than by join date. These answer product questions ("do users who complete the tutorial retain better?") rather than business-calendar questions.
