Regression Analysis in Excel — Linear, Multiple & How to Read the Output (2026)
Regression answers the question every analyst eventually asks: how does one thing drive another, and can I predict it? In 40 minutes you’ll run regression in Excel three ways — a scatter trendline, live functions (SLOPE, INTERCEPT, RSQ, FORECAST.LINEAR), and the full Data Analysis ToolPak report — plus multiple regression with LINEST and how to read p-values and R-squared without a statistics degree. Interactive demo included.
TL;DR
Fastest: scatter chart → right-click points → Add Trendline → Linear → tick Display Equation + R². Live cells: =SLOPE(Y,X), =INTERCEPT(Y,X), =RSQ(Y,X), predict with =FORECAST.LINEAR(newX,Y,X). Full stats: enable the Analysis ToolPak → Data → Data Analysis → Regression (gives p-values, CIs). Multiple predictors: =LINEST(Y,Xcols,TRUE,TRUE). A p-value < 0.05 = significant.
Contents
What is Regression Analysis?
Regression analysis is a statistical method for modelling the relationship between a dependent variable (the thing you want to explain or predict, called Y) and one or more independent variables (the things you think drive it, called X). At its simplest, it fits a straight line through a cloud of points so that the line is as close as possible to all of them — the “line of best fit” — and gives you an equation you can use to predict Y from X.
That line is defined by two numbers: a slope (how much Y changes for each one-unit increase in X) and an intercept (the value of Y when X is zero). If advertising spend is X and sales is Y, a slope of 3 means every extra £1 of ad spend is associated with £3 more sales, on average. Excel finds this line using ordinary least squares — it picks the slope and intercept that minimise the total squared vertical distance between the line and the actual points.
Two questions always follow the line. First, how good is the fit? — answered by R-squared, the share of Y’s variation the line explains. Second, is the relationship real or just noise? — answered by the p-value, which tells you how likely you’d see this relationship by chance if there were truly none. A confident regression result needs both: a meaningful slope, a decent R-squared, and a low p-value. This guide builds all three, from a quick chart trendline up to a full multi-variable model.
One caution worth internalising before you start: regression measures association, not causation. A strong fit between ice-cream sales and drownings does not mean one causes the other (both rise with hot weather). Regression is a powerful tool for prediction and for quantifying relationships, but the causal story always comes from your understanding of the domain, not from the R-squared.
Before You Start
You need paired numeric data
Regression needs at least one X column and one Y column of numbers, with each row a matched observation. For reliable results aim for a reasonable sample — a dozen points can hint at a relationship, but a few dozen or more makes the slope and significance trustworthy.
Plot before you fit
Always eyeball a scatter chart first. Regression will happily fit a straight line to curved, clustered, or outlier-ridden data and report a number — but that number will mislead. Looking at the cloud of points tells you whether a straight line is even the right model.
Mind the argument order
Excel’s SLOPE, INTERCEPT, and RSQ all take the Y range first, X range second — the opposite of how you read “X then Y” on a chart. Getting this backwards is the single most common regression mistake in Excel and produces a plausible-looking but wrong slope.
1.Step 1 — Plot a Scatter Chart
Select your X and Y columns (X on the left) and go to Insert → Charts → Scatter (the first, plain option — not a line chart). You now see your data as a cloud of points. Before fitting anything, read the cloud: do the points trend up or down together? Is the pattern roughly straight, or curved? Are there obvious outliers sitting far from the rest? This visual check is not optional — it tells you whether linear regression is appropriate at all.
2.Step 2 — Add a Linear Trendline
Right-click any data point → Add Trendline. In the Format Trendline pane, choose Linear, then scroll down and tick Display Equation on chart and Display R-squared value on chart. Excel draws the line of best fit and prints something like y = 2.84x + 15.3 and R² = 0.78 right on the chart.
That one action is a complete simple regression: the equation gives you the slope (2.84) and intercept (15.3), and R² (0.78) tells you the line explains 78% of the variation in Y. For a quick look, this is often all you need.
3.Step 3 — Slope and Intercept as Live Cells
The chart equation is static text. To get the coefficients as live values you can use in other formulas, use the worksheet functions (remember: Y first, X second):
Now you can predict Y for any X yourself with =slope*X + intercept, and the prediction updates automatically as your data changes — ideal for a live model or dashboard where the chart equation cannot reach.
4.Step 4 — Measure Fit with R-squared
R-squared is the headline measure of how well the line fits. Get it in a cell with:
It returns a number between 0 and 1: 0.78 means 78% of the variation in Y is explained by X. What counts as “good” depends entirely on the field — in physics you might expect 0.99, in human behaviour 0.3 can be meaningful. And resist R-squared worship: a model can have a high R² and still be useless if it violates regression’s assumptions or is fit to a tiny sample. Read it alongside the scatter plot and the p-values, never alone.
| Function | Returns |
|---|---|
| SLOPE(Y,X) | The slope (rise per unit of X) |
| INTERCEPT(Y,X) | Y when X = 0 |
| RSQ(Y,X) | R-squared (0–1, fit quality) |
| CORREL(Y,X) | Correlation coefficient (−1 to 1) |
| FORECAST.LINEAR(x,Y,X) | Predicted Y for a new X |
5.Step 5 — Forecast with FORECAST.LINEAR
To predict Y for a value of X without assembling the equation yourself, use:
where E2 holds the new X value. Excel fits the regression internally and returns the predicted Y in one step. Put your scenario X values in a column and drag this down to get a prediction for each. Remember the cardinal rule of forecasting: predicting within the range of your data (interpolation) is reasonable; predicting far beyond it (extrapolation) assumes the straight-line relationship continues, which it often does not.
6.Step 6 — The Data Analysis ToolPak Regression
For the full statistical picture, use the ToolPak. Enable it once via File → Options → Add-ins → Manage: Excel Add-ins → Go → tick Analysis ToolPak. A Data Analysis button appears on the Data tab. Click it → Regression, set the Y (Input Y Range) and X (Input X Range), tick Labels if you included headers, and choose an output location.
Excel produces a labelled report with everything: the coefficients (slope and intercept), their standard errors, t-statistics, p-values, the 95% confidence intervals, R-squared and Adjusted R-squared, and an ANOVA table with the overall F-test. This is the output you put in front of a decision-maker or a write-up. Its one limitation: it is a static snapshot — it does not recalculate when the data changes, so re-run it after any data update.
7.Step 7 — Interpreting p-values and Significance
Each coefficient in the ToolPak output has a p-value. It answers: if this variable truly had no relationship with Y, how likely is it that we’d see a slope this large by pure chance? The convention is a threshold of 0.05: below it, the relationship is “statistically significant” and the variable likely belongs; above it, the variable may just be noise and a candidate to drop.
Treat 0.05 as a guideline, not gospel. A tiny p-value on a trivially small slope tells you the effect is real but unimportant; a borderline p-value on a large, sensible effect with a small sample may still matter. Read the p-value, the size of the coefficient, the confidence interval, and your domain knowledge together — significance and importance are not the same thing.
8.Step 8 — Multiple Regression with LINEST
Real outcomes have several drivers. Multiple regression predicts Y from many X columns at once. The worksheet way is the LINEST array function, with all predictor columns side by side as the X range:
The third argument TRUE calculates the intercept normally; the fourth TRUE returns the full set of regression statistics. In Microsoft 365 it spills a block of coefficients and stats automatically; in older Excel, select a wide-enough output range and press Ctrl+Shift+Enter. The coefficients come back in reverse column order (rightmost X first), which trips everyone up the first time — the ToolPak’s labelled Regression output is friendlier if you find LINEST’s layout confusing.
With multiple predictors, watch for multicollinearity — when two X variables are themselves highly correlated (e.g. “ad spend” and “impressions”), the model struggles to separate their effects and the individual coefficients become unstable. If two predictors tell the same story, keep one.
Try It: Live Regression Demo
Below is a scatter of sample data with its least-squares line drawn through it — the same fit Excel computes with SLOPE and INTERCEPT. Drag the noise slider to scatter the points more or less, and watch R-squared fall as the cloud spreads and the line explains less of the variation. Hit New sample to regenerate the data. The equation and R² update live, exactly like a chart trendline.
Least-Squares Regression Visualiser
More noise → lower R². The pink line is the same line of best fit Excel draws.
A Worked Example: Ad Spend vs Sales
Suppose you have twelve months of monthly advertising spend (X) and sales (Y). You run =SLOPE(Sales,Spend) and get 4.2, =INTERCEPT(Sales,Spend) gives 18,000, and =RSQ(Sales,Spend) returns 0.81. The model reads: Sales = 4.2 × Spend + 18,000, and it explains 81% of the month-to-month variation in sales.
Interpret each number in plain business terms. The slope of 4.2 says every extra £1 of ad spend is associated with £4.20 of additional sales — a healthy return, though remember it is association, not proof of causation. The intercept of 18,000 is the modelled baseline sales with zero advertising, which is plausible if you have organic demand. The R² of 0.81 is strong for marketing data, where 0.5–0.7 is common, suggesting advertising is a major but not sole driver. Running the same data through the ToolPak would add a p-value — if it came back at, say, 0.003, you would conclude the relationship is highly unlikely to be chance and confidently use the equation to budget next month’s spend. To forecast sales for a planned £25,000 spend: =FORECAST.LINEAR(25000,Sales,Spend) returns roughly £123,000. Crucially, £25,000 should sit within your historical spend range — predicting sales for a £200,000 spend you have never tried would be extrapolation, and the linear relationship may not hold at that scale.
Common Errors & Fixes
The slope looks wrong / inverted
You swapped the arguments. SLOPE, INTERCEPT, and RSQ take Y first, then X. Re-check the order — this is the #1 regression error in Excel.
“Data Analysis” button is missing
The Analysis ToolPak is not enabled. File → Options → Add-ins → Manage: Excel Add-ins → Go → tick Analysis ToolPak. On Mac it is under Tools → Excel Add-ins.
LINEST returns one number, not a block
In older Excel you must select the full output range and press Ctrl+Shift+Enter to enter it as an array. In Microsoft 365 it spills automatically — make sure the cells to the right and below are empty.
High R-squared but useless predictions
You may be overfitting (too many predictors for the sample), extrapolating beyond your data, or fitting a line to a curved relationship. Plot the residuals; if they show a pattern, a straight line is the wrong model.
Regression & Forecasting, Automatically
DataHub Pro runs regression, trend, and forecasting on your spreadsheet in seconds — coefficients, R-squared, and predictions in a shareable dashboard, no ToolPak or array formulas required. Upload your data and see the fit instantly.
Try DataHub Pro free →Frequently Asked Questions
How do I run a regression analysis in Excel?
What is the formula for linear regression in Excel?
=SLOPE(Y_range, X_range) gives the slope, =INTERCEPT(Y_range, X_range) gives the intercept, and you predict with =SLOPE*X + INTERCEPT or more directly with =FORECAST.LINEAR(new_x, Y_range, X_range). Note that SLOPE and INTERCEPT take the Y (dependent) range first and the X (independent) range second, the reverse of how most people read a chart.What does R-squared mean in regression?
=RSQ(Y_range, X_range). Higher is generally better, but a high R-squared does not prove causation, and what counts as 'good' depends heavily on the field.How do I enable the Data Analysis ToolPak in Excel?
How do I do multiple regression in Excel?
=LINEST(Y_range, X_range, TRUE, TRUE), where X_range covers all predictor columns side by side. In Microsoft 365 it spills a block of coefficients and statistics automatically; in older Excel, select the output range and press Ctrl+Shift+Enter. Alternatively, the Data Analysis ToolPak's Regression tool accepts a multi-column X range and produces a fuller, labelled report.What is a good p-value in regression?
What is the difference between SLOPE/INTERCEPT and the Regression ToolPak?
Can Excel do non-linear regression?
Related Tutorials
- DataHub Pro — Moving Average in Excel — another way to model and smooth a trend.
- DataHub Pro — Holt-Winters Forecasting — trend- and season-aware forecasting beyond a straight line.
- DataHub Pro — Variance Analysis in Excel — understand the variation regression tries to explain.
- DataHub Pro — KPI Dashboard in Excel — put a fitted forecast on a dashboard.
- DataHub Pro — All Excel analytics tutorials →
