
Draw a line through the middle of a cloud of data points that is a "best fit" to the data.

# Linear Regression

This explanation treats regression solely as a descriptive statistic: which line lies "closest" to a given set of points? Here "closest" means that the line minimizes the sum of the squared vertical (y) distances from the points to the line; that line is called the least squares regression line. I won't derive the formulas, merely present them and then use them. The data is given as a set of points in the plane, i.e., as ordered pairs of x and y values.

### Statistical Formulae

X-bar, written as an X with a line over it, is the mean (average) of the x-values.

Y-bar, a Y with a line over it, is the mean of the y-values.

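SSxx is the sum of the squares of the x-deviations: $SS_{xx} = \sum_i (x_i - \bar{X})^2$

SSyy is the sum of the squares of the y-deviations: $SS_{yy} = \sum_i (y_i - \bar{Y})^2$

SSxy is the sum of the products of the paired x- and y-deviations: $SS_{xy} = \sum_i (x_i - \bar{X})(y_i - \bar{Y})$

b1, the slope of the line, is $b_1 = SS_{xy} / SS_{xx}$

b0, the y-intercept of the line, is $b_0 = \bar{Y} - b_1 \bar{X}$

The least squares regression line is y-hat = b0 + b1x, or in symbols $\hat{y} = b_0 + b_1 x$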

(y-hat is written as a y with a circumflex over it.)
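The formulas above translate almost line for line into code. The following is a minimal sketch, not part of the original page: the function name `least_squares_line` and the variable names are assumptions chosen for illustration, and the three sample points were picked only because they lie exactly on the line y = 1 + 2x, which makes the result easy to check by eye.

```python
from statistics import fmean


def least_squares_line(points):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1*x.

    `points` is a sequence of (x, y) pairs. The name and signature are
    illustrative assumptions, not taken from the original page.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    x_bar = fmean(xs)  # X-bar: mean of the x-values
    y_bar = fmean(ys)  # Y-bar: mean of the y-values

    # SSxx: sum of squared x-deviations
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    # SSxy: sum of products of paired x- and y-deviations
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)

    if ss_xx == 0:
        raise ValueError("all x-values are equal; the slope is undefined")

    b1 = ss_xy / ss_xx       # slope
    b0 = y_bar - b1 * x_bar  # intercept
    return b0, b1


# Three points that lie exactly on y = 1 + 2x.
b0, b1 = least_squares_line([(0, 1), (1, 3), (2, 5)])
print(b0, b1)
```

Because the sample points are perfectly collinear, the fitted line should reproduce the intercept 1 and slope 2 of the line they were taken from.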

Data Values

| x | y |
|---|---|
| 2 | -5 |
| 4 | 14 |
| 9 | -1 |
| 13 | 38 |
| 16 | 11 |

### Statistical measures of this data:

| Measure | Value |
|---|---|
| X-bar | 8.8 |
| Y-bar | 11.4 |
| SSxx | 138.8 |
| SSyy | 1137.2 |
| SSxy | 205.4 |
| b1 | 1.48 |
| b0 | -1.622 |
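As a check on the arithmetic, the slope and intercept can be worked out from the other values listed, writing X-bar and Y-bar as $\bar{X}$ and $\bar{Y}$. Note that the intercept appears to have been computed from the unrounded slope; substituting the rounded value 1.48 would give about -1.624 instead of -1.622.

$$
b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{205.4}{138.8} \approx 1.4798 \approx 1.48
$$

$$
b_0 = \bar{Y} - b_1 \bar{X} \approx 11.4 - (1.4798)(8.8) \approx -1.622
$$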

The formula for the least squares regression line is

y-hat = b0 + b1x

So in our example, where b0=-1.622 and b1=1.48, the least squares regression line is

y-hat = -1.622 + (1.48)x
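To see what "closest" means in practice, the sketch below compares the sum of squared vertical distances for the fitted line against a few nearby lines. It uses the five data points and the rounded coefficients from this example; the helper name `sum_squared_errors` and the four comparison lines are assumptions made up for illustration.

```python
# Data points and rounded coefficients from the worked example.
points = [(2, -5), (4, 14), (9, -1), (13, 38), (16, 11)]
b0, b1 = -1.622, 1.48


def sum_squared_errors(intercept, slope, pts):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pts)


fitted = sum_squared_errors(b0, b1, points)

# Hypothetical nearby lines, chosen only for comparison:
# shift the intercept by 1 or the slope by 0.1 in each direction.
nearby = [(b0 + 1, b1), (b0 - 1, b1), (b0, b1 + 0.1), (b0, b1 - 0.1)]

print(f"fitted line: SSE = {fitted:.2f}")
for intercept, slope in nearby:
    sse = sum_squared_errors(intercept, slope, points)
    print(f"b0 = {intercept:.3f}, b1 = {slope:.2f}: SSE = {sse:.2f}")
```

Each of the shifted lines should report a larger sum of squared errors than the fitted line, which is the sense in which the least squares regression line is the "best fit".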

The webmaster and author of this Math Help site is Graeme McRae.