Regression analysis is one of the most widely used statistical methods in practical applications. It is used to study trends in data and needs at least two variables for the simplest analysis.
Which variables are involved?
In regression analysis we need at least two variables for the simplest calculations. The first is called the dependent variable and the second is called the independent variable. We use the calculations to see whether changing the value of the independent variable affects the dependent variable.
Y = mX + b
This is the standard formula for a straight line in basic geometry, where m is the slope and b is the Y-intercept.
The regression formula shown below is very similar to this formula.
In this case, Y is the dependent variable and X is the independent variable.
The dependent variable is also known by other common names, such as:
- response variable
- outcome variable
- explained variable
- predicted variable
- measured variable
The independent variable is also known by other terms, such as:
- control variable
- exposure variable
- input variable
These various terms may become confusing for new statisticians, so we have compiled the list above to ease that burden.
This analysis can be done using various methods, depending on the data collected and the number of variables involved.
We will focus on the linear method, as it is the best suited to beginners in statistics.
Step by Step: Simple Linear Method
To better understand linear regression, we will solve a problem step by step.
Given data: a random sample of 9 cars was driven at different speeds. The mileage and speed of each car were recorded as follows:
The equation for this data will be
Y = B0 + B1*X
where
Y = mileage
B0 = constant or intercept value
B1 = coefficient of the variable X (the slope)
X = car speed
We get B1 using the following formula, where n is the number of observations:
B1 = ((n*ΣXY) - (ΣX*ΣY)) / ((n*ΣX^2) - (ΣX)^2)
This formula has summation operators, so we first calculate the sums in the table as follows:
- The column XY is the product of mileage and car speed
- The column X^2 is the square of car speed
- The column Y^2 is the square of mileage
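The column sums described above can be sketched in Python. This is only an illustration: the function name `regression_sums` is our own, and the three data points are made up for the example, not taken from the car data set.

```python
def regression_sums(x, y):
    """Return n and the column sums needed for simple linear regression."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return {
        "n": len(x),
        "sum_x": sum(x),
        "sum_y": sum(y),
        "sum_xy": sum(xi * yi for xi, yi in zip(x, y)),  # column XY
        "sum_x2": sum(xi ** 2 for xi in x),              # column X^2
        "sum_y2": sum(yi ** 2 for yi in y),              # column Y^2
    }


# Hypothetical values for illustration only, NOT the article's data.
speeds = [30, 50, 70]
mileages = [40, 35, 30]
sums = regression_sums(speeds, mileages)
print(sums)
```

Each dictionary entry corresponds to the total at the bottom of one table column, so the same function can be reused on any paired data set.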
Substituting n = 9, ΣXY = 17593, ΣX = 572, ΣY = 291 and ΣX^2 = 38252 into the formula we get
B1 = ((9*17593) - (572*291)) / ((9*38252) - (572)^2) = -8115 / 17084 = -0.47501
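As a check on the arithmetic, the slope can be recomputed from the sums used in the substitution. This is a minimal sketch; the variable names are our own choice.

```python
# Sums taken from the substitution above (n = 9 cars).
n = 9
sum_x = 572      # sum of car speeds
sum_y = 291      # sum of mileages
sum_xy = 17593   # sum of speed * mileage
sum_x2 = 38252   # sum of squared speeds

numerator = n * sum_xy - sum_x * sum_y    # -8115
denominator = n * sum_x2 - sum_x ** 2     # 17084
b1 = numerator / denominator

print(numerator, denominator, round(b1, 5))  # -8115 17084 -0.47501
```

The quotient -8115 / 17084 evaluates to about -0.47501. The negative sign means that mileage falls as speed rises in this sample.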
We get B0 using the following formula
B0 = ((ΣY*ΣX^2) - (ΣX*ΣXY)) / ((n*ΣX^2) - (ΣX)^2)
Substituting the same values in the formula we get
B0 = ((291*38252) - (572*17593)) / ((9*38252) - (572)^2) = 1068136 / 17084 = 62.52259
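The intercept can be verified the same way. The sketch below recomputes B0 from the sums and cross-checks it against the identity B0 = mean(Y) - B1*mean(X), which holds because a least-squares line always passes through the point of means. The variable names are our own.

```python
n = 9
sum_x, sum_y = 572, 291
sum_xy, sum_x2 = 17593, 38252

denominator = n * sum_x2 - sum_x ** 2                  # 17084
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # 1068136 / 17084
b1 = (n * sum_xy - sum_x * sum_y) / denominator        # slope

# Cross-check: the fitted line passes through (mean X, mean Y),
# so B0 must equal mean(Y) - B1 * mean(X).
b0_check = sum_y / n - b1 * (sum_x / n)

print(round(b0, 5), round(b0_check, 5))
```

Both routes should give approximately 62.52259, which is a quick way to catch a slip in either the slope or the intercept.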
We now know the regression formula for this problem:
Y = 62.52259 - 0.47501*X
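Once the coefficients are known, the regression formula can be used to estimate mileage at a given speed. The sketch below is an illustration under our own assumptions: the function name `predict_mileage` and the speed of 60 are hypothetical, and the coefficients are recomputed from the sums in the worked example rather than typed in by hand.

```python
def predict_mileage(speed, b0, b1):
    """Predicted mileage from the fitted line Y = B0 + B1*X."""
    return b0 + b1 * speed


# Coefficients recomputed from the sums used in the worked example.
denominator = 9 * 38252 - 572 ** 2
b0 = (291 * 38252 - 572 * 17593) / denominator
b1 = (9 * 17593 - 572 * 291) / denominator

# 60 is a hypothetical speed chosen for illustration.
predicted = predict_mileage(60, b0, b1)
print(round(predicted, 2))  # 34.02
```

The result is in the same units as the recorded mileage. An estimate like this is only meaningful for speeds within the range that was actually observed.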
We will make a scatter plot of our data in Excel, with a trend line, to compare against the calculated solution.