Introduction to Regression Analysis #1

Regression analysis is one of the most widely used statistical methods in practical applications. It is used to study trends in data and needs a minimum of 2 variables for the simplest analysis.

Which variables?

In regression analysis we need at least 2 variables for the simplest calculations. The first is called the dependent variable and the second is called the independent variable. We use the calculations to see whether changing the value of the independent variable affects the dependent variable.

Y = mX + b
This is the standard formula for a line in basic geometry, where m is the slope and b is the intercept.
The regression formula shown below is very similar to this formula.

Y is the dependent variable and X is the independent variable in this case.

The dependent variable is also known by other common names, such as:

1. response variable
2. outcome variable
3. target
4. explained variable
5. predicted variable
6. regressand
7. measured variable

The independent variable is also known by other terms, such as:

1. regressor
2. control variable
3. predictor
4. exposure variable
5. input variable

These various terms may become confusing for new statisticians, so we have compiled the lists above to ease that burden.

Different types

Regression analysis can be done using various methods, depending on the data collected and the number of variables involved.

We will focus on the linear method, as it is the best suited to beginners in statistics.

Step by Step: Simple Linear Method

To better understand linear regression, we will solve a problem step by step.

Given data: a random sample of 9 cars was driven at different speeds. The mileage and speed of each car were recorded as follows:

The equation for this data will be

Y = B0 + B1*X
where Y = mileage
B0 = constant or intercept value
B1 = coefficient (slope) of variable X
X = car speed

Step 1

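We get B1 using the following formula, where n is the number of observations and Σ denotes a sum over all observations:

B1 = (n*ΣXY - ΣX*ΣY) / (n*ΣX^2 - (ΣX)^2)

This formula has summation operators, so we first calculate the sums in a table as follows: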

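The column XY is the product of mileage and car speed.
The column X^2 is the square of car speed.
The column Y^2 is the square of mileage.

With n = 9 cars, the column totals used in the calculations are ΣX = 572, ΣY = 291, ΣXY = 17593 and ΣX^2 = 38252.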
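
The column construction described above can be sketched in a few lines of Python. This is only an illustration: the speed and mileage values below are made-up placeholders, not the nine cars from this problem, and the variable names are our own.

```python
# Hypothetical (speed, mileage) pairs for illustration only;
# these are NOT the nine cars from the worked example.
speeds = [30, 40, 50, 60]      # X
mileages = [48, 44, 38, 34]    # Y

# Build the extra columns row by row.
xy = [x * y for x, y in zip(speeds, mileages)]   # column XY
x2 = [x ** 2 for x in speeds]                    # column X^2
y2 = [y ** 2 for y in mileages]                  # column Y^2

# Column totals that feed the summation formulas.
totals = {
    "n": len(speeds),
    "sum_x": sum(speeds),
    "sum_y": sum(mileages),
    "sum_xy": sum(xy),
    "sum_x2": sum(x2),
    "sum_y2": sum(y2),
}
print(totals)
```

Only the totals are needed for the slope and intercept; the Y^2 column is not used in either of those two calculations.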

B1 = ((9*17593) - (572*291)) / ((9*38252) - (572)^2) = -8115 / 17084 = -0.47501
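
As a quick check of the arithmetic, the slope can be recomputed from the column totals. The sketch below assumes only the totals that appear in the substitution above; the function name `slope` is our own choice.

```python
# Column totals taken from the worked example (n = 9 cars).
n = 9
sum_x = 572       # total of car speeds (X)
sum_y = 291       # total of mileages (Y)
sum_xy = 17593    # total of speed * mileage (XY)
sum_x2 = 38252    # total of squared speeds (X^2)

def slope(n, sum_x, sum_y, sum_xy, sum_x2):
    """Least-squares slope B1 computed from column totals."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = n * sum_x2 - sum_x ** 2
    return numerator / denominator

b1 = slope(n, sum_x, sum_y, sum_xy, sum_x2)
print(round(b1, 5))  # -0.47501
```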

Step 2

We get B0 using the following formula:

B0 = (ΣY*ΣX^2 - ΣX*ΣXY) / (n*ΣX^2 - (ΣX)^2)

Substituting those values in the formula we get

B0 = ((291*38252) - (572*17593)) / ((9*38252) - (572)^2) = 1068136 / 17084 = 62.52259
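
The intercept can be checked the same way. This sketch again assumes only the column totals from the problem, and it also compares the result with the equivalent form B0 = mean(Y) - B1*mean(X), which should give the same number.

```python
# Column totals taken from the worked example (n = 9 cars).
n = 9
sum_x = 572       # total of car speeds (X)
sum_y = 291       # total of mileages (Y)
sum_xy = 17593    # total of speed * mileage (XY)
sum_x2 = 38252    # total of squared speeds (X^2)

denominator = n * sum_x2 - sum_x ** 2

# Intercept from the summation formula.
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator

# Cross-check: intercept from the means and the slope.
b1 = (n * sum_xy - sum_x * sum_y) / denominator
b0_from_means = sum_y / n - b1 * (sum_x / n)

print(round(b0, 5))  # 62.52259
```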

Step 3

We now know the Regression formula for this problem

Y = 62.52259 - 0.47501*X
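
Once the two coefficients are known, the formula can be used to predict mileage at a given speed. The sketch below recomputes both coefficients from the column totals so that it stands on its own; the function name `predict_mileage` and the example speed of 60 are our own assumptions, not part of the data set.

```python
# Column totals from the worked example (n = 9 cars).
n, sum_x, sum_y = 9, 572, 291
sum_xy, sum_x2 = 17593, 38252

denominator = n * sum_x2 - sum_x ** 2
b1 = (n * sum_xy - sum_x * sum_y) / denominator        # slope
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # intercept

def predict_mileage(speed):
    """Predicted mileage Y for a given car speed X."""
    return b0 + b1 * speed

# 60 is an arbitrary example speed, not a value from the data set.
print(round(predict_mileage(60), 2))  # 34.02
```

The negative slope means the predicted mileage falls by about 0.475 for every one-unit increase in speed.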

Step 4

We will make a scatter plot of our data in Excel and add a linear trend line, so that we can compare it with the solution we calculated.