Chapters

What is Regression?
Simple Linear Regression
Multiple Linear Regression
MLR Estimators
MLR Interpretation
Variable Transformations
Interpretation with Transformed Variables
Problem 1
Solution to Problem 1
Problem 2
Solution to Problem 2


What is Regression?

Regression is a statistical method used to model the relationship between two or more variables. To understand regression, let’s start by understanding the different types of relationships two variables can have with each other.

The images above show examples of two variables plotted against each other. The line or curve passing through the points represents the equation used to approximate the data.

Graph	Relationship	Equation	Formula
1	Parabolic	Quadratic	$y = ax^2 + bx + c$
2	Exponential	Exponential growth	$y = ab^x$
3	Linear	Line	$y = mx + c$

We can model data by finding the line or curve that best represents it. This is what regression does: it fits a model to the data that is available, and that model can then be used to predict values that have not yet been observed.

Simple Linear Regression

Simple linear regression, or SLR, is a type of regression that involves only one dependent and one independent variable. Notice that the equation for SLR closely follows that of a line. The image below illustrates where these variables are located in a regression equation.
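In standard notation, the two forms of the SLR equation can be written as follows. The subscripts 0 and 1 on the constant and the slope are a common convention assumed here.

```latex
\begin{aligned}
y       &= \beta_0 + \beta_1 X + \varepsilon && \text{(1) population SLR} \\
\hat{y} &= b_0 + b_1 x                       && \text{(2) sample SLR}
\end{aligned}
```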

The differences between the response and explanatory variables are summarized in the table below.

Notation	Variable	Definition
y, $\hat{y}$	Response, dependent	Variable we want to predict
X, x	Explanatory, independent	Variable used to predict the response variable

As you may notice, there are two formulas for SLR. The differences between the two are explained in the table below.

Equation	Variables	Type	Data used
1	y, $\beta$, X, $\varepsilon$	Population SLR	Data on the entire population is used. The true population parameters are calculated.
2	$\hat{y}$, b, x	Sample SLR	Data from a sample of the population is used. The true population parameters are estimated.
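As a minimal sketch of how the sample SLR coefficients can be estimated, the example below uses ordinary least squares, the usual estimation method (an assumption here, since the text does not name one). The data values and the names `b0`, `b1` and `predict` are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data: x is the explanatory variable, y the response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope: co-variation of x and y divided by variation of x.
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
b1 = sxy / sxx

# The fitted line always passes through the point (x_mean, y_mean).
b0 = y_mean - b1 * x_mean


def predict(x_new):
    """Predicted response from the sample SLR equation y_hat = b0 + b1 * x."""
    return b0 + b1 * x_new


print(f"y_hat = {b0:.2f} + {b1:.2f} * x")  # y_hat = 2.20 + 0.60 * x
```

Because only a sample is used, `b0` and `b1` are estimates of the population parameters, not the parameters themselves.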

Multiple Linear Regression

Multiple linear regression, or MLR, is quite similar in definition to SLR. The difference between the two is that in an MLR model, more than one independent variable is used to estimate the dependent variable. Take a look at the equations below.
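In standard notation, and assuming k independent variables (k is a label introduced here for illustration), the population and sample MLR equations can be written as:

```latex
\begin{aligned}
y       &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon \\
\hat{y} &= b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k
\end{aligned}
```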

The two equations play the same roles as the SLR equations: the top equation is the population MLR equation and the bottom equation is the sample MLR equation. The table below defines what each element in the formulas means.

	Element	Description	Definition
1	y	Response variable	The variable we’re trying to predict
2	$\beta_0$ and $b_0$	Constant	The value of y when all x’s are zero
3	$\beta$’s and b’s	Regression coefficients	The amount y increases or decreases given a 1-unit change in the corresponding x, with the other x’s held constant
4	x’s	Independent variables	The variables used to predict y
5	$\varepsilon$	Error	The random error, the part of y that isn’t explained by the x’s

MLR Estimators

There are several approaches you can use to estimate an MLR model. The most common is to use a program that can calculate an MLR model from a data set. A much rarer approach is to calculate the model by hand. While this is not convenient and can lead to calculation errors, it can be helpful for someone trying to understand the concepts behind regression.

Estimating MLR regression coefficients is a bit more difficult than for SLR coefficients. However, the general idea is the same. The picture below shows the equations.
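To make the "by hand" idea concrete, here is a minimal sketch that estimates MLR coefficients by solving the least-squares normal equations, (X'X) b = X'y, in plain Python. Least squares is assumed as the estimation method, and the helper names `solve_linear_system` and `fit_mlr` and the data are hypothetical. In practice a statistics package would normally do this step.

```python
def solve_linear_system(a, rhs):
    """Solve a * b = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | rhs], copied so the inputs are not modified.
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(xs, y):
    """Least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    # Design matrix rows: a leading 1 for the constant, then the x values.
    design = [[1.0] + list(row) for row in xs]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 with no noise,
# so the fit should recover those coefficients up to rounding error.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]

b0, b1, b2 = fit_mlr(xs, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With a single independent variable the same procedure reduces to the SLR estimates, which is why the general idea carries over.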

MLR Interpretation

When it comes to interpreting an MLR model, the intuition is the same as for an SLR model. However, because there are more explanatory variables, there are a few more things you should take into account. The table below shows how to interpret the coefficients of an MLR model with two independent variables.

Element	Interpretation
$b_0$	Value of y when all independent variables are zero
$b_1$	Amount that y increases or decreases given a 1-unit change in $x_1$, with $x_2$ held constant
$b_2$	Amount that y increases or decreases given a 1-unit change in $x_2$, with $x_1$ held constant
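The phrase "held constant" can be checked directly from the sample equation. For a model with two independent variables, increasing the first by one unit while leaving the second unchanged gives:

```latex
\begin{aligned}
\hat{y}(x_1 + 1,\, x_2) - \hat{y}(x_1,\, x_2)
  &= \bigl[b_0 + b_1 (x_1 + 1) + b_2 x_2\bigr] - \bigl[b_0 + b_1 x_1 + b_2 x_2\bigr] \\
  &= b_1
\end{aligned}
```

The constant and the second variable's term cancel, so the change in the predicted value is exactly the first coefficient.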

Variable Transformations

Transforming variables is a common step that is completed before an MLR model is run. Variables can be transformed for several reasons:

To have the variable follow a better distribution
To create a new variable
To improve the appearance in visualizations

These are the most common reasons why variables are transformed. Some common transformations to perform on a variable are:

Logarithm
Square
Square root
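A brief sketch of these three transformations, using a small hypothetical right-skewed sample. The values and the helper `mean_median_gap` are illustrative assumptions; the gap between mean and median is used only as a rough indicator of skew.

```python
import math
import statistics

# Hypothetical right-skewed values (for example, salaries in thousands).
salary = [20.0, 22.0, 25.0, 27.0, 30.0, 35.0, 400.0]

log_salary = [math.log(s) for s in salary]    # natural log; needs values > 0
squared_salary = [s ** 2 for s in salary]     # square
sqrt_salary = [math.sqrt(s) for s in salary]  # square root; needs values >= 0


def mean_median_gap(values):
    """Mean minus median: large positive values suggest right skew."""
    return statistics.mean(values) - statistics.median(values)


print(round(mean_median_gap(salary), 2))      # gap on the raw scale
print(round(mean_median_gap(log_salary), 2))  # gap on the log scale
```

In this sample the log transformation pulls the extreme value toward the rest of the data, so the gap is much smaller on the log scale than on the raw scale.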

Interpretation with Transformed Variables

Interpreting an MLR model with transformed variables depends on the type of transformation performed. Logarithmic transformations need the most care, because the coefficients then describe percentage changes rather than unit changes. The table below summarizes the interpretation of models using log-transformed variables.

Type	Model	Interpretation of Regression Coefficients
Log-log	y and x are log transformed	A 1% increase in x leads to an approximate change of ( $b_1$ )% in y
Linear-log	x is log transformed	A 1% increase in x leads to an approximate change of ( $b_1$ /100) units in y
Log-linear	y is log transformed	A 1-unit increase in x leads to an approximate change of (100*( $b_1$ ))% in y
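These rules of thumb are approximations, which can be checked numerically. The sketch below compares each rule with the exact effect for a hypothetical coefficient `b1 = 0.05`, assuming natural logarithms. For a linear-log model the relevant change is a 1% increase in x.

```python
import math

b1 = 0.05  # hypothetical coefficient, chosen only for illustration

# Log-log: ln(y) = b0 + b1*ln(x).
# A 1% increase in x multiplies y by 1.01**b1.
log_log_pct = (1.01 ** b1 - 1) * 100        # rule of thumb: b1 percent

# Linear-log: y = b0 + b1*ln(x).
# A 1% increase in x adds b1*ln(1.01) units to y.
linear_log_units = b1 * math.log(1.01)      # rule of thumb: b1 / 100 units

# Log-linear: ln(y) = b0 + b1*x.
# A 1-unit increase in x multiplies y by e**b1.
log_linear_pct = (math.exp(b1) - 1) * 100   # rule of thumb: 100 * b1 percent

print(f"log-log:    exact {log_log_pct:.4f}%  vs rule {b1:.4f}%")
print(f"linear-log: exact {linear_log_units:.6f} vs rule {b1 / 100:.6f}")
print(f"log-linear: exact {log_linear_pct:.4f}%  vs rule {100 * b1:.4f}%")
```

The log-linear rule in particular is accurate only for small coefficients; for larger values the exact form, 100*(e^b1 - 1)%, is the safer one to report.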

Problem 1

You are tasked with interpreting the following MLR model.

Using what you know about how regression coefficients are interpreted, write a short summary describing the model.

Solution to Problem 1

When temperature and ad expenditure are both zero, the model predicts that 100 tickets will still be sold. For each 1-degree increase in temperature, predicted ticket sales increase by 1.3 tickets, all other variables held constant. For each 100-pound increase in ad expenditure, predicted ticket sales increase by 5.4 tickets, with temperature held constant.
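The interpretation can be checked with a short sketch. The equation below is reconstructed from the coefficients quoted in the solution and is an assumption, in particular that ad expenditure is measured in units of 100 pounds. The function name `predicted_tickets` is hypothetical.

```python
def predicted_tickets(temperature, ad_spend_hundreds):
    """Predicted ticket sales from the coefficients quoted in the solution.

    Assumption: ad expenditure is in units of 100 pounds, so a one-unit
    change matches the 100-pound increase described in the solution.
    """
    return 100 + 1.3 * temperature + 5.4 * ad_spend_hundreds


# Constant: prediction when both independent variables are zero.
print(predicted_tickets(0, 0))

# One more degree, ad expenditure held constant: change of about 1.3.
print(round(predicted_tickets(21, 3) - predicted_tickets(20, 3), 2))

# 100 pounds more ad expenditure, temperature held constant: about 5.4.
print(round(predicted_tickets(20, 4) - predicted_tickets(20, 3), 2))
```

The starting values of 20 degrees and 3 units of ad expenditure are arbitrary; in a linear model the differences are the same wherever the comparison starts.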

Problem 2

Salary is a variable that typically has a right-skewed distribution. This is because most people tend to make around the same amount of money, with some people making extreme amounts of money. Because of this, a log transformation has been performed on salary. Interpret the following model.

Solution to Problem 2

The constant is not interpretable here, as salary and vacation days will never be zero. Every 1-day increase in vacation days increases the happiness score by 2 points, with salary held constant. Because salary is log transformed, a 1% increase in salary leads to an increase of about 0.04 points in the happiness score, with vacation days held constant.


Emma

I am passionate about travelling and currently live and work in Paris. I like to spend my time reading, gardening, running, learning languages and exploring new places.
