
## What are Inferential Statistics?

Many people think of statistics as one unattractive, mathematical blob. In fact, statistics has many branches, including probability, machine learning, and more. Each of these branches falls into one of two categories: inferential or descriptive statistics. The main differences between the two are summarized in the table below.

| | Inferential | Descriptive |
|---|---|---|
| Definition | Statistical analysis that uses the dataset to make predictions or inferences beyond it | Statistical analysis that illustrates or measures data included in the dataset |
| Measures | Regression, hypothesis tests | Central tendency (mean, mode), spread (standard deviation, variance) |
| Variable Types | Numerical and categorical | Numerical and categorical |
| Example | Regression analysis between test score and hours spent studying | Calculating the mean test score for different schools |

## Regression Definition

You have probably heard of regression in many different contexts, because regression analysis is one of the most widely used tools of inferential statistics. Regression analysis is the process of measuring the relationship between two or more variables.

A scatter plot of two related variables shows the two types of variables in regression analysis: the independent variable and the dependent variable. When there is a pattern between these two variables, it can be captured with a regression model. A linear regression model, the type covered in this section, describes that pattern as a linear relationship between the variables.

| | Independent Variable | Dependent Variable |
|---|---|---|
| Definition | The variable that we use to predict our dependent variable | The variable that responds to the independent variable |
| Type | Numerical or categorical (a categorical predictor is known as a ‘dummy’ variable) | Numerical; can only be categorical when using a special type of regression called logistic regression |
| Other Names | Explanatory variable | Response variable |

## Simple Regression Formula

As mentioned, linear regression can be used to model the relationship between two or more variables. When a linear regression involves only one independent and one dependent variable, it is known as simple linear regression, or SLR. Plotted over the observed data points, the model is a straight line known as a regression line. The regression line is calculated from the following formulas, one for the population and one for the sample:

$$y = \beta_0 + \beta_1 x + \varepsilon \quad \text{(population)}$$

$$\hat{y} = b_0 + b_1 x \quad \text{(sample)}$$

There are two formulas because one describes the population while the other describes a sample. Recall that a population contains all the things we want to study, which means that we rarely have access to all of its data. A sample, on the other hand, is a subset of the population. With the sample, we can find an estimate of the true population regression model.

| | Population | Sample |
|---|---|---|
| Response variable | $y$: the population dependent variable | $\hat{y}$: the sample dependent variable |
| Explanatory variable | $x$: the population explanatory variable | $x$: the sample explanatory variable |
| Constant | $\beta_0$: the value of the population dependent variable when all independent variables are zero | $b_0$: the value of the sample dependent variable when all independent variables are zero |
| Regression coefficient | $\beta_1$: the population parameter | $b_1$: the sample estimate of the population parameter |
| Error | $\varepsilon$: the part of $y$ not explained by $x$ | Assumed to be zero, so it does not appear in the sample formula |

## SLR Estimate Formulas

Most SLR models are run using software: programs such as R or Python take your data and fit the regression model automatically, calculating all regression coefficients and statistics. When learning statistics, however, many people start by calculating regression estimates by hand. SLR estimates two parameters. The first is the y-intercept, which is the value of y when x is zero. Its formula is:

$$b_0 = \bar{y} - b_1 \bar{x}$$

The following table describes each element in the formula.

| Element | Description |
|---|---|
| $\bar{y}$ | Mean of y |
| $b_1$ | The regression coefficient |
| $\bar{x}$ | Mean of x |

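As you can see, we first need to calculate the sample regression coefficient before calculating the intercept. The formula for the regression coefficient $b_1$ is:

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

The following table explains each element in the formula.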

| Element | Description |
|---|---|
| $x_i$ | The ith observation of x |
| $\bar{x}$ | The mean of x |
| $y_i$ | The ith observation of y |
| $\bar{y}$ | The mean of y |
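The two estimates can also be computed in a few lines of code. The following is a minimal Python sketch rather than a definitive implementation: the function name `fit_slr` and the four data points are made up for illustration. The regression coefficient is the sum of the products of the x and y deviations divided by the sum of the squared x deviations, and the intercept is the mean of y minus the coefficient times the mean of x.

```python
def fit_slr(x, y):
    """Estimate the intercept and slope of a simple linear regression by hand."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length and at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of the products of the x and y deviations from their means
    sum_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: sum of the squared x deviations
    sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
    if sum_sq_x == 0:
        raise ValueError("x must take at least two different values")
    slope = sum_products / sum_sq_x
    # The intercept needs the slope, so it is calculated second
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data that lie exactly on the line y = 1 + 2x
hours = [1, 2, 3, 4]
score = [3, 5, 7, 9]
intercept, slope = fit_slr(hours, score)
print(f"y-hat = {intercept:.2f} + {slope:.2f}x")  # y-hat = 1.00 + 2.00x
```

Because the made-up points sit exactly on a line, the estimates recover that line; with real data the points scatter around the fitted line instead.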

To find the full regression model, simply plug the calculated constant and regression coefficient into the model. Take the following scenario as an example.

| Element | Description |
|---|---|
| $y$ | Shoe price |
| $b_0$ | 30 |
| $b_1$ | 4.5 |
| $x$ | Number of customizations |

In the above example, the constant and regression coefficient have already been calculated. The SLR model would therefore look like this:

$$\hat{y} = 30 + 4.5x$$

## Problem 1

In this section you learned about the differences between descriptive and inferential statistics. You now want to understand which kinds of analysis you can carry out on a data set. You are given the data set below, which comes from a restaurant on the beach. The restaurant wants to know the relationship between the number of soups sold and the weather. Classify the types of analysis you can do on this data set based on the differences between inferential and descriptive statistics.

| Soup Sales | Temperature |
|---|---|
| 24 | 2 |
| 15 | 10 |
| 8 | 17 |
| 5 | 27 |

## Solution to Problem 1

In this problem, you were asked to:

• Understand the differences between the two branches of statistics
• Write down some analysis you can do based on these two branches

The first step in solving this problem is knowing what the main differences are between inferential and descriptive statistics. First, descriptive statistics uses the information within the data set in order to describe what the data looks like. On the other hand, inferential statistics uses the data set to try to make inferences about data points outside of its range.

Next, we can classify the different analyses in the table below.

| Inferential | Descriptive |
|---|---|
| Simple linear regression | Measures of central tendency: mean, median, mode |
| Hypothesis testing | Measures of spread: variance, standard deviation, range |
| Modelling | Descriptive visualizations: pie chart, bar chart, etc. |
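The descriptive column can be tried out on the restaurant data directly. The sketch below uses Python's standard `statistics` module; the helper name `describe` is made up for illustration, and the variance and standard deviation are the sample versions (dividing by n - 1), which is an assumption since the problem does not say which version to use. The mode is left out because every value in this data set occurs only once.

```python
import statistics

# Data from Problem 1
soup_sales = [24, 15, 8, 5]
temperature = [2, 10, 17, 27]


def describe(values):
    """Return a few descriptive measures for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sample_variance": statistics.variance(values),  # divides by n - 1
        "sample_std_dev": statistics.stdev(values),
        "range": max(values) - min(values),
    }


for name, values in (("Soup sales", soup_sales), ("Temperature", temperature)):
    print(name, describe(values))
```

These numbers only summarize the four observations in the data set. Predicting soup sales for a temperature that was not observed would be an inferential task.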

## Problem 2

In the previous problem, you were asked to describe the types of analysis you could conduct based on the two types of statistics. Next, using the same data, you are asked to conduct a regression analysis. Build a simple linear regression model based on the formulas provided. Then describe how this model would look on a chart of the data.

## Solution to Problem 2

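In this problem, you were asked to build a regression model. First, calculate the mean of each variable. Next, subtract the mean from every observation in your data set, then compute the squared deviations of x and the products of the x and y deviations, as in the table below.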

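| Temperature ($x$) | Soup Sales ($y$) | $x_i - \bar{x}$ | $y_i - \bar{y}$ | $(x_i - \bar{x})^2$ | $(x_i - \bar{x})(y_i - \bar{y})$ |
|---|---|---|---|---|---|
| 2 | 24 | -12 | 11 | 144 | -132 |
| 10 | 15 | -4 | 2 | 16 | -8 |
| 17 | 8 | 3 | -5 | 9 | -15 |
| 27 | 5 | 13 | -8 | 169 | -104 |
| Mean = 14 | Mean = 13 | | Total | 338 | -259 |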

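Next, we plug these totals into the equations for $b_1$ and $b_0$:

$$b_1 = \frac{-259}{338} \approx -0.77$$

$$b_0 = 13 - \left(\frac{-259}{338}\right)(14) \approx 23.73$$

Finally, we get the following regression:

$$\hat{y} = 23.73 - 0.77x$$

On a chart of the data, this model would be a straight line sloping downward: as the temperature rises, predicted soup sales fall.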
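The totals in the table and the resulting estimates can be double-checked with a short script. This is a sketch that follows the hand calculation step by step; the variable names, including `b0` for the constant and `b1` for the regression coefficient, and the helper `predict_soup_sales` are made up for illustration.

```python
# Data from Problem 1
temperature = [2, 10, 17, 27]  # x, the independent variable
soup_sales = [24, 15, 8, 5]    # y, the dependent variable

n = len(temperature)
mean_x = sum(temperature) / n
mean_y = sum(soup_sales) / n

# Deviation columns of the table
dev_x = [x - mean_x for x in temperature]
dev_y = [y - mean_y for y in soup_sales]

# Totals row of the table
sum_sq_x = sum(dx ** 2 for dx in dev_x)
sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

# Regression coefficient first, then the constant
b1 = sum_products / sum_sq_x
b0 = mean_y - b1 * mean_x


def predict_soup_sales(temp):
    """Predicted soup sales for a given temperature under the fitted line."""
    return b0 + b1 * temp


print(f"totals: {sum_sq_x:.0f} and {sum_products:.0f}")  # totals: 338 and -259
print(f"b1 = {b1:.3f}, b0 = {b0:.2f}")                   # b1 = -0.766, b0 = 23.73
```

Calling `predict_soup_sales(14)` returns the mean of soup sales, 13 up to floating-point rounding, because a fitted SLR line always passes through the point made up of the two means.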


Danica

Located in Prague and studying to become a Statistician, I enjoy reading, writing, and exploring new places.