Name: Basic Statistical Methods and Concepts
Brand: Superprof
SKU: SP-CB-00047665
Rating: 5 (1 reviews)

Let’s face it: while data science was named the “sexiest job of the 21st century,” most people still shudder at the mere mention of statistics. The discipline has long felt alienating, largely because of its close relationship with mathematics.

Whether you believe you can’t learn statistical analysis or are simply curious to learn more about it, this guide will help get you started by laying out the core introductory concepts.

At the heart of statistics are five essential concepts that form the basis for data analysis. The first four can be covered without going into much detail about their equations:

Mean: the average value, calculated as the sum of all observations over the number of observations
Median: the midpoint of the dataset, calculated by ordering all observations from least to greatest and taking the value directly in the middle
Variance: the general spread of the data, calculated as the average of the squared differences from the mean
Standard Deviation: also a measure of spread, calculated by taking the square root of the variance
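
To make these four definitions concrete, here is a minimal sketch using Python's built-in statistics module. The ages are made-up illustration data, and the sketch assumes the "average of squared differences" definition given above (the population variance); the sample variance, which divides by one less than the number of observations, is available as statistics.variance.

```python
import statistics

# Hypothetical ages of eight restaurant patrons (invented for illustration).
ages = [22, 25, 25, 28, 31, 34, 40, 67]

mean_age = statistics.mean(ages)            # sum of observations / number of observations
median_age = statistics.median(ages)        # middle value of the sorted data
variance_age = statistics.pvariance(ages)   # average squared difference from the mean
std_dev_age = statistics.pstdev(ages)       # square root of the variance

print("mean:", mean_age)
print("median:", median_age)
print("variance:", variance_age)
print("standard deviation:", round(std_dev_age, 2))
```

With an even number of observations, the median is the average of the two middle values, which is why it need not be one of the ages in the list.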

Basic statistical methods — Compute statistical data easily | Photo by Jorge Franganillo

Much like witnesses in a detective novel, these four concepts start to tell you the story of a particular set of data because they are descriptive statistics. For example, if you look around at the people in any restaurant you find yourself in, it can be very difficult to build a narrative, or interpretation, about the kind of crowd you’re surrounded by based solely on appearance.

Say, however, you are given information about their age, monthly income, level of education, gender, and taste in music. The first two concepts, the mean and the median, are both measures of central tendency that can tell you whether your crowd is mostly twenty-somethings making their way through college or wealthy, elderly people who invest in hedge funds.

Which of the two you use depends on the distribution of the variable you’re measuring or, in this example, the amount of variability within the crowd. The more alike the crowd is, the more accurately the mean will tell your story; the more variation there is between people, especially when a few extreme values skew the data, the more accurate the picture you draw will be by taking the median.
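
A quick sketch of why extreme values matter when choosing a measure of central tendency: a single outlier pulls the mean a long way but leaves the median where it was. The incomes below are invented for illustration only.

```python
import statistics

# Hypothetical monthly incomes (made-up numbers).
similar_crowd = [2900, 3000, 3100, 3200, 3300]
mixed_crowd = [2900, 3000, 3100, 3200, 50000]  # one very wealthy patron

for label, incomes in [("similar", similar_crowd), ("mixed", mixed_crowd)]:
    # Compare the two measures of central tendency for each crowd.
    print(label, "mean:", statistics.mean(incomes), "median:", statistics.median(incomes))
```

In the similar crowd the two measures agree, while in the mixed crowd the mean lands far above what most people in the room actually earn.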

The variance and standard deviation are both measures of variability and can tell you how different the observations in your data are from the average with regard to a specific variable.

If you wanted to see how similar the crowd is in terms of age, you would start by calculating the mean age and subtracting it from every individual’s age. Squaring those differences and averaging them gives the variance, a number that tells you how far people are spread from the average. The standard deviation, on the other hand, tells you how tightly your data is clustered around the mean; for normally distributed data, about 68% of observations fall within one standard deviation of the mean.

The standard deviation says exactly what the variance says about the spread of your data; in fact, the standard deviation is calculated by taking the square root of the variance. The difference is that the standard deviation is the descriptive measure that is easiest to report because it is in the same units as the original data, whereas the variance is in squared units.
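
The calculation can be sketched step by step in a few lines of Python. The ages are hypothetical, and the sketch again assumes the population form of the variance (dividing by the number of observations).

```python
import math

# Hypothetical ages, invented for illustration.
ages = [20, 22, 24, 26, 28]

# Step 1: the mean age.
mean_age = sum(ages) / len(ages)

# Step 2: squared difference between each age and the mean.
squared_diffs = [(age - mean_age) ** 2 for age in ages]

# Step 3: the variance is the average of those squared differences (units: years squared).
variance = sum(squared_diffs) / len(ages)

# Step 4: the standard deviation is the square root of the variance (units: years).
std_dev = math.sqrt(variance)

print(mean_age, variance, round(std_dev, 2))
```

The variance here is 8 "years squared", which is hard to interpret, while the standard deviation of roughly 2.83 years can be read directly against the ages themselves.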

You can test what you've learned in your statistics course so far by attempting some statistics practice problems online!


Type of Test	Type of Variables	Example
Pearson Correlation	Two continuous variables	Whether shoe size is associated with height
Spearman Correlation	Two ordinal variables	How strong the association is between happiness and economic status
Chi-Square	Two categorical variables	Whether gender and favorite color are associated
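
As an illustration of the first row, here is a minimal sketch of the Pearson correlation coefficient written with the standard library only. The helper name pearson_r and the shoe-size and height figures are assumptions made for this example, not data from a real study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_dev / math.sqrt(ss_x * ss_y)

# Hypothetical measurements (made up for illustration).
shoe_sizes = [38, 40, 42, 44, 46]
heights_cm = [160, 168, 175, 181, 190]

r = pearson_r(shoe_sizes, heights_cm)
print(round(r, 3))
```

The coefficient ranges from -1 to 1; a value close to 1, as in this invented data, indicates a strong positive linear association. It does not by itself say whether the association is statistically significant.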

Type of Test	Type of Variables	Example
Paired T-Test	Two related variables	The difference between weight before and after taking a new supplement
Independent T-Test	Two independent groups	The difference in spending on gas between people in Los Angeles and New York
One-Way Analysis of Variance (ANOVA)	One independent variable with distinct levels and one continuous variable	Comparing the means of test scores from three different levels of education
Two-Way ANOVA	Two independent variables with distinct levels and one continuous variable	Comparing the means of test scores across both three levels of education and twelve different zodiac signs
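
To give a feel for what a comparison test computes, the sketch below works out the test statistic for a paired t-test on the supplement example. The function name paired_t_statistic and the weights are hypothetical; the sketch stops at the t statistic and does not compute a p-value, which would require the t distribution with n - 1 degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """t statistic for paired measurements: mean difference / its standard error."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation (divides by n - 1)
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff / standard_error

# Hypothetical weights in kg for five people (made-up data).
weights_before = [82.0, 75.5, 90.0, 68.0, 77.5]
weights_after = [80.0, 75.0, 87.0, 67.0, 76.0]

t = paired_t_statistic(weights_before, weights_after)
print(round(t, 2))
```

A negative value here simply reflects that the "after" weights are lower on average; the further the statistic is from zero, the harder the difference is to attribute to chance.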

Type of Test	Type of Variables	Example
Simple Linear Regression	One scale variable (dependent) with one scale variable (predictor)	You want to see if and how well height predicts weight
Multiple Linear Regression	One scale variable (dependent) with two or more scale variables (predictors)	You want to see if and how well age, height, and income predict weight
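
For the simple case with a single predictor, the best-fitting line can be sketched with ordinary least squares in plain Python. The function name fit_line and the data are assumptions for illustration; the numbers were chosen to be perfectly linear so the result is easy to check by hand, which real data never is.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("the predictor must vary")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical heights (cm) and weights (kg), invented for illustration.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 66, 74, 82]

slope, intercept = fit_line(heights_cm, weights_kg)
predicted_weight = intercept + slope * 175   # prediction for someone 175 cm tall
print(round(slope, 2), round(intercept, 2), round(predicted_weight, 2))
```

The slope says how much the predicted weight changes for each extra centimeter of height. Multiple linear regression extends the same idea to several predictors and is usually fitted with a statistics library rather than by hand.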

Type of Test	Type of Variables	Example
Wilcoxon Rank-Sum Test	Two independent groups	Between two different drugs, which one offers the best relief on two random, distinct groups of a population
Wilcoxon Signed-Rank Test	Two related variables	Between two different drugs, which one offers the best relief on the same group of patients
Friedman Test	Three or more related variables (all metric or all ordinal)	Three different ad ratings given by the same individuals
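
Nonparametric tests like these work on ranks rather than raw values. As a rough sketch, the code below ranks the pooled observations from two groups (averaging tied ranks) and sums the ranks of the first group, which is the quantity the Wilcoxon rank-sum test is built on. The helper names and the relief scores are hypothetical, and the sketch does not compute a p-value.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1   # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def rank_sum(group_a, group_b):
    """Sum of the ranks of group_a within the pooled data."""
    ranks = average_ranks(list(group_a) + list(group_b))
    return sum(ranks[:len(group_a)])

# Hypothetical relief scores for two independent groups (made-up data).
relief_drug_a = [3, 5, 6, 8]
relief_drug_b = [1, 2, 4, 7]

print(rank_sum(relief_drug_a, relief_drug_b))
```

If the two drugs performed alike, each group would be expected to collect about half of the total rank sum; a rank sum far from that suggests one group tends to score higher.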

Assumption	Description
Independence	The groups that make up the sample are independent of each other.
Normality	The data in the set are normal, meaning that they follow a normal distribution.
Homogeneity of variance	If there are multiple groups in the data relating to your independent variable, they have the same variance.

Learn Everything from Probability to Wilcoxon Tests

What is Probability?

How to Choose a Statistical Test

When to Use Tests of Association

Tests of Comparison between Means

Tests of Prediction using Linear Regression

Tests for Nonparametric Data

How to Perform Statistical Tests
