In other sections of this guide on descriptive statistics, we taught you the basics of finding the mean, median, standard deviation and percentiles. We gave you the formula for each and introduced some intermediate topics involving these measures of central tendency and variability. Here, we’ll present some advanced topics relating to these measures, namely the standard error and percentile rank.

Measures of Central Tendency and Variability

While measures like the mean, median, standard deviation and percentiles can seem somewhat basic, they are some of the most powerful tools of analysis within descriptive statistics, and they underpin many of the fundamental concepts in more advanced, inferential statistics.

Percentiles, for example, are used in quantile regression, which strives to identify patterns within different quantiles of a data set. Advanced applications like this rely on a strong understanding of the more basic concepts. Below, you’ll find a brief recap of the formulas for the sample mean, median and standard deviation, as well as when they’re used and how to interpret them.

	Formula	Uses	Interpretation
Mean	\[ \bar{x} = \frac{\Sigma x_{i}}{n} \]	Weighted mean; good representation of the most typical value in data sets without extreme outliers	The average of the data set, capturing where most values are centred
Median	No standard formula; found by identifying the middle value of the ordered data	Interquartile range; good representation of the middle value when there are outliers	The midpoint of the data, with half the values falling above it and half below
SD	\[ s = \sqrt{ \frac{\Sigma (x_{i} - \bar{x})^2 }{n-1} } \]	Z-scores; a variety of statistical tests on data	Used to determine how typical a value is for a data set with a given mean and standard deviation
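To make the recap concrete, here is a minimal sketch of the three measures using Python's built-in `statistics` module. The five values in `scores` are hypothetical and chosen only for illustration; they do not come from this guide.

```python
import statistics

# Hypothetical sample, made up purely for illustration.
scores = [4, 8, 6, 5, 7]

mean = statistics.mean(scores)      # sum of the values divided by n
median = statistics.median(scores)  # middle value of the ordered data
sd = statistics.stdev(scores)       # sample SD, which divides by n - 1

print(mean, median, round(sd, 2))
```

Note that `statistics.stdev` uses the n - 1 denominator from the table, whereas `statistics.pstdev` would divide by n.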

Standard Error

The concept of the standard deviation is a great starting point for understanding the standard error, which is defined as an estimate of the standard deviation of an estimate, such as a sample mean. Thus far, you’ve been calculating the standard deviation of an entire sample. In other words, you’ve calculated a measure of how far individual values typically fall from the sample mean, which tells you how typical or unusual any one value is for your data set.

The standard deviation typically goes hand in hand with the distribution of a variable. A distribution can be drawn as a curve where the horizontal axis shows the possible values of the variable and the vertical axis shows how likely each value is to occur in a given data set.

The question the standard deviation of a variable or data set attempts to answer is how likely a given value is to be found in a given distribution. The question the standard error tries to answer is how likely a given statistic, such as the mean, is to be found for a given data set.

Instead of trying to find the distribution of a data set, the standard error tries to find the distribution of a given statistic. Recall the difference between a sample and a population. Measures of a population, such as the population standard deviation, are called parameters, and the population standard deviation gives us information about the distribution of the variable itself.

Measures of a sample, however, are estimates of parameters and are called statistics. The standard error, then, is an estimate of the standard deviation of a statistic.
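One way to picture the distribution of a statistic is to simulate it. The sketch below is only an illustration under assumed conditions: the population is taken to be normal with a hypothetical mean of 50 and standard deviation of 10, and the sample size and number of samples are arbitrary choices. It draws many samples, records each sample mean, and compares the spread of those means with the population formula σ/√n given further down in this section.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 50, 10  # hypothetical population parameters
SAMPLE_SIZE = 25           # observations in each sample
NUM_SAMPLES = 2000         # number of samples drawn

sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))

# Spread of the statistic (the mean) across the repeated samples.
sd_of_means = statistics.stdev(sample_means)

# Value predicted by sigma / sqrt(n) for comparison.
theoretical = POP_SD / SAMPLE_SIZE ** 0.5

print(round(sd_of_means, 2), theoretical)
```

The two printed numbers should be close to each other, which is the sense in which the standard error describes the standard deviation of a statistic rather than of the raw data.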

If this is confusing, take the data sets below as an example. Each records the test scores of students at a different school for the same test.

Observation	School A	School B	School C
1	45	29	66
2	64	54	57
3	38	59	58
4	70	67	52
5	62	68	32
Mean	56	55	53

Taking these three sample means, we can make an informed guess that the mean test score for the region is around 55 points. If we had information for ten more schools, then 100 more schools, and finally for all the schools in the region - we could start to get closer and closer to some middle value that may be higher or lower than 55 points.
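The sample means in the table can be reproduced with a few lines of Python. This is a small sketch rather than part of the original exercise; the variable names are our own, and the table rounds each mean to the nearest whole number.

```python
import statistics

# Test scores from the table above.
schools = {
    "School A": [45, 64, 38, 70, 62],
    "School B": [29, 54, 59, 67, 68],
    "School C": [66, 57, 58, 52, 32],
}

# Mean of each sample (unrounded).
sample_means = {name: statistics.mean(scores) for name, scores in schools.items()}

# Average of the three sample means, our rough guess for the region.
grand_mean = statistics.mean(list(sample_means.values()))

for name, value in sample_means.items():
    print(name, value)
print("Mean of sample means:", round(grand_mean, 1))
```

The unrounded means are 55.8, 55.4 and 53, and their average is about 54.7, which is consistent with the informed guess of around 55 points.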

Because we don’t always have the chance to take infinitely many samples, a probability distribution of means is a great way of estimating how likely a given mean is for a particular data set. In the table below, you’ll find the formulas for the standard error of the mean for a population and for a sample, where n is the sample size. The population version uses the population standard deviation \sigma, while the sample version estimates it with the sample standard deviation s.

Population	Sample
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]	\[ SE = \frac{s}{\sqrt{n}} \]
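The sample formula translates directly into code. Below is a minimal sketch; the function name `standard_error` is our own choice, and the example reuses the School A scores from the earlier table.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        # The sample SD divides by n - 1, so one value is not enough.
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


school_a = [45, 64, 38, 70, 62]
se_a = standard_error(school_a)
print(round(se_a, 2))
```

Because n appears under a square root in the denominator, quadrupling the sample size only halves the standard error.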

Problem 1

Given the following dataset, what can you say about the accuracy of the average of the data?

Observation	Value
1	45
2	67
3	28
4	32
5	29
6	46
7	61
8	58
9	49
10	36
11	34

Problem 2

Given the following information, determine which data set has the lowest standard error of the mean.

Observation	School A	School B	School C
Mean	56	55	53
SD	4	10	8
Sample Size	9	1 000	250

Problem 3

Given the following chart, determine one of the measures of central tendency.

Problem 4

Given the following chart, choose which measure of central tendency would be most appropriate.

Solution Problem 1

In this problem, you were asked to:

Say something about the accuracy of the mean

In this case, we first find the mean by following the steps below.

Observation	Value
1	45
2	67
3	28
4	32
5	29
6	46
7	61
8	58
9	49
10	36
11	34
Total	485

The mean is calculated as

\bar{x} = \dfrac{485}{11} = 44.1

To find the accuracy of the mean, we need to calculate the standard error of the mean by first finding the standard deviation. The sum of the squared deviations from the mean, \Sigma (x_{i} - \bar{x})^2, is 1832.91.

s = \sqrt{ \dfrac{1832.91}{11-1} } = 13.5

Then, we plug the SD into the formula for standard error.

SE = \frac{s}{\sqrt{n}} = \dfrac{13.5}{\sqrt{11}} = 4.1

Because the standard error of 4.1 is relatively small compared to the mean of 44.1 (about 9% of it), this suggests the sample mean is a reasonably precise estimate.
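If you’d like to check the arithmetic for Problem 1, the sketch below follows the same steps: mean, sum of squared deviations, standard deviation, then standard error. The variable names are our own.

```python
import math
import statistics

# Values from Problem 1.
values = [45, 67, 28, 32, 29, 46, 61, 58, 49, 36, 34]
n = len(values)

mean = statistics.mean(values)                      # 485 / 11
sum_sq_dev = sum((x - mean) ** 2 for x in values)   # numerator of the SD formula
s = math.sqrt(sum_sq_dev / (n - 1))                 # sample standard deviation
se = s / math.sqrt(n)                               # standard error of the mean

print(round(mean, 1), round(sum_sq_dev, 2), round(s, 1), round(se, 1))
```

Rounded to the same precision as the worked solution, this gives 44.1, 1832.91, 13.5 and 4.1.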

Solution Problem 2

You were asked to determine which data set has the lowest standard error of the mean. Using the information given, you can calculate the standard error for each sample.

Observation	School A	School B	School C
SE	\[ \dfrac{4}{\sqrt{9}} = 1.33 \]	\[ \dfrac{10}{\sqrt{1000}} = 0.32 \]	\[ \dfrac{8}{\sqrt{250}} = 0.51 \]

Based on the calculations performed in the table above, the sample with the lowest standard error of the mean is School B.
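The same comparison can be sketched in Python using only the summary figures from Problem 2. The dictionary layout and names below are our own.

```python
import math

# (standard deviation, sample size) for each school, from Problem 2.
summary = {
    "School A": (4, 9),
    "School B": (10, 1000),
    "School C": (8, 250),
}

# SE = s / sqrt(n) for each sample.
standard_errors = {name: sd / math.sqrt(n) for name, (sd, n) in summary.items()}

# School whose mean has the smallest standard error.
lowest = min(standard_errors, key=standard_errors.get)

for name, se in standard_errors.items():
    print(name, round(se, 2))
print("Lowest:", lowest)
```

School B has the largest standard deviation of the three, yet its much larger sample size gives it the smallest standard error.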

Solution Problem 3

The only measure of central tendency we can get from this chart is the mode, which is Chemistry.

Solution Problem 4

Because there are extreme values between 107 and 119, either the mode or the median would be most appropriate depending on what we’d like to investigate.

Emma
