Three Basic Types of Means

In previous sections, you learned the foundational elements of the mean and standard deviation of data sets. Specifically, we showed you how to calculate and interpret each. Here, we’ll go over the different types of means as well as how standard deviation is used for calculating confidence intervals. 

Types of Means

When people think of data sets, they tend to think of orderly data that follows common patterns such as a normal distribution. The reality, fortunately for us statisticians, is much more complex than that. As you learn about relationships between numbers in some higher-level maths courses, you start to learn that not every relationship is a linear relationship.

A linear relationship is one that follows an additive pattern between numbers. Additive simply means that you simply need to add or subtract a certain function from one number to arrive to the next one. This can be exemplified by a linear function such as,

 

    \[ 2x +3 \]

 

Let’s say we have this relationship for the following numbers:

ValueLinear Relationship
5(2*1)+3
7(2*2)+3
9(2*3)+3
11(2*4)+3
13(2*5)+3

 

While the most common way to graph this function would be on an axis, like below

 

Linear Relationship

 

We can also look at it on a linear plot, as exemplified below.

 

Line Graph Linear Relationship

 

However, not every data or number set has a linear relationship. This brings us to the three basic types of means, also known as “Pythagorean Means,” which are arithmetic mean, geometric mean, and harmonic mean. 

Superprof

Arithmetic Mean

The arithmetic mean, also known as AM, is the mean you’ve learned in previous sections of this guide. It is a simple average found by summing all observations and dividing by the number of observations, taking the form of the following formula

 

    \[ \frac{\Sigma x_{i}}{n} \]

 

Taking from our previous example, we would use the arithmetic mean to calculate the mean, which would lead us to the following result.

 

    \[ \dfrac{5+7+9+11+13}{5} \]

 

    \[ \dfrac{45}{5} \]

 

    \[ 9 \]

 

Taking a look back at our line plot, we can see that the mean corresponds to the exact midpoint of our data. This signals to us that using the AM here is appropriate because it is a good representation of the centre of our data.

Line Plot Arithmetic Mean

 

Geometric Mean

In the previous example, we found the mean for a relatively simple relationship. Many of your data sets will have variables that are linear, or additive. However, there are many instances where you will either receive variables or transform your variables into ones that have a multiplicative or exponential relationship.

For example, many people have to transform the variables in their data set in order to better interpret their data or to simply lend them a more normal distribution. In this case, it wouldn’t make sense to use the AM because it wouldn’t truly reflect the centre of the data. To illustrate this point, let’s take the previous example and multiply each number by 2 consecutively.

 

ValueMultiplicative Relationship
55*2
1010*2
2020*2
4040*2
80

 

Here, taking the AM would result in the following,

 

    \[ \dfrac{5+10+20+40+80}{5} \]

 

    \[ \dfrac{155}{5} \]

 

    \[ 31 \]

 

Plotting this on an axis, shown in the image below, we see that now we approach a graph with a curved line, also known as a geometric series.

 

Geometric Series

 

Plotting it in a line plot, we see that the mean is not a very good representation of the centre values of our data.

 

Line Plot Faulty Mean

 

In this instance, we would use the geometric mean, or GM, as a better representative of the centre because it is a geometric series. The formula for a geometric series is, 

    \[ \sqrt[n]{a_{1}*a_{2}*\dotsm*a_{n}} \]

 

Where we take each number in our data set and multiply them, then take the nth root of this multiplied value. The nth root indicates the sample size, or n, of our data set.

This means that our result would be ,

 

    \[ \sqrt[5]{5*10*20*40*80} \]

 

    \[ \sqrt[5]{3 \thickspace 200 \thickspace000} \]

 

    \[ 20 \]

 

Here, we can see that the geometric mean is actually equal to the median, which is the middle point of our data. Looking at the line plot below, we can see this is much closer to the centre point of our data set than the AM.

Line Plot Geometric Mean

 

Harmonic Mean

The harmonic mean is the third type of mean and probably the one you will either use the least or not at all. Also known as the HM, this mean uses reciprocals to calculate the relationship between fractions.

As a reminder, a reciprocal is basically 1 divided by a number, or  \dfrac{1}{n},  which basically equates to the numerator and denominator trading places. For example,

 

    \[ \dfrac{4}{1} = \dfrac{1}{4} \]

 

Or,

 

    \[ \dfrac{2}{3} = \dfrac{3}{2} \]

 

The formula for the HM is a bit complicated, as you can see below.

 

    \[ H = \frac{n}{\dfrac{1}{x_{1}} + \dfrac{1}{x_{2}} + \dotsm + \dfrac{a}{x_{n}}} \]

 

    \[ = \frac{n}{\Sigma_{i=1}^{n}{\dfrac{1}{x_{i}}}} \]

 

Don’t worry about not understanding the intricacies of this formula, chances are you’ll probably never need it. It’s helpful to see, however, and to know that there are three Pythagorean Means.

 

Problem 1: Choosing Which Mean to Use

As we’ve learned, when calculating the mean, there are generally three basic types of means from which to choose from. While we’ve guided you through scenarios that involve both the arithmetic and geometric mean, in real life you’ll rarely have someone telling you which one is the correct one to use.

Given the following data table, decide which mean would be more appropriate to use by describing the type of relationship there exists between the data points. Then, calculate it.

ObservationValue
13
27
320
455
5148

 

Solution to Problem 1

In this problem, we were tasked with:

  • Describing the relationship between the data points
  • Choosing which mean formula to use
  • Calculate the mean

As we can see from the values, there is no clear linear relationship between them, which suggests a multiplicative relationship. This becomes even more apparent when we graph the numbers.

 

Exponential Series

 

In fact, this relationship has a specific name, called an exponential relationship. This is because it grows at an exponential rate, as we can see form the graph. Therefore, using the geometric mean will be appropriate here. You should have followed steps similar to the ones below.

StepCalculation
1. Multiply all the values in the data set

    \[ 3*7*20*55*148 = 3 \thickspace 269 \thickspace 017 \]

2. Take the nth root of this multiplied number

    \[ \sqrt[5]{3 \thickspace 269 \thickspace 017} = 20 \]

 

Did you like the article?

1 Star2 Stars3 Stars4 Stars5 Stars (1 votes, average: 5.00 out of 5)
Loading...

Danica

Located in Prague and studying to become a Statistician, I enjoy reading, writing, and exploring new places.

Did you like
this resource?

Bravo!

Download it in pdf format by simply entering your e-mail!

{{ downloadEmailSaved }}

Your email is not valid

Leave a Reply

avatar
  Subscribe  
Notify of