Finding Measures of Central Tendency for Grouped Data

In the previous sections introducing the mean, median and mode, we discussed how descriptive statistics are generally divided into measures of central tendency and measures of variability. Here, we expand on what you learned about measures of central tendency by showing how to calculate the mean, median and mode for grouped data.

Basic Measures of Central Tendency

Measures of central tendency capture and describe the centre of a variable. They are used to understand and illustrate the most “typical” value or values of a data set. The table below gives a brief overview of these measures.

| Measure | Definition | Formula | Calculation |
| --- | --- | --- | --- |
| Mean | The average | \( \bar{x} = \frac{\Sigma x_{i}}{n} \) | 1. Add all the observations. 2. Divide that sum by the sample size, n. |
| Median | The midpoint of the data, where half the observations fall above and half below | No standard formula | 1. Order the observations from least to greatest. 2. Take the middle value. 3. If the number of observations is even, take the average of the two middle values. |
| Mode | The most frequently occurring value of a variable | No standard formula | 1. Calculate the frequency of each value. 2. The value with the highest frequency is the mode. |
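The calculation steps for the three measures can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the `sample` list are made up for this example and do not come from the article.

```python
from collections import Counter


def mean(values):
    """Add all the observations, then divide by the sample size n."""
    return sum(values) / len(values)


def median(values):
    """Order the observations and take the middle value.

    If the number of observations is even, average the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    """Count the frequency of each value and return the most frequent one."""
    counts = Counter(values)
    return counts.most_common(1)[0][0]


sample = [2, 3, 3, 5, 7, 10]  # hypothetical data for illustration
print(mean(sample))    # 5.0
print(median(sample))  # 4.0
print(mode(sample))    # 3
```

If several values share the highest frequency, this sketch returns only one of them.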


Grouped Data

So far, we’ve worked with data sets that list individual observations. That is, every observed value can be read directly from the data. For example, take the table below.

| Observation | Frequency |
| --- | --- |
| 60 | 1 |
| 61 | 3 |
| 62 | 2 |
| 63 | 5 |
| 64 | 7 |
| 65 | 2 |
| 66 | 1 |

Written out, this data would look like the following:

60, 61, 61, 61, 62, 62, 63, 63, 63, 63, 63, 64, 64, 64, 64, 64, 64, 64, 65, 65, 66

Here, finding the mean, median and mode is easy.

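| Measure | Calculation |
| --- | --- |
| Mean | \( \bar{x} = \frac{\Sigma x_{i}}{n} = \frac{1326}{21} = 63.1 \) |
| Median | The midpoint of the data (the 11th of 21 ordered observations) is 63 |
| Mode | The most frequently occurring value in the data set is 64 |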
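As a quick check of these values, the frequency table can be expanded into individual observations and passed to Python's standard `statistics` module. The variable names below are illustrative choices, not part of the article.

```python
import statistics

# Frequency table from the example above: observation -> frequency
frequency_table = {60: 1, 61: 3, 62: 2, 63: 5, 64: 7, 65: 2, 66: 1}

# Write the table out as individual observations: 60, 61, 61, 61, 62, ...
observations = [
    value
    for value, count in frequency_table.items()
    for _ in range(count)
]

n = len(observations)
total = sum(observations)
print(n, total)                                 # 21 1326
print(round(statistics.mean(observations), 1))  # 63.1
print(statistics.median(observations))          # 63
print(statistics.mode(observations))            # 64
```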

However, data doesn’t always come packaged with the individual observations listed. In addition, sometimes you’re not necessarily interested in understanding individual data but rather groups within the data.

Grouped data are observations that have been placed into groups, normally intervals of some sort. Examples include age groups, height groups, time groups and more. Categorical data can also be grouped, for example the frequency of each colour group in a paint store, but grouped measures of central tendency make more intuitive sense for quantitative variables.

In the table below, you’ll find an example of grouped data for age groups.

| Age Groups | Frequency |
| --- | --- |
| 0 - 10 | 40 |
| 10 - 20 | 53 |
| 20 - 30 | 58 |
| 30 - 40 | 64 |
| 40 - 50 | 72 |
| 50 - 60 | 49 |
| 60 - 70 | 36 |
| 70 - 80 | 25 |

Keep in mind that each group includes its lower limit but not its upper limit. For example, 0 is included in the group 0 - 10, but 10 is not; the last value in that group is 9. Each group therefore contains 10 whole-number values (0 through 9 in the first group), because we start counting at the lower limit.
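To make the interval convention concrete, here is a small sketch that assigns a whole-number age to its group. The helper name `age_group` and its `width` parameter are hypothetical, introduced only for this illustration.

```python
def age_group(age, width=10):
    """Return the (lower, upper) limits of the group containing `age`.

    The lower limit is included in the group; the upper limit is not.
    """
    lower = (age // width) * width
    return lower, lower + width


print(age_group(0))        # (0, 10)
print(age_group(9))        # (0, 10)
print(age_group(10))       # (10, 20)
print(len(range(0, 10)))   # 10 values in the group: 0, 1, ..., 9
```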

Grouped Mean

Finding the grouped mean is straightforward: follow the formula below.

    \[ x_{group} = \frac{\Sigma(f_{i}*x_{m})}{n} \]

The table below contains the explanation of the notation.

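| Element | Description |
| --- | --- |
| \( x_{group} \) | Group mean |
| \( f_{i} \) | The frequency of the \( i^{th} \) group |
| \( x_{m} \) | The midpoint of the \( i^{th} \) group |
| n | The sample size (the sum of all frequencies) |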

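Using the example from above, we get the group mean by performing the following steps.

| Age Groups | Frequency \( f_{i} \) | Midpoint \( x_{m} \) | \( f_{i} * x_{m} \) |
| --- | --- | --- | --- |
| 0 - 10 | 40 | (0 + 9) / 2 = 4.5 | 40 * 4.5 = 180 |
| 10 - 20 | 53 | 14.5 | 768.5 |
| 20 - 30 | 58 | 24.5 | 1421 |
| 30 - 40 | 64 | 34.5 | 2208 |
| 40 - 50 | 72 | 44.5 | 3204 |
| 50 - 60 | 49 | 54.5 | 2670.5 |
| 60 - 70 | 36 | 64.5 | 2322 |
| 70 - 80 | 25 | 74.5 | 1862.5 |
| Total | 397 | | 14636.5 |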

Plugging this into the formula, we get,

    \[ x_{group} = \frac{14636.5}{397} = 36.9 \]

We obtain 36.9, meaning the estimated mean falls in the 30 - 40 group.
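The grouped mean calculation can be reproduced with a short sketch. It assumes whole-number ages with the upper limit excluded, so the midpoint of 0 - 10 is (0 + 9) / 2 = 4.5 as in the worked example; the function name `grouped_mean` is an illustrative choice.

```python
# (lower limit, upper limit, frequency) for each age group
groups = [
    (0, 10, 40), (10, 20, 53), (20, 30, 58), (30, 40, 64),
    (40, 50, 72), (50, 60, 49), (60, 70, 36), (70, 80, 25),
]


def grouped_mean(groups):
    """Estimate the mean of grouped data as sum(f_i * x_m) / n.

    Assumption: values are whole numbers and the upper limit is excluded,
    so the midpoint of a group is (lower + (upper - 1)) / 2.
    """
    n = sum(freq for _, _, freq in groups)
    weighted_sum = sum(
        freq * (lower + (upper - 1)) / 2 for lower, upper, freq in groups
    )
    return weighted_sum / n


print(round(grouped_mean(groups), 1))  # 36.9
```

If the data were continuous rather than whole numbers, the midpoint would usually be taken as (lower + upper) / 2 instead, which shifts the estimate by 0.5.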

Grouped Median

Finding the median for grouped data also requires a different process. To find the group median, use the formula below.

    \[ x_{med} = L + \frac{\frac{n}{2} - B}{G} * w \]

The table below contains the explanation of the notation.

| Element | Description |
| --- | --- |
| \( x_{med} \) | Group median |
| L | The lower limit of the median group |
| n | The sample size |
| B | The cumulative frequency of all groups below the median group |
| G | The frequency of the group with the median |
| w | The width of the groups |

Using the same example, the median is the observation at roughly the middle position of the total frequency.

    \[ \dfrac{397}{2} = 198.5 \]

The 199th observation falls somewhere in the group 30 - 40 (in practice, ages 30 to 39): 151 observations lie below this group, and 215 lie at or below its end. We can make this estimate because the groups are in order.

The cumulative frequency can be found in the table below.

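| Age Groups | Frequency \( f_{i} \) | Cumulative frequency |
| --- | --- | --- |
| 0 - 10 | 40 | 40 |
| 10 - 20 | 53 | 40 + 53 = 93 |
| 20 - 30 | 58 | 40 + 53 + 58 = 151 |
| 30 - 40 | 64 | 215 |
| 40 - 50 | 72 | 287 |
| 50 - 60 | 49 | 336 |
| 60 - 70 | 36 | 372 |
| 70 - 80 | 25 | 397 |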
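The running totals can also be computed with `itertools.accumulate` from Python's standard library. The sketch below then locates the median group as the first group whose cumulative frequency reaches n / 2; the variable names are illustrative.

```python
from itertools import accumulate

labels = ["0 - 10", "10 - 20", "20 - 30", "30 - 40",
          "40 - 50", "50 - 60", "60 - 70", "70 - 80"]
frequencies = [40, 53, 58, 64, 72, 49, 36, 25]

# Running total of the frequencies, one entry per group
cumulative = list(accumulate(frequencies))
print(cumulative)  # [40, 93, 151, 215, 287, 336, 372, 397]

# The median group is the first group whose cumulative frequency reaches n / 2
half = cumulative[-1] / 2
median_index = next(i for i, c in enumerate(cumulative) if c >= half)
print(labels[median_index])  # 30 - 40

# Cumulative frequency of all groups below the median group (B in the formula)
below = cumulative[median_index - 1] if median_index > 0 else 0
print(below)  # 151
```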

From the previous calculations, we get the following values.

| Element | Value |
| --- | --- |
| L | 30 |
| n | 397 |
| B | 151 |
| G | 64 |
| w | 10 |

Plugging these values into the formula, we get

    \[ x_{med} = 30 + \frac{\frac{397}{2} - 151}{64} * 10 = 37.4 \]

Our estimate of the median is about 37.
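The whole median estimate can be wrapped in one function, sketched below under the assumption that all groups share the same width. The name `grouped_median` and its parameters are hypothetical; the comments map each variable to the symbols in the formula.

```python
def grouped_median(lower_limits, frequencies, width):
    """Estimate the median of grouped data: L + ((n / 2 - B) / G) * w."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("at least one observation is required")
    half = n / 2
    below = 0  # B: cumulative frequency of the groups before the current one
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and below + freq >= half:
            # `lower` is L and `freq` is G for the median group
            return lower + (half - below) / freq * width
        below += freq
    raise ValueError("median group not found")


lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
frequencies = [40, 53, 58, 64, 72, 49, 36, 25]
print(round(grouped_median(lower_limits, frequencies, 10), 1))  # 37.4
```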

Grouped Mode

The formula for the mode of grouped data is as follows.

    \[ x_{mode} = L + \frac{ f_{m} - f_{m-1} }{ (f_{m} - f_{m-1}) + (f_{m} - f_{m+1}) } * w \]

The table below contains the explanation of the notation.

| Element | Description |
| --- | --- |
| \( x_{mode} \) | Group mode |
| L | The lower limit of the group with the mode (the group with the highest frequency) |
| \( f_{m} \) | Frequency of the group with the mode |
| \( f_{m-1} \) | Frequency of the group before the one with the mode |
| \( f_{m+1} \) | Frequency of the group after the one with the mode |
| w | The width of the groups |

Using the same example, we get the following.

| Element | Value |
| --- | --- |
| L | 40 |
| \( f_{m} \) | 72 |
| \( f_{m-1} \) | 64 |
| \( f_{m+1} \) | 49 |
| w | 10 |

Plugging this into the formula, we get

    \[ x_{mode} = 40 + \frac{ 72 - 64 }{ (72 - 64) + (72 - 49) } * 10 = 42.6 \]

This gives us a mode of about 43.
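The mode formula can be sketched the same way. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring group as having frequency 0 (when the modal group is the first or last one) is an assumption of this sketch that the article does not discuss.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Estimate the mode of grouped data from the modal group and its neighbours."""
    m = frequencies.index(max(frequencies))  # index of the modal group
    f_m = frequencies[m]
    # Assumption: a missing neighbour counts as frequency 0
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    denominator = (f_m - f_prev) + (f_m - f_next)
    if denominator == 0:
        raise ValueError("mode is undefined when all three frequencies are equal")
    return lower_limits[m] + (f_m - f_prev) / denominator * width


lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
frequencies = [40, 53, 58, 64, 72, 49, 36, 25]
print(round(grouped_mode(lower_limits, frequencies, 10), 1))  # 42.6
```

If two groups tie for the highest frequency, this sketch uses the first one.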

 

Practice Problem 1

Find the group mean of the following data.

| Scores | Frequency |
| --- | --- |
| 1 - 20 | 5 |
| 21 - 40 | 20 |
| 41 - 60 | 47 |
| 61 - 80 | 15 |
| 81 - 100 | 3 |

Practice Problem 2

Find the group mode of the following data.

| Students | Frequency |
| --- | --- |
| 1 - 3 | 135 |
| 4 - 6 | 457 |
| 7 - 9 | 549 |
| 10 - 12 | 392 |

Solution Problem 1

Follow the steps below to find the solution.

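| Scores | Frequency \( f_{i} \) | Midpoint \( x_{m} \) | \( f_{i} * x_{m} \) |
| --- | --- | --- | --- |
| 1 - 20 | 5 | (1 + 20) / 2 = 10.5 | 5 * 10.5 = 52.5 |
| 21 - 40 | 20 | 30.5 | 610 |
| 41 - 60 | 47 | 50.5 | 2373.5 |
| 61 - 80 | 15 | 70.5 | 1057.5 |
| 81 - 100 | 3 | 90.5 | 271.5 |
| Total | 90 | | 4365 |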

    \[ x_{group} = \frac{4365}{90} = 48.5 \]

The group mean is 48.5, which falls in the 41 - 60 group.
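Here both limits of each score group are included, so the midpoint is simply (lower + upper) / 2. A brief sketch to check the arithmetic, with illustrative variable names:

```python
# (lower limit, upper limit, frequency); both limits are included in each group
score_groups = [(1, 20, 5), (21, 40, 20), (41, 60, 47), (61, 80, 15), (81, 100, 3)]

n = sum(freq for _, _, freq in score_groups)
weighted_sum = sum(freq * (lower + upper) / 2 for lower, upper, freq in score_groups)

print(n)                 # 90
print(weighted_sum)      # 4365.0
print(weighted_sum / n)  # 48.5
```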

Solution Problem 2

The estimation of the mode can be found by following the steps below.

| Element | Value |
| --- | --- |
| L | 7 |
| \( f_{m} \) | 549 |
| \( f_{m-1} \) | 457 |
| \( f_{m+1} \) | 392 |
| w | 3 |

    \[ x_{mode} = 7 + \frac{ 549 - 457 }{ (549 - 457) + (549 - 392) } * 3 = 8.1 \]

The estimated mode is therefore about 8.1.
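The values in the table can be read off the data programmatically as well. This is a minimal sketch with illustrative names; it takes the width as the count of whole-number values in the modal group (7, 8 and 9).

```python
# (lower limit, upper limit, frequency); both limits are included in each group
student_groups = [(1, 3, 135), (4, 6, 457), (7, 9, 549), (10, 12, 392)]
frequencies = [freq for _, _, freq in student_groups]

m = frequencies.index(max(frequencies))  # modal group: 7 - 9
lower_limit, upper_limit, f_m = student_groups[m]
width = upper_limit - lower_limit + 1    # 3 whole-number values
f_prev = frequencies[m - 1]              # 457
f_next = frequencies[m + 1]              # 392

mode_estimate = lower_limit + (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next)) * width
print(round(mode_estimate, 1))  # 8.1
```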

 


Danica

Located in Prague and studying to become a Statistician, I enjoy reading, writing, and exploring new places.
