Coefficient of Variation

In previous sections, we showed that the two major kinds of descriptive statistics are measures of central tendency and measures of variability. You learned how to calculate both and tested that knowledge with problems and step-by-step solutions. Here, you will learn how to calculate another measure of variability: the coefficient of variation.

What is the Coefficient of Variation?

While the discipline of statistics can help put data into order, dealing with data can be far from orderly. One of the most common types of analysis in fields like history, medicine and psychology is meta-analysis. At its most basic, a meta-analysis is a review of a diverse range of past studies on a single subject.

The difficult thing about comparing different studies, or even two sets of data, is that you will rarely get data that possess the same characteristics, such as units of measurement, mean or sample size. You may be thinking that comparing two or more data sets by way of their standard deviations will solve the problem. Standard deviation measures the spread, after all.

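However, the standard deviation is expressed in the same units as the data, and the same standard deviation can mean very different things depending on the size of the mean. It is therefore really only suited to comparing values within the same data set. A better way of comparing the variability of two or more data sets is to use the coefficient of variation.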

The coefficient of variation is defined as the ratio of the standard deviation to the mean. The ratio is the same for a population and for a sample; only the symbols differ, as shown below.

 

CV for the Population:

    \[ CV \thickspace = \frac{\sigma}{\mu} \times 100\% \]

CV for a Sample:

    \[ CV \thickspace = \frac{s}{\bar{x}} \times 100\% \]
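As a rough illustration of the two formulas, here is a minimal Python sketch built on the standard library `statistics` module. The function names `cv_population` and `cv_sample` and the list `values` are invented for this example and are not part of the article.

```python
import statistics

def cv_population(data):
    """Population CV as a percentage: sigma / mu * 100."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)  # population SD, divides by N
    return sigma / mu * 100

def cv_sample(data):
    """Sample CV as a percentage: s / x-bar * 100."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample SD, divides by n - 1
    return s / x_bar * 100

# Hypothetical values, chosen only so the arithmetic is easy to follow.
values = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(cv_population(values), 1))  # 40.0
print(round(cv_sample(values), 1))      # 42.8
```

Both functions assume the mean is not zero, since the ratio is undefined in that case.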

You may be wondering why the formula uses the standard deviation when we just said it is a poor tool for comparing data sets. The reason is that dividing by the mean expresses the spread relative to the size of the values, so the coefficient of variation tells us about the variability of the data regardless of units, sample size, and more. The standard deviation measures the typical distance between the data and the mean, whereas the coefficient of variation is the ratio of the standard deviation to the mean.
This means that we can report what proportion the standard deviation is of the mean. The coefficient of variation can be reported as a percentage. For example, if we have a standard deviation of 1.5 and a mean of 5, the ratio of the standard deviation to the mean is 0.3. In other words, the standard deviation is 30% of the mean. When comparing two data sets, the general rule of thumb you should follow is:

  • The higher the coefficient of variation, the higher the variability of the data set

This means that, when comparing two or more data sets, the one with the highest coefficient of variation can be said to have the highest variability relative to its mean.
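To see why the coefficient of variation does not depend on units, consider the following sketch. The heights are hypothetical values made up for illustration, and `cv_percent` is a helper name assumed for this example. The same measurements are expressed in centimetres and in metres: the standard deviation changes with the unit, while the coefficient of variation stays the same.

```python
import statistics

# Hypothetical heights, recorded once in centimetres and once in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

def cv_percent(data):
    """Sample coefficient of variation as a percentage."""
    return statistics.stdev(data) / statistics.mean(data) * 100

sd_cm = statistics.stdev(heights_cm)
sd_m = statistics.stdev(heights_m)

print(round(sd_cm, 2), round(sd_m, 4))    # 15.81 0.1581 (depends on the unit)
print(round(cv_percent(heights_cm), 1))   # 9.3
print(round(cv_percent(heights_m), 1))    # 9.3 (unchanged by the unit)
```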


Coefficient of Variation versus Standard Deviation:

The easiest way to understand the difference between the standard deviation and the coefficient of variation is to look at an example. In the tables below, you’ll find two data sets on the number of people who went to the cinema during a given period of time.

 

Data Set A:

Day of the Week    Number of People
M                  200
T                  500
W                  300
Th                 1000
F                  400

 

Data Set B: 

Day of the Week    Number of People
M                  100
T                  300
W                  400
Th                 1000
F                  1500
Sat                500
Sun                100

 

Using the formulas for the sample standard deviation and the mean, we get:

 

Data Set A:

    \[s = 311.4\]

    \[ \bar{x} = 480\]

    \[ n \medspace (Sample \thickspace size) = 5 \]

    \[ Total \thickspace attendance = 2400 \]

 

Data Set B:

    \[s = 515.9 \]

    \[ \bar{x} = 557.1 \]

    \[ n \medspace (Sample \thickspace size) = 7 \]

    \[ Total \thickspace attendance = 3900 \]
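The summary values for both data sets can be checked with a short Python sketch. It assumes the sample standard deviation (dividing by n - 1), which is what reproduces the values of s quoted here. The helper `summarize` is a name invented for this example. It reports the number of daily observations and the total attendance separately, since the two are easy to confuse.

```python
import statistics

data_a = [200, 500, 300, 1000, 400]              # Monday to Friday
data_b = [100, 300, 400, 1000, 1500, 500, 100]   # Monday to Sunday

def summarize(data):
    """Return the number of observations, their total, the mean and the sample SD."""
    return {
        "n": len(data),
        "total": sum(data),
        "mean": statistics.mean(data),
        "s": statistics.stdev(data),  # sample SD, divides by n - 1
    }

for name, data in (("A", data_a), ("B", data_b)):
    stats = summarize(data)
    print(name, stats["n"], stats["total"],
          round(stats["mean"], 1), round(stats["s"], 1))
```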

 

Looking at the standard deviation alone tells us about the spread within each data set, which is quite high in both. However, comparing the two standard deviations directly would be risky, because data sets A and B have different sample sizes and means. The studies also covered different periods: data set A holds weekday values only, while data set B holds values for the entire week.

Calculating the coefficient of variation for both data sets, we get:

 

Data Set A: 

    \[ \dfrac{311.4}{480} = 0.65 \]

 

Data Set B:

    \[ \dfrac{515.9}{557.1} = 0.93 \]

 

Now we can see that data set B, with the higher coefficient of variation, is more variable relative to its mean. We can see this even by looking at the data itself, where attendance swings widely from day to day.
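The comparison can be reproduced from the raw attendance figures with a minimal sketch. The dictionary `attendance` and the function `coefficient_of_variation` are names assumed for this example. The sketch returns the ratio without multiplying by 100, matching the way the two values are reported here.

```python
import statistics

attendance = {
    "A": [200, 500, 300, 1000, 400],
    "B": [100, 300, 400, 1000, 1500, 500, 100],
}

def coefficient_of_variation(data):
    """Sample CV as a plain ratio (multiply by 100 for a percentage)."""
    return statistics.stdev(data) / statistics.mean(data)

cv = {name: coefficient_of_variation(values) for name, values in attendance.items()}

# The data set with the largest CV has the most spread relative to its mean.
most_variable = max(cv, key=cv.get)

for name, ratio in cv.items():
    print(f"Data set {name}: CV = {ratio:.2f}")
print("Highest relative variability:", most_variable)
```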

 

Problem 1: Comparing Coefficients of Variation

Below, you will find the mean and standard deviation of several data sets. You’re interested in comparing the data sets, but each one has a different mean, standard deviation and sample size. Find the coefficient of variation, as a percentage, for each data set in the table below. Round to the nearest tenth.

Measure        Data Set A    Data Set B    Data Set C    Data Set D
Mean           45            60            50            25
SD             3             11            5             15
Sample Size    1 500         3 200         500           2 700

Solution to Problem 1

In this problem, you were asked to:

  • Find the CV for each data set

In order to do this, we only need to plug the sample standard deviation and mean of each data set into the formula given above.

Coefficient of Variation

Data Set A:

    \[ CV \thickspace = \dfrac{3}{45} \times 100\% \]

CV = 6.7%

Data Set B:

    \[ CV \thickspace = \dfrac{11}{60} \times 100\% \]

CV = 18.3%

Data Set C:

    \[ CV \thickspace = \dfrac{5}{50} \times 100\% \]

CV = 10%

Data Set D:

    \[ CV \thickspace = \dfrac{15}{25} \times 100\% \]

CV = 60%

In this case, the data set with the lowest CV is data set A, followed by C, B and D. This means that set A has the lowest variation relative to its mean amongst these data sets, and set D has the highest.
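The same calculation can be sketched in Python using only the means and standard deviations given in the problem. The names `summary`, `cv` and `ranking` are chosen for this example. Sorting by CV gives the order of the data sets from least to most variable.

```python
# (mean, standard deviation) for each data set, taken from the problem table.
summary = {
    "A": (45, 3),
    "B": (60, 11),
    "C": (50, 5),
    "D": (25, 15),
}

# CV as a percentage, rounded to the nearest tenth.
cv = {name: round(sd / mean * 100, 1) for name, (mean, sd) in summary.items()}

# Data set names ordered from lowest to highest CV.
ranking = sorted(cv, key=cv.get)

print(cv)       # {'A': 6.7, 'B': 18.3, 'C': 10.0, 'D': 60.0}
print(ranking)  # ['A', 'C', 'B', 'D']
```

The sample sizes are not needed here, because the coefficient of variation depends only on the standard deviation and the mean.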

Danica

Located in Prague and studying to become a Statistician, I enjoy reading, writing, and exploring new places.
