A Guide to Variance

In previous sections of this guide on descriptive statistics, you learned the fundamentals of variance. Specifically, we taught you what variance is, its important role in statistics and how to calculate it. Here, we’ll give a brief overview of all these things and compare variance to other measures of variability.

What is Variance

Like all measures of variability, variance strives to capture the dispersion of a variable. Many people fall into the trap of associating variability with undesirability simply because, in the real world, variability is something we try to fix.

After all, it wouldn’t be too pleasant if your coffee tasted wildly different from the norm every time you bought it. While we’ll go over the specifics of interpretation later in this overview, it’s important to understand variance as a simple measure of dispersion as you dive into learning how to use it.

A basic definition of variance is that it captures how far data points are spread from their mean. A bigger number, or a high variance, suggests that the data are spread further from the centre point of the data set. This might be easier to grasp through an example. In the table below, you’ll find data on two different bags of marbles bought from a toy store.

Marble Colour	Bag 1	Bag 2
Blue	5	6
Red	7	4
Orange	8	15
Yellow	6	1

The variance of each bag is as follows. Don’t worry about the calculation, which we’ll show you in the next section. Here, focus on understanding what the variance is and why it is important in statistics.

	Bag 1	Bag 2
Variance	1.7	36.3

Here, the bigger variance means that the number of marbles per colour is spread further from the bag’s average. This can easily be seen by looking at the data set and noticing that the first bag has a more uniform number of marbles per colour than the second one.

As you can imagine, variance is a concept with a wide range of applications in statistics and beyond.

How to Calculate Variance

Now that we’ve shown you what variance is, we’ll guide you through how to calculate it. As you know from our lessons on populations and samples, measures are calculated differently for parameters and statistics. As a brief recap, parameters are calculated from the population while statistics are calculated from samples.

In the table below, you’ll find the formulas for variance for the population and for a sample.

	Population	Sample
Variance Notation	\[ \sigma^2 \]	\[ s^2 \]
Variance Formula	\[ \frac{\Sigma(x_{i}-\mu)^2}{N} \]	\[ \frac{\Sigma(x_{i}-\bar{x})^2}{n-1} \]
Mean	\[ \mu = \frac{\Sigma(x_{i})}{N} \]	\[ \bar{x} = \frac{\Sigma(x_{i})}{n} \]
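
The two formulas differ only in their divisor. As a rough sketch, Python's standard `statistics` module offers both: `pvariance` divides by N and `variance` divides by n - 1. The data are Bag 1 from the marble table; treating the same four numbers first as a whole population and then as a sample is an assumption made purely for illustration.

```python
import statistics

bag_1 = [5, 7, 8, 6]  # marbles per colour in Bag 1

# Population formula: sum of squared deviations divided by N
population_variance = statistics.pvariance(bag_1)

# Sample formula: sum of squared deviations divided by n - 1
sample_variance = statistics.variance(bag_1)

print(f"population variance: {population_variance:.2f}")
print(f"sample variance:     {sample_variance:.2f}")
```

The sum of squared deviations is 5 in both cases, so the population version gives 5/4 = 1.25 while the sample version gives 5/3, or about 1.67.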

As you can see from the formula, you must first calculate the mean and subtract it from each individual observation in the data. Next, you square each of those differences, sum the squared values and divide the sum by the sample size minus one (or by the population size, if you are working with a whole population). This can sound arbitrary, so we’ll take the example above and break down how to calculate the variance step by step.

Step	Calculation for Bag 1	Calculation for Bag 2
1. Calculate the mean	\[ \bar{x} = \dfrac{5+7+8+6}{4} = \dfrac{26}{4} = 6.5 \]	\[ \bar{x} = \dfrac{6+4+15+1}{4} = \dfrac{26}{4} = 6.5 \]
2. Subtract the mean from every observation	\[ x_{i}-\bar{x} \]	\[ x_{i}-\bar{x} \]
3. Square each subtracted value	\[ (x_{i}-\bar{x})^2 \]	\[ (x_{i}-\bar{x})^2 \]
4. Sum all squared values	\[ \Sigma(x_{i}-\bar{x})^2 = 5 \]	\[ \Sigma(x_{i}-\bar{x})^2 = 109 \]
5. Divide the sum by the sample size $n$ minus 1	\[ \dfrac{5}{4-1} \approx 1.7 \]	\[ \dfrac{109}{4-1} \approx 36.3 \]

As we can see, we’ve arrived at the same variances shown in the previous section. While it’s unlikely you’ll have to calculate variance by hand, given how much software is available to do the job for you, it’s helpful to understand the process behind the formula.
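
The five steps can also be written out in a few lines of Python. This is a minimal sketch that keeps every intermediate result visible; the function name `sample_variance_steps` is made up for this example.

```python
def sample_variance_steps(data):
    """Return the intermediate results of the five steps for one data set."""
    n = len(data)
    mean = sum(data) / n                        # step 1: calculate the mean
    deviations = [x - mean for x in data]       # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]      # step 3: square each difference
    total = sum(squared)                        # step 4: sum the squared values
    variance = total / (n - 1)                  # step 5: divide by n - 1
    return mean, deviations, squared, total, variance


bags = {"Bag 1": [5, 7, 8, 6], "Bag 2": [6, 4, 15, 1]}
for name, counts in bags.items():
    mean, deviations, squared, total, variance = sample_variance_steps(counts)
    print(f"{name}: mean = {mean}, deviations = {deviations}")
    print(f"{name}: sum of squares = {total}, variance = {variance:.1f}")
```

For Bag 1 the deviations are -1.5, 0.5, 1.5 and -0.5, which square and sum to 5. For Bag 2 they are -0.5, -2.5, 8.5 and -5.5, which square and sum to 109.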

Effect of Changing Units

As with all measures of central tendency and variability, the question of what happens when you change the units of the data can come up. Changing units simply means transforming the data points in your data set by performing common operations such as subtraction, addition, multiplication and division. More advanced transformations involve taking the logarithm or power of each data point.

There are many reasons why someone may want to transform their data. Some reasons include:

Transforming the data to fit a more convenient distribution
Wanting to display the data in more understandable units
Needing to change the data to convert it into a new variable

In the table below, you’ll find an example of each of the aforementioned scenarios.

Reason	Example
Transforming the data to fit a more convenient distribution	Changing the data to fit a normal distribution
Wanting to display the data in more understandable units	Receiving or measuring data in imperial units and needing units using the metric system
Needing to change the data to convert it into a new variable	Combining height and weight data (weight divided by height squared) to form a new variable, body mass index (BMI)
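
As a rough illustration of the three scenarios, the sketch below transforms a tiny data set in Python. The heights and weights are invented for this example, and the logarithm is only one commonly tried transformation for skewed data, not a guaranteed route to a normal distribution.

```python
import math

# Hypothetical measurements, invented purely for illustration
heights_in = [60.0, 65.0, 70.0]      # inches
weights_lb = [110.0, 150.0, 180.0]   # pounds

# Display the data in other units: imperial to metric
heights_m = [h * 0.0254 for h in heights_in]        # 1 inch = 0.0254 m
weights_kg = [w * 0.45359237 for w in weights_lb]   # 1 lb = 0.45359237 kg

# Convert the data into a new variable: BMI = weight (kg) / height (m) squared
bmi = [w / h ** 2 for w, h in zip(weights_kg, heights_m)]

# Reshape the distribution: a log transform of each data point
log_weights = [math.log(w) for w in weights_kg]

for h, w, b in zip(heights_m, weights_kg, bmi):
    print(f"height = {h:.2f} m, weight = {w:.1f} kg, BMI = {b:.1f}")
```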

If you perform basic operations, such as addition, subtraction, multiplication and division, there are a couple of shortcuts to keep in mind if you merely want to know the measures of central tendency and variability. For addition and subtraction, the rules are recorded in the table below.

When adding or subtracting a constant	Effect on the Measure
Mean, Median, Mode	Add or subtract that constant
Standard Deviation, Variance, Average Deviation, IQR	No effect, they stay the same

For multiplication and division, these changes can be found in the table below.

When multiplying or dividing by a constant	Effect on the Measure
Mean, Median, Mode, Standard Deviation, Average Deviation, IQR	Multiply or divide by that constant
Variance	Multiply or divide by the square of that constant
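
A short derivation, sketched here for the sample formula, shows where the variance rules come from. The constants \( a \) and \( b \) and the transformed values \( y_{i} \) are notation introduced only for this sketch.

\[ y_{i} = a x_{i} + b \quad \Rightarrow \quad \bar{y} = a\bar{x} + b \]

\[ y_{i} - \bar{y} = a(x_{i} - \bar{x}) \]

\[ s_{y}^2 = \frac{\Sigma(y_{i}-\bar{y})^2}{n-1} = \frac{a^2 \, \Sigma(x_{i}-\bar{x})^2}{n-1} = a^2 s_{x}^2 \]

The added constant \( b \) cancels when the mean is subtracted, while the multiplier \( a \) is squared along with each deviation.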

Notice that when adding and subtracting a constant, measures of variability don’t change. On the other hand, when multiplying or dividing by a constant, the measures of variability do change, with the variance changing in a unique way. Let’s take the example used in the previous section and add 3 to every data point.

Marble Colour	Bag 1
Blue	5+3 = 8
Red	7+3 = 10
Orange	8+3 = 11
Yellow	6+3 = 9

To calculate the variance, we would normally follow the process of finding the mean, summing all the squared differences and dividing by the sample size minus 1.

\[ \bar{x} = \dfrac{8+10+11+9}{4} = \dfrac{38}{4} = 9.5 \]

\[ \Sigma(x_{i}-\bar{x})^2 = 5 \]

\[ s^2 = \dfrac{5}{4-1} \approx 1.7 \]

As you can see, the variance didn’t change from the previous example. Instead of adding three to each observation and recalculating, we could have used the rules above to state straight away that the variance is unchanged.
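
The same check can be run in Python. This is a small sketch using the standard `statistics` module and the Bag 1 data, with 3 added to every value.

```python
import math
import statistics

bag_1 = [5, 7, 8, 6]
shifted = [x + 3 for x in bag_1]   # add 3 to every data point

original_variance = statistics.variance(bag_1)
shifted_variance = statistics.variance(shifted)

# The mean moves by the constant, the variance does not move at all
print("means:    ", statistics.mean(bag_1), statistics.mean(shifted))
print("variances:", round(original_variance, 1), round(shifted_variance, 1))
print("unchanged:", math.isclose(original_variance, shifted_variance))
```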

Using the rules above, we can measure the variance of the same data set if each data point were multiplied or divided by 3. Instead of performing these operations and calculating the variance again, we simply do the following.

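	Multiply by 3	Divide by 3
Rule	Multiply the variance by the square of that constant	Divide the variance by the square of that constant
New variance	\[ s^2 \times 3^2 = \dfrac{5}{3} \times 9 = 15 \]	\[ \dfrac{s^2}{3^2} = \dfrac{5}{3} \div 9 \approx 0.19 \]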

As you can imagine, remembering these rules can save you a lot of time. If you’re sceptical, calculate the new variances by hand and compare your answer to the ones above.
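
If you would rather let a computer do the comparison, the sketch below rescales Bag 1 and checks the shortcut against a full recalculation. It uses `math.isclose` because floating-point results can differ in the last decimal places.

```python
import math
import statistics

bag_1 = [5, 7, 8, 6]
variance = statistics.variance(bag_1)   # 5/3, about 1.7

tripled = [x * 3 for x in bag_1]        # every data point multiplied by 3
thirds = [x / 3 for x in bag_1]         # every data point divided by 3

var_tripled = statistics.variance(tripled)
var_thirds = statistics.variance(thirds)

# Shortcut: multiply or divide the original variance by 3 squared
print(round(var_tripled, 2), math.isclose(var_tripled, variance * 3 ** 2))
print(round(var_thirds, 2), math.isclose(var_thirds, variance / 3 ** 2))

# The standard deviation scales by the constant itself, not its square
print(math.isclose(statistics.stdev(tripled), 3 * statistics.stdev(bag_1)))
```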

How to Interpret Variance

Interpreting the variance is all about context. While we might be tempted to generalize and say that big variances mean bigger spread, this rule only makes sense when we take a look at our data set.

For example, 1 000 might seem like a high number for a variance. However, if the mean is in the millions, it doesn’t seem so abnormal anymore. In this case, our variance would be pretty small and indicate that the values are spread closely around the mean.
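
One way to make this comparison concrete is to take the square root of the variance, which returns it to the units of the data, and set it against the mean. The sketch below does this for a variance of 1 000 and two hypothetical means chosen only to contrast the scales.

```python
import math

variance = 1_000
standard_deviation = math.sqrt(variance)   # back in the data's own units

# Two hypothetical means, chosen only to contrast the scales
for mean in (50, 2_000_000):
    ratio = standard_deviation / mean
    print(f"mean = {mean:>9,}: spread is {ratio:.4%} of the mean")
```

With a mean of 50 the spread is more than half the size of the mean, whereas with a mean of two million it is a tiny fraction of a percent.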

Covariance

Covariance, unlike variance, tells us the joint variability of a pair of variables. In other words, it describes how the spread of one variable relates to the spread of another. This gives us a hint as to how the variables are related and change with one another.

For example, if we were to take the covariance of weight and height, we would most likely find that the variables change in the same direction and are closely related. This means both that:

The higher the height, the higher the weight
Height has a strong relationship with weight

You can find the formulas for covariance below.

	Sample	Population
Covariance	\[ Cov(X,Y) = \frac{\Sigma(x_{i}-\bar{x})(y_{i}-\bar{y})}{n-1} \]	\[ Cov(X,Y) = \frac{\Sigma(x_{i}-\mu_{x})(y_{i}-\mu_{y})}{N} \]
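
The covariance formulas translate directly into a short Python function. This is a minimal sketch: the function name `covariance`, its `sample` flag and the height and weight figures are all invented for this example.

```python
def covariance(xs, ys, sample=True):
    """Covariance of paired observations.

    Divides by n - 1 when sample is True, and by N otherwise.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - 1 if sample else n)


# Hypothetical paired data, invented purely for illustration
heights_cm = [150, 160, 170, 180]
weights_kg = [50, 58, 69, 83]

print("sample covariance:    ", round(covariance(heights_cm, weights_kg), 1))
print("population covariance:", covariance(heights_cm, weights_kg, sample=False))
```

The positive result reflects that taller people in this made-up data set are also heavier. Passing the same list twice, as in `covariance(bag, bag)`, returns the variance of that list, which shows how the two measures are connected.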

Variance Versus Other Measures of Variability

Variance is most closely related to standard deviation, which is the square root of the variance. Each measure tells us something about how the data are spread but with slight differences, which are summarized below.

	Variance	Standard Deviation
Goal	Describes the variability of observations within a data set	Describes the spread around the centre point of the data set
Units	Squared units of the data set	The same units as the data set
Interpretation	How far the data points are spread from the mean	Tells us how typical values are given the mean
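
The difference in units is easy to see with the marble bags. In this sketch, which uses Python's standard `statistics` module, the variance is in squared marbles while the standard deviation is in marbles.

```python
import math
import statistics

bags = {"Bag 1": [5, 7, 8, 6], "Bag 2": [6, 4, 15, 1]}

for name, counts in bags.items():
    variance = statistics.variance(counts)   # squared units of the data
    std_dev = statistics.stdev(counts)       # same units as the data
    # The standard deviation is the square root of the variance
    assert math.isclose(std_dev ** 2, variance)
    print(f"{name}: variance = {variance:.1f}, standard deviation = {std_dev:.1f}")
```

For Bag 2, a standard deviation of about 6 marbles is arguably easier to read against a mean of 6.5 marbles than a variance of about 36.3.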

Emma

I am passionate about travelling and currently live and work in Paris. I like to spend my time reading, gardening, running, learning languages and exploring new places.
