## How to Calculate Variance

Now that we’ve shown you what variance is, we’ll guide you through how to calculate it. As you know from our lessons on populations and samples, measures are calculated differently for parameters and statistics. As a brief recap, parameters are calculated from the **population** while statistics are calculated from **samples**.

In the table below, you’ll find the formulas for variance for the population and for a sample.

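| | **Population** | **Sample** |
|---|---|---|
| **Variance Notation** | $\sigma^2$ | $s^2$ |
| **Variance Formula** | $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$ | $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ |
| **Mean** | $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$ | $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ |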
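To make the difference between the two formulas concrete, here is a minimal Python sketch. The function names and the `data` list are illustrative assumptions and are not part of the lesson's example; the only point is the change of divisor from N to n - 1.

```python
def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, dividing by n - 1."""
    if len(values) < 2:
        raise ValueError("a sample variance needs at least two observations")
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


# Hypothetical observations, chosen only to keep the arithmetic simple.
data = [2, 4, 4, 6]

print(population_variance(data))  # squared deviations sum to 8, divided by 4
print(sample_variance(data))      # same sum of 8, divided by 3
```

For the same data, the sample formula always gives the larger value because it divides the same sum by a smaller number.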

As you can see from the formula, you must first calculate the mean and subtract it from each individual observation in the data. Next, you square each of those differences, sum the squared values and then divide the sum by the sample size minus one (or by the population size when you are working with a whole population). This can sound arbitrary, so we’ll take the example above and break down how to calculate the variance **step by step**.

| | **Calculation for Bag 1** | **Calculation for Bag 1** |
|---|---|---|
| 1. Calculate the mean | | |
| 2. Subtract the mean from every observation | | |
| 3. Square each subtracted value | | |
| 4. Sum all squared values | | |
| 5. Divide the sum by the sample size minus 1 | | |
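The five steps can also be followed in a few lines of Python. This is only a sketch: it assumes the Bag 1 values are 5, 7, 8 and 6 (the values that appear before 3 is added in the changing-units example later on), and the variable names are our own.

```python
# Assumed Bag 1 values (blue, red, orange, yellow).
bag_1 = [5, 7, 8, 6]

# 1. Calculate the mean
mean_1 = sum(bag_1) / len(bag_1)            # 6.5

# 2. Subtract the mean from every observation
deviations = [x - mean_1 for x in bag_1]    # [-1.5, 0.5, 1.5, -0.5]

# 3. Square each subtracted value
squared = [d ** 2 for d in deviations]      # [2.25, 0.25, 2.25, 0.25]

# 4. Sum all squared values
sum_of_squares = sum(squared)               # 5.0

# 5. Divide the sum by the sample size minus 1
sample_var = sum_of_squares / (len(bag_1) - 1)

print(round(sample_var, 2))  # 1.67
```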

As we can see from the previous section, we’ve arrived at the same answer for each variance. While it’s unlikely you’ll have to calculate variance by hand, given how much software is available to do the job for you, it’s helpful to understand the process behind the formula.

## Effect of Changing Units

As with all measures of central tendency and variability, the issue of changing the units of the data can come up with variance. Changing units simply means transforming the data points in your data set by performing common operations such as subtraction, addition, multiplication and division. More advanced transformations involve taking the logarithm or a power of each data point.

There are many reasons why someone may want to **transform** their data. Some reasons include:

- Transforming the data to fit a more convenient distribution
- Wanting to display the data in more understandable units
- Needing to change the data to convert it into a new variable

In the table below, you’ll find an **example** of each of the aforementioned scenarios.

| **Reason** | **Example** |
|---|---|
| Transforming the data to fit a more convenient distribution | Changing the data to fit a normal distribution |
| Wanting to display the data in more understandable units | Receiving or measuring data in imperial units and needing it in metric units |
| Needing to change the data to convert it into a new variable | Combining height and weight data to form a new variable, body mass index (BMI), which is weight divided by height squared |

If you perform basic operations, such as addition, subtraction, multiplication and division, there are a couple of shortcuts to keep in mind if you only need the new measures of central tendency and variability. For addition and subtraction, **the rules** are recorded in the table below.

| **When adding or subtracting a constant** | **Effect on the Measure** |
|---|---|
| Mean, Median, Mode | Add or subtract that constant |
| Standard Deviation, Variance, Average Deviation, IQR | No effect, they stay the same |

For multiplication and division, these changes can be found in the table below.

| **When multiplying or dividing by a constant** | **Effect on the Measure** |
|---|---|
| Mean, Median, Mode, Standard Deviation, Average Deviation, IQR | Multiply or divide by that constant (for the measures of variability, use the absolute value of the constant so they stay non-negative) |
| Variance | Multiply or divide by the square of that constant |

Notice that when adding and subtracting a constant, measures of variability don’t change. On the other hand, when multiplying or dividing by a constant, the measures of variability do change, with the variance scaling by the square of the constant rather than by the constant itself. Let’s take the example used in the previous section and add 3 to every data point.

| **Marble Colour** | **Bag 1** |
|---|---|
| *Blue* | 5+3 = 8 |
| *Red* | 7+3 = 10 |
| *Orange* | 8+3 = 11 |
| *Yellow* | 6+3 = 9 |

To calculate the variance, we would normally follow the process of finding the mean, **summing** all the squared differences and dividing by the sample size minus 1. Here the new mean is 9.5, the squared differences are 2.25, 0.25, 2.25 and 0.25, which sum to 5, and dividing by 3 gives a variance of about 1.67.

As you can see, the variance didn’t change from the previous example. Instead of adding three to each observation and then calculating the new variance, we could have used **the rules** to state straight away that the variance does not change.
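If you would rather let software confirm the addition rule, the following sketch uses Python's standard `statistics` module, whose `variance` function applies the sample formula (dividing by n - 1). The list names are our own.

```python
from statistics import mean, variance

bag_1 = [5, 7, 8, 6]               # Bag 1 before adding 3
shifted = [x + 3 for x in bag_1]   # [8, 10, 11, 9]

# The mean moves by the constant that was added...
print(mean(bag_1), mean(shifted))  # 6.5 9.5

# ...while the sample variance stays exactly where it was.
print(variance(bag_1) == variance(shifted))  # True
```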

Using the rules above, we can measure the variance of the same data set if each data point were multiplied or divided by 3. Instead of performing these operations and calculating the variance again, we simply do the following.

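| | **Multiply by 3** | **Divide by 3** |
|---|---|---|
| Multiply or divide by the square of that constant | $3^2 \times s^2 = 9 \times \tfrac{5}{3} = 15$ | $s^2 \div 3^2 = \tfrac{5}{3} \div 9 = \tfrac{5}{27} \approx 0.19$ |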

As you can imagine, remembering these rules can save you a lot of time. If you’re sceptical, calculate the new variances by hand and compare your answer to the ones above.
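If you prefer to check the multiplication and division rules with software rather than by hand, here is a small sketch using Python's `statistics` module. It again assumes the Bag 1 values 5, 7, 8 and 6 and the sample formula; the variable names are illustrative.

```python
from statistics import stdev, variance

bag_1 = [5, 7, 8, 6]
times_3 = [x * 3 for x in bag_1]
divided_by_3 = [x / 3 for x in bag_1]

base_var = variance(bag_1)  # sample variance of the original data

# Shortcut: scale the variance by the square of the constant.
shortcut_times_3 = base_var * 3 ** 2
shortcut_divided_by_3 = base_var / 3 ** 2

# Long way: transform every data point, then recalculate.
print(round(variance(times_3), 2), round(shortcut_times_3, 2))
print(round(variance(divided_by_3), 2), round(shortcut_divided_by_3, 2))

# The standard deviation only scales by the constant itself.
print(round(stdev(times_3) / stdev(bag_1), 2))
```

Both routes should agree up to floating-point rounding, which is why the values are rounded before printing.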

## How to Interpret Variance

Interpreting the variance is all about context. While we might be **tempted to generalize** and say that big variances mean bigger spread, this rule only makes sense when we take a look at our data set.

For example, 1 000 might seem like a high number for a variance. However, variance is expressed in squared units, so the fairer comparison with the mean is its square root, the standard deviation, which here is about 32. If the mean is in the millions, a typical distance of about 32 from the mean doesn’t seem so large anymore. In this case, our variance would be pretty small and indicate that the values are spread closely around the mean.
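One way to put a number on "context" is to compare the square root of the variance with the mean. The sketch below does this for a variance of 1 000; the mean of 2 000 000 is a hypothetical value chosen only to stand in for "in the millions".

```python
import math

variance_value = 1_000
mean_value = 2_000_000  # hypothetical mean, "in the millions"

# Square root of the variance, back in the original units of the data.
spread = math.sqrt(variance_value)

# Spread expressed as a fraction of the mean.
relative_spread = spread / mean_value

print(round(spread, 1))             # 31.6
print(f"{relative_spread:.6%}")     # a tiny fraction of one percent
```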

## Covariance

Covariance, unlike variance, tells us the **joint variability** of a pair of variables. In other words, it measures how the spread of one variable lines up with the spread of another. This will give us a hint as to how variables are related and change with one another.

For example, if we were to take the covariance of weight and height, we would most likely find that the variables change in the same direction with a high degree of relation. This means both that:

- The **higher** the height, the **higher** the weight
- Height has a **strong relationship** with weight

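You can find the formulas for covariance below.

| | **Sample** | **Population** |
|---|---|---|
| Covariance | $s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$ | $\sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$ |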
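As a rough illustration of the sample and population covariance, here is a short Python sketch. The heights and weights are made-up values, not data from this lesson, and the function names are our own.

```python
def sample_covariance(xs, ys):
    """Sum of paired deviation products, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)


def population_covariance(xs, ys):
    """Sum of paired deviation products, divided by N."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two equally long, non-empty lists")
    mu_x = sum(xs) / len(xs)
    mu_y = sum(ys) / len(ys)
    products = [(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)]
    return sum(products) / len(xs)


# Hypothetical measurements for four people.
heights_cm = [150, 160, 170, 180]
weights_kg = [50, 58, 66, 80]

print(round(sample_covariance(heights_cm, weights_kg), 2))      # 163.33
print(round(population_covariance(heights_cm, weights_kg), 2))  # 122.5
```

A positive result means the two variables tend to rise together. The size of the number depends on the units used, so on its own it says more about the direction of the relationship than about its strength.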

## Variance Versus Other Measures of Variability

Variance is most closely related to standard deviation, which is simply the square root of the variance. Each measure tells us something about how the data are spread but with slight differences, which are **summarized** below.

| | **Variance** | **Standard Deviation** |
|---|---|---|
| **Goal** | Describes the variability of observations within a data set | Describes the spread around the centre point of the data set |
| **Units** | Squared units of the data set | The same units as the data set |
| **Interpretation** | How far the data points are spread from the mean | How far a typical value lies from the mean |
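The link between the two measures is easy to see in code. This sketch assumes the Bag 1 values 5, 7, 8 and 6 and uses the sample versions from Python's `statistics` module.

```python
import math
from statistics import stdev, variance

bag_1 = [5, 7, 8, 6]

var_1 = variance(bag_1)  # in squared units of the data
sd_1 = stdev(bag_1)      # in the same units as the data

# The standard deviation is the square root of the variance.
print(math.isclose(sd_1, math.sqrt(var_1)))  # True
print(round(var_1, 2), round(sd_1, 2))       # 1.67 1.29
```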
