The Interquartile Range
In the previous sections, you were introduced to quartiles and the interquartile range (IQR). To recap briefly, the interquartile range is the distance between the first and third quartiles (Q3 − Q1). The interval between them contains the median and the middle 50% of the data. Recall the image below, which illustrates the IQR.
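As a minimal sketch of this recap, the quartiles and the IQR can be computed with Python's standard `statistics` module. The sample values and the choice of the `inclusive` quantile method are assumptions made for illustration. Other quartile conventions can give slightly different results.

```python
from statistics import quantiles

# Hypothetical sample data, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 asks for quartiles and returns the three cut points Q1, Q2 (median), Q3.
# method="inclusive" is an assumed convention; "exclusive" is the default.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# The IQR is the distance between the third and first quartiles.
iqr = q3 - q1

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
```

Note that the median (Q2) always lies between Q1 and Q3, which is why the IQR is said to contain it.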
While the IQR has many applications, including its role in identifying outliers (discussed later in this section), the important point here is how the measures of central tendency relate to the IQR. This is easiest to see when the data are plotted on a boxplot.
Boxplots are an effective way of displaying the IQR because they can show several measures of central tendency and variability at once. The mean and median can be seen in both plots: the boxplot on the left shows a distribution where the mean is greater than the median, and the boxplot on the right shows a distribution where the median and mean are equal.
The distribution, defined as how the values of a variable are spread out, is best interpreted using the IQR. In the boxplot on the left, the first quartile is closer to the median than the third quartile is. In the boxplot on the right, by contrast, the median and mean are equidistant from quartiles 1 and 3.
These differences in where the measures lie on the boxplot are due to differences in the underlying distributions. The boxplot on the right indicates a symmetric distribution, such as a normal distribution, while the one on the left signals a skewed distribution (here skewed towards higher values, since the mean exceeds the median). We’ll go into more detail on distributions later. For now, the table below recaps the measures of central tendency and variability you can observe from boxplots.
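The comparison between the two boxplots can be sketched numerically. The two data sets below are hypothetical stand-ins for the plotted distributions (the original figures' data are not given), and `describe` is an illustrative helper, not part of any library.

```python
from statistics import mean, quantiles

def describe(data):
    """Return the mean, the median, and the median's distance to Q1 and Q3."""
    # "inclusive" is an assumed quartile convention; others exist.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "mean": mean(data),
        "median": q2,
        "median_to_q1": q2 - q1,
        "median_to_q3": q3 - q2,
    }

# Hypothetical data sets standing in for the two boxplots.
symmetric = [2, 4, 6, 8, 10, 12, 14]
skewed = [2, 3, 4, 5, 7, 12, 30]

sym = describe(symmetric)
skw = describe(skewed)

# Symmetric case: mean equals median, and Q1 and Q3 are equidistant from it.
print(sym)
# Skewed case: mean exceeds median, and Q1 sits closer to the median than Q3.
print(skw)
```

In the symmetric sample the two distances match, while in the skewed sample the large values pull both the mean and the third quartile away from the median.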
| Measure | Location on Boxplot | Interpretation |
|---|---|---|
| Mean | Typically located above or below the median and within the IQR, although there are exceptions | The average of the data |
| Median | Located at Q2 | Half the data fall above this point and half below (the 50% mark) |
| Minimum | Located at Q0 | The lowest value of the data set |
| Maximum | Located at Q4 | The highest value of the data set |
| Interquartile Range | Between Q1 and Q3 | Contains the median and the centre 50% of the data set, and describes how spread out that centre is |
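The measures in the table can be gathered in one place with a short helper. This is a sketch under stated assumptions: `boxplot_summary` is a hypothetical function name, the data are invented, and the `inclusive` quartile method is one of several conventions. The sample is chosen to show one of the exceptions noted in the table, where a single large value pushes the mean outside the IQR.

```python
from statistics import mean, quantiles

def boxplot_summary(data):
    """Collect the boxplot measures (Q0 to Q4, IQR, mean) for a list of numbers."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    avg = mean(data)
    return {
        "min": min(data),      # Q0
        "q1": q1,
        "median": q2,          # Q2
        "q3": q3,
        "max": max(data),      # Q4
        "iqr": q3 - q1,
        "mean": avg,
        # True when the mean falls between Q1 and Q3.
        "mean_within_iqr": q1 <= avg <= q3,
    }

# Hypothetical data with one very large value.
summary = boxplot_summary([1, 2, 3, 4, 100])
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the median stays at 3 while the mean rises to 22, so the mean lies above Q3 rather than inside the box.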