In other sections of this guide on descriptive statistics, we explained the fundamental and intermediate aspects of frequency and its visualizations. This included finding absolute, row and column frequencies, as well as constructing charts such as a frequency polygon. Here, we’ll delve into more advanced topics in the construction of frequency polygons and histograms as well as provide some problems for you to practice your skills.
So far, we’ve discussed frequency and histograms in relation to a single variable. If you recall, this type of analysis is called univariate analysis because we are investigating only one variable. For example, say you have data on the weight of college students, the first four rows of which are presented below.
| Observation | Weight in kg |
If we wanted to analyse the weight variable, we could use measures of central tendency, measures of variability, and charts such as the histogram to interpret the weights of college students. For example, we could calculate the mean weight of the students in our data set and how much their weights vary.
While this can be the entire analysis in itself, in statistics it is very common to apply univariate analysis to data sets with multiple variables as an initial exploration of the data. This is called exploratory analysis because it is performed to understand what the data actually contains or looks like.
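As a minimal sketch of that calculation, the Python snippet below computes a mean and a sample standard deviation using only the standard library. The four weights are hypothetical values invented for illustration; they are not the rows of the guide's data set.

```python
import statistics

# Hypothetical weights in kg (illustrative values, not the guide's data).
weights_kg = [58.0, 72.5, 64.0, 81.5]

# Measure of central tendency: the arithmetic mean.
mean_weight = statistics.mean(weights_kg)

# Measure of variability: the sample standard deviation.
stdev_weight = statistics.stdev(weights_kg)

print(f"mean weight: {mean_weight:.2f} kg")
print(f"standard deviation: {stdev_weight:.2f} kg")
```

The sample standard deviation is only one possible measure of variability; the range or the variance could be used in the same way.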
Descriptive statistics don’t just deal with univariate and exploratory analysis, however. The table below gives a quick summary of the types of analysis you can perform in statistics.
| Type | Exploratory Analysis | Univariate Analysis | Bivariate Analysis | Multivariate Analysis |
| --- | --- | --- | --- | --- |
| Definition | Studying one or more variables in order to understand what the data contains or looks like. | Studying a single variable. | Studying two variables and the relationship between them. | Studying two or more variables and the relationships between them. |
| Example | Calculating the mean or identifying a skew in the variable of weight. | Analysing the weight of college students. | Analysing the weight of college students and another quantitative or qualitative variable, such as age or sex. | Analysing the weight of college students and, for example, age and sex. |
Continuing with our example, let’s say we have data not only on the weight of college students but also, as mentioned in the table, on their age and sex. As you’ve seen before, instead of presenting every row of a data set, which can be hundreds or thousands of observations long, we can present it as grouped data, as is done in the table below.
| Weight Group in kg | Frequency |
As we can see, instead of presenting data on 10,000 observations, we can condense the data into 9 groups, each reported with its group frequency. If we were to plot this data, it would look like the following.
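As a rough sketch of how grouped frequencies like these could be produced and drawn, the Python snippet below bins 10,000 synthetic weights into nine groups and prints a text histogram. Everything in it is an assumption made for illustration: the data are simulated, and the group boundaries (40 to 130 kg in 10 kg steps) and the two subgroup means are hypothetical rather than taken from the table above.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic data: 10,000 weights drawn from two hypothetical subgroups.
weights = [rng.gauss(65, 5) for _ in range(5000)]
weights += [rng.gauss(95, 5) for _ in range(5000)]

# Nine hypothetical groups of width 10 kg: [40, 50), [50, 60), ..., [120, 130).
low, width, n_groups = 40, 10, 9
frequencies = [0] * n_groups
for w in weights:
    # Values outside 40-130 kg are clamped into the first or last group.
    index = min(max(int((w - low) // width), 0), n_groups - 1)
    frequencies[index] += 1

# Text histogram: one '#' per 100 observations in the group.
for i, freq in enumerate(frequencies):
    start = low + i * width
    print(f"{start:>3}-{start + width:<3} kg  {freq:>5}  {'#' * (freq // 100)}")
```

Because the simulated data mix two subgroups, the printed bars should rise to two separate peaks with a dip between them.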
Using what we know about the interpretation of histograms, we can see that the histogram suggests there are two centres, shown by its two peaks, which are known as modes. While many of the histograms we’ve discussed are unimodal, meaning they have one centre, this bimodal distribution suggests that the data contain two groups with different centres. This is where it can be helpful to move from a univariate to a bivariate or multivariate analysis, because a second variable, such as age or sex, may help explain the two groups.
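To illustrate that move, here is a minimal sketch that splits weights by a second variable and compares the group means with the overall mean. The records are hypothetical, and the choice of sex as the grouping variable is an assumption for the example; in real data the two modes might be explained by a different variable or by none of those recorded.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (sex, weight in kg) records, invented for illustration only.
records = [
    ("F", 58.0), ("F", 61.5), ("F", 63.0), ("F", 66.5),
    ("M", 78.0), ("M", 81.5), ("M", 84.0), ("M", 88.5),
]

# Univariate view: a single centre for all weights together.
overall_mean = mean(weight for _, weight in records)

# Bivariate view: group the weights by the second variable.
weights_by_sex = defaultdict(list)
for sex, weight in records:
    weights_by_sex[sex].append(weight)

group_means = {sex: mean(ws) for sex, ws in weights_by_sex.items()}

print(f"overall mean: {overall_mean:.2f} kg")
for sex, group_mean in sorted(group_means.items()):
    print(f"mean for {sex}: {group_mean:.2f} kg")
```

In this made-up data the overall mean falls between the two groups and describes neither of them well, which is the situation a bimodal histogram hints at.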