Chapters

A Guide to Statistics
What are Descriptive Statistics?
Population
Sample
Measures of Central Tendency
Measures of Variability
Notation of Measures of Central Tendency and Variability
Types of Variables
Data Visualization

A Guide to Statistics

In previous sections, you learned about the concepts involved in descriptive statistics. Specifically, we covered the measures of central tendency and variability, as well as how to calculate each. We also walked you through the types of variables used in statistics and the kinds of analysis and visualizations you can produce from data. Here, we’ll help you review everything related to descriptive statistics.

What are Descriptive Statistics?

The field of statistics is generally divided into two branches: descriptive and inferential statistics. Descriptive statistics is, luckily, exactly what it sounds like: it involves describing and summarizing the data you have. If this sounds confusing, the table below contrasts it with inferential statistics.

Descriptive Statistics	Inferential Statistics
Makes statements about what is within the data	Makes predictions about data points outside the data set by using the information within the data
Conveys information through measures like mean and standard deviation	Conveys information through predictive models
Visualizations generally include: bar charts, pie charts, histograms, line graphs	Visualizations generally include: line graphs, scatterplots

While this general information is by no means exhaustive, it can be a great starting point for understanding the differences between the two branches of statistics. The goal of descriptive statistics is to either summarize the characteristics of a data set or to analyse a data set by utilizing its descriptive properties.

Population

The units used in descriptive statistics can be anything. People using descriptive statistics can strive to measure things like:

Rainfall
Trees in parks
Tourists at a beach

The analysis that can be done with descriptive statistics alone is not only highly diverse; it also accounts for most of the statistics that many people use. The units that people want to measure, however, need to be clearly defined for the data to be properly understood.

In statistics, the elements people want to study are split into a population and a sample. A population is the actual group of elements that you want to study. A population could be anything and take on any form. In the previous examples, the population would take the following form.

Elements	Population
Rainfall	Total rain produced
Trees in a park	All the trees in a park
Tourists at a beach	Total number of tourists at a beach

While this may seem simple, and it is, populations are notoriously hard to measure. Surveying the total number of trees might be an easy task in a local city park, but imagine the same task applied to a national forest. Often, there are not enough financial resources or time to measure an entire population. That is why in statistics you’ll often encounter samples.

Sample

A sample is a part of a population, made up of the same kind of elements measured in the same units. A sample is drawn from a population in order to make the data collection process cheaper and more time-efficient. Continuing the previous examples, let’s take a look at the differences between a population and a sample.

Population	Sample
Total rain produced	Rainfall produced in an hour in one location of a city
All the trees in a park	Number of trees measured within a one-kilometre radius
Total number of tourists at a beach	Number of tourists arriving at the beach at three specific times in a day

As you can guess, samples tend to include a fraction of the elements that are included in a population. There are many different methods for drawing a sample, which include:

Simple Random Sampling
Stratified Sampling
Cluster Sampling
Quota Sampling

As you can imagine, each sampling method has its advantages and disadvantages. The sampling method that is desired in most cases is simple random sampling, also known as SRS.

The reason is that it involves a completely random selection of elements from a population, in which every element has the same chance of being chosen. This helps reduce selection bias in the estimation of statistical measures. An SRS can be conducted with or without replacement.
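To make the difference between sampling with and without replacement concrete, here is a minimal sketch using Python's standard `random` module. The function name `simple_random_sample`, the `seed` argument and the list of tree IDs are our own illustrative assumptions, not part of any standard method.

```python
import random


def simple_random_sample(population, k, replace=False, seed=None):
    """Draw k elements from population, each element equally likely."""
    rng = random.Random(seed)
    if replace:
        # With replacement: the same element may be drawn more than once.
        return rng.choices(population, k=k)
    # Without replacement: each element can appear at most once.
    return rng.sample(population, k)


# Hypothetical population: ID numbers for 100 trees in a park.
trees = list(range(1, 101))

without = simple_random_sample(trees, 10, replace=False, seed=1)
with_repl = simple_random_sample(trees, 10, replace=True, seed=1)

print("without replacement:", without)
print("with replacement:   ", with_repl)
```

Without replacement, the ten selected IDs are always distinct. With replacement, repeats are possible, although they will not show up in every draw.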

Because the true population measure, or the measure we would have calculated had we measured the entire population, is usually unknown, measures calculated from samples are treated as estimates of the corresponding population measures. A measure from a population is called a “parameter” while a measure from a sample is called a “statistic.”
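The sketch below illustrates the difference between a parameter and a statistic. The population is simulated purely for illustration (1,000 made-up tree heights), so every number in it is an assumption rather than real data.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: simulated heights (cm) of 1,000 trees.
population = [rng.gauss(300, 40) for _ in range(1000)]

# Parameter: calculated from every element of the population.
mu = statistics.fmean(population)

# Statistic: calculated from a simple random sample of 50 elements.
sample = rng.sample(population, 50)
x_bar = statistics.fmean(sample)

print(f"population mean (parameter): {mu:.1f}")
print(f"sample mean (statistic):     {x_bar:.1f}")
```

The two values will generally be close but not identical, which is why the sample mean is treated as an estimate of the population mean.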

Measures of Central Tendency

“Measures of central tendency” is a long name for something simple: measuring the centre. People like to measure the centre point of a data set because it generally indicates what the most “typical” value of the data looks like.

There are three basic measures of central tendency: the mean, median and mode. Some rules of thumb for remembering when each of them is used are:

When the data includes extreme values or outliers, the median is better
When the data doesn’t include outliers and you want to measure the average, use the mean
When you want to know the value or category with the highest frequency, use the mode

Below are the formulas for each measure.

	Sample	Population
Mean	\[ \bar{x} = \frac{\Sigma x_{i}}{n} \]	\[ \mu = \frac{\Sigma x_{i}}{N} \]
Median	Midpoint of the ordered data points; with an even number of values, the average of the two middle values	Calculated the same as the sample
Mode	The value or category with the highest frequency	Calculated the same as the sample
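As a quick check of the rules of thumb above, here is a small sketch using Python's standard `statistics` module. The tourist counts are made-up values, with 250 included deliberately as an outlier.

```python
import statistics

# Hypothetical data: daily tourist counts at a beach; 250 is an outlier.
counts = [12, 15, 15, 18, 20, 22, 250]

mean = statistics.mean(counts)      # sum of the values divided by n
median = statistics.median(counts)  # middle value of the ordered data
mode = statistics.mode(counts)      # value with the highest frequency

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")

# With an even number of values, the median averages the two middle values.
even_median = statistics.median([12, 15, 18, 20])
print(f"median of four values = {even_median}")
```

Here the mean is about 50.3, pulled upward by the single outlier, while the median is 18 and the mode is 15. The four-value example gives a median of 16.5, the average of 15 and 18.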

Measures of Variability

Unlike measures of central tendency, measures of variability strive to capture how the data are spread around the centre. The two most basic measures of variability are variance and standard deviation. Other common measures include:

Coefficient of Variation
Covariance
Standard Error

The spread of a data set describes how closely together or how far apart the data lie around the centre. While variance is used throughout statistics, standard deviation tends to be preferred when describing the spread of a data set because it is expressed in the same units as the data, which makes it easier to interpret.

Below you’ll find the formulas for standard deviation and variance for populations and samples.

	Sample	Population
Variance	\[ s^2 = \frac{\Sigma(x_{i}-\bar{x})^2}{n-1} \]	\[ \sigma^2 = \frac{\Sigma(x_{i}-\mu)^2}{N} \]
Standard Deviation	\[ s = \sqrt{ \frac{\Sigma(x_{i}-\bar{x})^2}{n-1} } \]	\[ \sigma = \sqrt{ \frac{\Sigma(x_{i}-\mu)^2}{N} } \]

Notice that the standard deviation is simply the square root of the variance.
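The sketch below works through the variance and standard deviation formulas step by step on a small made-up data set, then compares the results with the functions in Python's standard `statistics` module. The data values and variable names are our own illustrative choices.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

# Treating the data as a whole population: divide by the number of values.
pop_var = squared_deviations / n
pop_sd = math.sqrt(pop_var)

# Treating the data as a sample: divide by n - 1 instead.
sample_var = squared_deviations / (n - 1)
sample_sd = math.sqrt(sample_var)

print(f"population: variance = {pop_var}, sd = {pop_sd}")
print(f"sample:     variance = {sample_var:.3f}, sd = {sample_sd:.3f}")

# The standard library gives matching results.
print(math.isclose(pop_var, statistics.pvariance(data)))
print(math.isclose(sample_sd, statistics.stdev(data)))
```

For this data set the mean is 5 and the squared deviations sum to 32, so the population variance is 4 and the population standard deviation is 2. The sample versions divide by 7 rather than 8, so they come out slightly larger.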

Notation of Measures of Central Tendency and Variability

As you may have noticed, the measures for the population and sample have different notations. These notations are standardized throughout the statistical world, meaning you will encounter them everywhere from your textbooks to computer programs. Below, we’ve summarized the notations of the mean, standard deviation and variance.

	Sample	Population
Mean	\[ \bar{x} \]	\[ \mu \]
Standard Deviation	\[ s \]	\[ \sigma \]
Variance	\[ s^2 \]	\[ \sigma^2 \]

Types of Variables

There are many variable types, each used in different statistical analyses. The most common distinction is made between two types: qualitative and quantitative variables, also known as categorical and numerical variables.

Qualitative variables are those that involve categories. They are called qualitative because they describe a variable’s characteristics, or qualities. These include variables like:

Colour
Shape
Gender

Quantitative variables, on the other hand, measure quantities of something. These include variables like:

Height
Age
Weight

Quantitative and qualitative variables can be further broken down into sub-groups. Below you’ll find a summary.

Data
A collection of observations, measurements or ideas on specific variables
Quantitative	Qualitative
Numeric information about a place, person or thing	Descriptive information about a place, person or thing
	Ordinal	Nominal
	Ordered based on a specific scale	Not ordered on a scale
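The variable type determines which summaries make sense. The sketch below shows one possible treatment of each type using only Python's standard library. All of the observations, and the `scale` mapping used to order the sizes, are made up for illustration.

```python
from collections import Counter
import statistics

# Hypothetical observations on three variables.
colours = ["red", "blue", "red", "green", "red"]          # qualitative, nominal
sizes = ["small", "large", "medium", "small", "medium"]   # qualitative, ordinal
heights_cm = [150.0, 162.5, 171.0, 158.0, 180.5]          # quantitative

# Nominal: categories can be counted, but not ordered or averaged.
colour_counts = Counter(colours)
most_common_colour = colour_counts.most_common(1)[0][0]

# Ordinal: categories follow a scale, so they can be sorted and a
# middle category found (the number of observations here is odd).
scale = {"small": 0, "medium": 1, "large": 2}
ordered_sizes = sorted(sizes, key=scale.__getitem__)
median_size = ordered_sizes[len(ordered_sizes) // 2]

# Quantitative: arithmetic such as the mean is meaningful.
mean_height = statistics.fmean(heights_cm)

print(most_common_colour, median_size, round(mean_height, 1))
```

In this example the most frequent colour is red, the middle size category is medium, and the mean height is 164.4 cm.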

Data Visualization

Data visualization is an integral part of descriptive statistics and consists of displaying information visually. The most common visualizations in descriptive statistics include:

Bar charts
Pie charts
Line graphs
Histograms
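In practice you would usually draw these charts with a plotting library. As a dependency-free sketch of the idea behind two of them, the code below builds a text bar chart from category frequencies and counts values into histogram bins. The function names `text_bar_chart` and `histogram_counts` and the sample data are hypothetical.

```python
from collections import Counter


def text_bar_chart(values):
    """Return one line per category, with one '#' per observation."""
    counts = Counter(values)
    width = max(len(str(category)) for category in counts)
    return [
        f"{str(category):<{width}} | {'#' * count}"
        for category, count in sorted(counts.items())
    ]


def histogram_counts(values, bin_width):
    """Count values in bins [k * bin_width, (k + 1) * bin_width)."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


# Bar chart: frequencies of a qualitative variable.
chart = text_bar_chart(["oak", "pine", "oak", "birch", "oak", "pine"])
print("\n".join(chart))

# Histogram: frequencies of a quantitative variable grouped into bins.
bins = histogram_counts([3, 7, 12, 15, 18, 24], bin_width=10)
print(bins)
```

A bar chart counts observations per category, while a histogram first groups a quantitative variable into intervals and then counts observations per interval. In the histogram output, each key is the lower edge of a bin.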

Emma

I am passionate about travelling and currently live and work in Paris. I like to spend my time reading, gardening, running, learning languages and exploring new places.
