**The mean, median and mode**

**Mean**

The arithmetic mean is commonly known as the average. It is calculated as the sum of the values in the sample group (Σ*x*), divided by the number of values in the sample group (*n*).

**Median**

The median is the middle value in the sample group, after the values have been arranged in ascending order. In other words, it is the value in the middle of the ordered data. If there is an even number of values, the median is the average of the two middle values.

**Mode**

The mode is the value that appears most often. In other words, it is the value with the highest frequency.

How do we decide which measure to use? It is important to use the most appropriate measure of central tendency in order to make a sound conclusion. (You sometimes have to answer such a question in a test or exam.)

Measure | Advantages | Disadvantages |
---|---|---|
Average/Mean | ● Most common measure ● Used widely in media ● Easy to calculate ● All values in the sample group are used | ● If an extreme (very large or very small) value is added, the average changes drastically ● Cannot be used when the data value is simply a category |
Median | ● Easy to identify if the sample group is small | ● Time-consuming if the sample is very big and not in ascending order |
Mode | ● No calculation needed ● Easy to deduce in grouped data ● Can be used to describe any kind of data | ● A small sample usually does not have a mode ● More than one mode is possible |
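
To illustrate the first disadvantage listed for the mean, the short Python sketch below compares the mean and the median before and after one extreme value is added. The sample values and the extreme value 1000 are made up purely for illustration; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical sample, chosen only for illustration.
values = [10, 20, 30, 40, 50]

# The same sample with one extreme (very large) value added.
with_outlier = values + [1000]

# Without the extreme value, the mean and the median agree.
print(mean(values), median(values))

# With the extreme value, the mean jumps while the median barely moves.
print(mean(with_outlier), median(with_outlier))
```

In this made-up sample the mean rises from 30 to about 191.7, while the median only moves from 30 to 35.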

**Summarising Data**

If the data set is very large, it is useful to be able to summarise the data set by calculating a few quantities that give information about how the data values are spread and about the central values in the data set.

**Measures of Central Tendency**

**Mean or Average**

The mean (also known as the arithmetic mean) is simply the arithmetic average of a group of numbers (or data set) and is written with a bar over the variable. So the mean of the variable *x* is *x̄*, pronounced “*x*-bar”. The mean of a set of values is calculated by adding up all the values in the set and dividing by the number of items in that set. The mean is calculated from the raw, ungrouped data.

**Definition: Mean**

The mean of a data set, *x*, denoted by *x̄*, is the average of the data values, and is calculated as the sum of the data values divided by the number of data values: *x̄* = Σ*x* / *n*.

**Method: Calculating the mean**

1. Find the total of the data values in the data set.

2. Count how many data values there are in the data set.

3. Divide the total by the number of data values.

**Worked Example 1: Mean**

Question: What is the mean of *x*={10,20,30,40,50}?

solution:

Step 1: Find the total of the data values

10 + 20 + 30 + 40 + 50 = 150

Step 2: Count the number of data values in the data set

There are 5 values in the data set.

Step 3: Divide the total by the number of data values.

150 ÷ 5 = 30

Step 4: Answer

∴ the mean of the data set *x*={10,20,30,40,50} is 30.
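
The three-step method can also be sketched in Python. This is a minimal illustration rather than a definitive implementation; the function name `mean` and the empty-set check are choices made here, not part of the original method.

```python
def mean(values):
    """Arithmetic mean: the total of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    total = sum(values)   # Step 1: find the total of the data values
    count = len(values)   # Step 2: count the data values
    return total / count  # Step 3: divide the total by the count


x = [10, 20, 30, 40, 50]
print(mean(x))  # 30.0
```

The result matches the worked example: 150 ÷ 5 = 30.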

**Median**

**Definition: Median**

The median of a set of data is the data value in the central position, when the data set has been arranged from highest to lowest or from lowest to highest. There are an equal number of data values on either side of the median value.

The median is calculated from the raw, ungrouped data, as follows.

**Method: Calculating the median**

1. Order the data from smallest to largest or from largest to smallest.

2. Count how many data values there are in the data set.

3. Find the data value in the central position of the set.

**Worked Example 2: Median**

Question:

What is the median of {10,14,86,2,68,99,1}?

solution:

Step 1: Order the data set from lowest to highest

{1,2,10,14,68,86,99}

Step 2: Count the number of data values in the data set

There are 7 points in the data set.

Step 3: Find the central position of the data set

The central position of the data set is 4.

Step 4: Find the data value in the central position of the ordered data set.

14 is in the central position of the data set.

Step 5: Answer

∴ 14 is the median of the data set {1,2,10,14,68,86,99}.

This example has highlighted a potential problem with determining the median. It is very easy to determine the median of a data set with an odd number of data values, but what happens when there is an even number of data values in the data set?

When there is an even number of data values, the median is the mean of the two middle points.

__Important__: Finding the Central Position of a Data Set

An easy way to determine the central position or positions for any ordered data set is to take the total number of data values, add 1, and then divide by 2. If the number you get is a whole number, then that is the central position. If the number you get is a fraction, take the two whole numbers on either side of the fraction, as the positions of the data values that must be averaged to obtain the median.
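
The "add 1, then divide by 2" rule can be sketched in Python as follows. The function names `central_positions` and `median` are illustrative choices made here; positions are counted from 1, as in the text, and converted to Python's 0-based indexing inside `median`.

```python
def central_positions(n):
    """Return the 1-based central position(s) of an ordered data set with n values."""
    if n < 1:
        raise ValueError("the data set must contain at least one value")
    centre = (n + 1) / 2
    if centre == int(centre):
        # Whole number: a single central position.
        return [int(centre)]
    # Fraction: the two whole numbers on either side of it.
    return [int(centre), int(centre) + 1]


def median(values):
    """Median: the value at the central position, or the mean of the two central values."""
    ordered = sorted(values)
    positions = central_positions(len(ordered))
    middle = [ordered[p - 1] for p in positions]  # 1-based position -> 0-based index
    return sum(middle) / len(middle)


print(central_positions(7))                  # [4]
print(median([10, 14, 86, 2, 68, 99, 1]))    # 14.0
print(central_positions(8))                  # [4, 5]
```

For 7 values the rule gives (7 + 1) / 2 = 4, a single central position. For 8 values it gives 4.5, so the values in positions 4 and 5 are averaged.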

**Worked Example 3: Median**

Question: What is the median of {11,10,14,86,2,68,99,1}?

solution:

Step 1: Order the data set from lowest to highest

{1,2,10,11,14,68,86,99}

Step 2: Count the number of data values in the data set

There are 8 points in the data set.

Step 3: Find the central position of the data set

The central position of the data set is between positions 4 and 5.

Step 4: Find the data values around the central position of the ordered data set.

11 is in position 4 and 14 is in position 5.

Step 5: Answer

∴ the median of the data set {1,2,10,11,14,68,86,99} is (11+14)/2 = 12.5.

**Mode**

**Definition: Mode**

The mode is the data value that occurs most often, i.e. it is the most frequent value or most common value in a set.

**Method: Calculating the mode**

Count how many times each data value occurs. The mode is the data value that occurs the most.

The mode is calculated from grouped data, or single data items.

**Worked Example 4: Mode**

Question: Find the mode of the data set *x*={1,2,3,4,4,4,5,6,7,8,8,9,10,10}

solution:

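Step 1: Count how many times each data value occurs.

Data value | Frequency |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 3 |
5 | 1 |
6 | 1 |
7 | 1 |
8 | 2 |
9 | 1 |
10 | 2 |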

Step 2: Find the data value that occurs most often.

4 occurs most often.

Step 3: Answer

The mode of the data set *x*={1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10} is 4, since the number 4 appears most frequently.

A data set can have more than one mode. For example, both 2 and 3 are modes in the set 1, 2, 2, 3, 3. If all points in a data set occur with equal frequency, it is equally accurate to describe the data set as having many modes or no mode.
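
Counting frequencies and allowing for more than one mode can be sketched in Python as follows. The function name `modes` is an illustrative choice. When every value occurs equally often, this sketch follows the "many modes" description and returns all the values; treating that case as "no mode" would be equally valid.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in ascending order."""
    counts = Counter(values)  # maps each data value to the number of times it occurs
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 9, 10, 10]))  # [4]
print(modes([1, 2, 2, 3, 3]))                                # [2, 3]
```

Returning a list rather than a single value keeps the bimodal case from being hidden.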

**More Questions and Solutions**

Q1. Calculate the mean, median and mode of the following data set

solution:

We first need to order the data set: {24; 28; 31; 31; 35; 41; 49}

Since there is an odd number of values in this data set, the median lies at the fourth number: 31.

The mode is the value that occurs the most. In this data set the mode is 31.

The mean is (24 + 28 + 31 + 31 + 35 + 41 + 49) ÷ 7 = 239 ÷ 7 ≈ 34.14.

The mean, median and mode are: mean: 34.14; median: 31; mode: 31.

Q2. The number of faulty products returned to an electrical goods store over a 21 day period is:

For this data set, find:

a) the mean b) the median c) the mode.

solution:

a)

faulty products

b) The ordered data set is:

∴ median =5 faulty products

c) 3 is the score which occurs the most often, so the mode is 3 faulty products.

For the faulty products data in Q2, how are the measures of the middle affected if, on the 22nd day, the number of faulty products was 9?

We expect the mean to rise as the new data value is greater than the old mean.

In fact, the new mean

faulty products.

The new ordered data set would be:

The new median =(5+6)/2=5.5 faulty products.

This new data set has two modes. The modes are 3 and 9 faulty products and we say that the data set is bimodal.
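
The expectation that the mean rises when a new value greater than the old mean is added can be checked with a small Python sketch. The function `updated_mean` is a hypothetical helper introduced here. Because the faulty products data is not reproduced above, the usage lines reuse the data from Worked Example 1 (mean 30, 5 values) with made-up new values.

```python
def updated_mean(old_mean, n, new_value):
    """Mean of a data set after one more value is added.

    old_mean * n recovers the old total, so the new mean is
    (old total + new value) divided by the new count, n + 1.
    """
    return (old_mean * n + new_value) / (n + 1)


# Worked Example 1 has mean 30 over 5 values.
print(updated_mean(30, 5, 60))  # 35.0 (60 is above the old mean, so the mean rises)
print(updated_mean(30, 5, 0))   # 25.0 (0 is below the old mean, so the mean falls)
```

Adding a value equal to the old mean would leave the mean unchanged.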

Q3. The ages of 15 runners of the Comrades Marathon were recorded:

Calculate the mean, median and modal age.

Solution:

We first need to order the data set: {26; 28; 29; 31; 33; 33; 34; 38; 41; 42; 42; 45; 46; 51; 56}

Since there is an odd number of values in this data set the median lies at the eighth number: 38. The mode is the value that occurs the most. In this data set there are two modes: 33 and 42. Therefore the mean, median and modal ages are: mean: 38.3; median 38; mode 33 and 42.
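
These answers can be cross-checked with Python's standard `statistics` module. The sketch assumes Python 3.8 or later, where `multimode` is available; the list `ages` is simply the ordered data set given above.

```python
from statistics import mean, median, multimode

ages = [26, 28, 29, 31, 33, 33, 34, 38, 41, 42, 42, 45, 46, 51, 56]

print(round(mean(ages), 1))  # 38.3
print(median(ages))          # 38
print(multimode(ages))       # [33, 42]
```

`multimode` returns every value that shares the highest frequency, which suits a bimodal data set like this one.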

Note:

● If a data set has three or more modes, we do not use the mode as a measure of the middle or centre of the data values.

● Consider the data: 4 2 5 6 7 4 5 3 5 4 7 6 3 5 8 6 5.

The dot plot of this data is:

For this data the mean, median, and mode are all 5.

For a *symmetrical distribution* of data with a single peak, the mean, mode, and median will all be equal.

However, equal or approximately equal values of the mean, mode, and median do not necessarily indicate that a distribution is symmetric.
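
A small Python sketch can make this last point concrete. The helper `is_symmetric` and the data set `skewed` are hypothetical, constructed here for illustration; `dot_plot_data` is the data set listed above. The check compares exact deviations from the mean, so it is only intended for small whole-number data sets like these.

```python
from statistics import mean, median, multimode


def is_symmetric(values):
    """True if the deviations from the mean mirror each other exactly."""
    centre = mean(values)
    deviations = sorted(v - centre for v in values)
    mirrored = sorted(-d for d in deviations)
    return deviations == mirrored


dot_plot_data = [4, 2, 5, 6, 7, 4, 5, 3, 5, 4, 7, 6, 3, 5, 8, 6, 5]
skewed = [0, 5, 5, 5, 6, 6, 8]  # made-up data: mean, median and mode are all 5

for data in (dot_plot_data, skewed):
    print(mean(data), median(data), multimode(data), is_symmetric(data))
```

Both data sets have a mean, median and mode of 5, but only the first is symmetric about 5.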