**STATISTICS**

**Overview**

Statistics is the branch of Mathematics that is used daily, across a broad spectrum of everyday life, to gather and interpret information. This information, or data, enables us to make predictions about the future, to make choices and to improve existing frameworks.

**Measures of Central Tendency**

Generally, the average value of a distribution lies in the middle part of the distribution; such values are known as measures of central tendency. The following are the three measures of central tendency:

1. Arithmetic Mean

2. Median

3. Mode

**Measures of central tendency for ungrouped data**

Measures of central tendency are different measures of finding the ‘middle’ or ‘average’ of a set of data. The three kinds of ‘middle’ of a set of data that we use are the mean, the median and the mode.

It is advisable to start by arranging the set of data in ascending order before attempting questions.

There are two main types of numerical data that can be collected, namely discrete and continuous data.

● Discrete data is data with specific values, normally gathered by means of counting, e.g. there are three children in our family.

● Continuous data is data that can take any value between two points and is usually obtained through measurement, e.g. I am 1.76 m tall.

__The Big Cricket Example__

Andrew Flintoff and Michael Vaughn are having an argument in the pub trying to decide who has had the better season with the cricket bat. Here are their scores:

Use your knowledge of Averages and Measures of Spread to decide which cricketer has had the better season.

NOTE: the first point to notice is that just looking at the scores as they are makes it hard to come to a decision about who has had the better season… That’s why we need Statistics!

Secondly, just adding up the total number of runs and deciding that way would not be fair. Why?… Well, because they have not played the same number of games!

So, there is only one thing for it… let’s work out some statistics!

**1. The Mean**

1. Add up all your data values

2. Divide this total by the number of data values
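The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `mean` is our own choice, and because the individual innings are not listed here, the example works from the season totals quoted in this section.

```python
def mean(values):
    """Arithmetic mean: add up all the values, then divide by how many there are."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# The individual scores are not available, so use the totals from the text:
# total runs scored divided by total games played.
flintoff_mean = 551 / 15
vaughn_mean = 651 / 14

print(round(flintoff_mean, 1))  # 36.7
print(round(vaughn_mean, 1))    # 46.5
```

Given a full list of scores, `mean(scores)` would carry out both steps in one go.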

Andrew Flintoff

total runs scored: 551

total games played: 15

mean: 36.7 runs

Michael Vaughn

total runs scored: 651

total games played: 14

mean: 46.5 runs

What does this tell us? – well, it looks like, on average, Michael Vaughn has had the better season.

Good thing about the mean – notice how every single score was used to calculate the mean – this means it gives a good summary of the whole season.

__Bad thing about the mean__ – look at Michael Vaughn’s scores. He only had two decent ones, and yet his mean is far higher than Andrew Flintoff’s! This is because __the mean is significantly affected by outliers__ – pieces of data which stand out for being really low or really high, like the two scores of 370 and 250. You could argue that these have __distorted the result__!

**2. The Median**

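1. Place all your data values in ascending order (smallest to biggest)

2. The piece of data in the middle is your median. If there are two pieces in the middle, the median is halfway between them (add them and divide by 2)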

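Andrew Flintoff: Median = 32 runs

Note: there are the same number of pieces of data (7 pieces) either side of the box!

Michael Vaughn: Median = 1.5 runs

Note: there are still the same number of pieces of data (6 pieces) either side of the box! With an even number of scores, the box holds the two middle values and the median is halfway between them.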
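Here is a short Python sketch of the median steps that handles both an odd and an even number of values. The function name `median` and both score lists are made up for illustration; they are not the players' real innings.

```python
def median(values):
    """Median: the middle value of the sorted data.

    With an even number of values there are two middle values,
    so the median is halfway between them.
    """
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)  # smallest to biggest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical scores, chosen only to show the two cases.
odd_count_scores = [51, 3, 40, 18, 32]   # 5 values: one middle value
even_count_scores = [250, 0, 2, 1]       # 4 values: two middle values

print(median(odd_count_scores))   # 32
print(median(even_count_scores))  # 1.5
```

Notice that the very large score of 250 in the second list has no effect on the median.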

What does this tell us? – well, this time it looks like __Andrew Flintoff__ has had the better season.

Good thing about the median – because we are only focussing on pieces of data in the middle, __outliers don’t have as big an effect__, so they cannot distort the results!

Bad thing about the median – the problem here is that you are only looking at, at most, a couple of pieces of data from each player. You could argue that __the result is not representative__ as a lot of pieces of data (scores) are just ignored!

**3. The Mode**

1. Find the most common piece of data (number or letter) and this is your mode!

What does this tell us? – well, using the mode it again looks like __Andrew Flintoff__ is on top!

Good thing about the mode – very speedy to work out!

Bad thing about the mode – can give distorted, or even no results. Imagine if Andrew Flintoff only scored one innings of 32 runs… he would have no mode to compare! Or, imagine if he instead scored a couple of innings of 200… The mode would then say this was his average!
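The mode, including the awkward cases just described, can be sketched in Python with `collections.Counter`. The function name `modes` is our own. Returning an empty list when no value repeats is an assumption that follows the "no mode" description above, and other conventions exist. The sample data is made up.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often.

    Assumption: if no value occurs more than once, the data set is
    treated as having no mode and an empty list is returned.
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)


# Hypothetical scores for illustration only.
print(modes([0, 12, 32, 32, 45]))   # [32]
print(modes([5, 12, 32, 45]))       # []  (nothing repeats)
print(modes([2, 2, 30, 200, 200]))  # [2, 200]  (two modes)
```

The last line shows how a couple of repeated extreme scores can turn up as a mode even though they are not typical of the rest of the data.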

**4. The Range**

1. Subtract the smallest data value away from the biggest data value!

What does this tell us? – well, one answer that I often hear is this: “Michael Vaughn has the biggest range, so he is the best!”… but that’s not quite right.

The bigger the range, the more spread out your scores are… so the less consistent (brilliant maths word that always impresses examiners/teachers) your performance is.

So, I would argue, because Andrew Flintoff has a __smaller range__, his performance is more consistent, and therefore he has had the better season!

Good thing about the range – gives a very quick measure of how spread out the data is.

Bad thing about the range – unfortunately, this statistic is __vulnerable to outliers as well__! Michael Vaughn had a couple of big scores, and look at the effect it had on his range!… This is why mathematicians prefer to measure the spread of data using the Inter-quartile Range or Standard Deviation… but don’t worry about them yet!
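To see how a single outlier stretches the range, here is a small Python sketch. The function name `data_range` and both score lists are invented for illustration and are not the players' real scores.

```python
def data_range(values):
    """Range: biggest data value minus smallest data value."""
    if not values:
        raise ValueError("range of an empty data set is undefined")
    return max(values) - min(values)


# Made-up scores for illustration only.
steady_scores = [20, 25, 32, 38, 45]
erratic_scores = [0, 1, 2, 3, 370]

print(data_range(steady_scores))   # 25
print(data_range(erratic_scores))  # 370

# Removing the single outlier shows how much it stretched the range.
print(data_range(erratic_scores[:-1]))  # 3
```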

So who is the better cricketer?…

Well, in the end, it’s up to you! The most important thing is that you have shown you can calculate each of the statistics and – not a lot of people can do this – interpret what they mean!

If you want my opinion, as a proud Lancastrian, Andrew Flintoff is much better!

**Example**

How long is the average name?

The number of letters in the first names of 26 learners was recorded as follows:

Arithmetic mean/average:

Median: First arrange in ascending order:

There is an even number of values, therefore:

Mode: 4 occurs most often and is thus the mode.

📌 Example 1.

A boy scored the following marks in various class tests during a term, each test being marked out of 20.

15, 17, 16, 7, 10, 12, 14, 16, 19, 12 and 16

(i) What are his modal marks?

(ii) What are his median marks?

(iii) What are his total marks?

(iv) What are his mean marks?

✍ Solution:

Arranging the given data in ascending order:

7, 10, 12, 12, 14, 15, 16, 16, 16, 17, 19

(i) Mode = 16, as it occurs the maximum number of times (3).

(ii) Median = ((11 + 1)/2)th term = 6th term = 15

(iii) Total marks = 7 + 10 + 12 + 12 + 14 + 15 + 16 + 16 + 16 + 17 + 19 = 154

(iv) Mean = total marks / number of tests = 154/11 = 14
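The four answers can be double-checked with Python's standard `statistics` module. This is a minimal sketch, and the variable names are our own.

```python
import statistics

marks = [15, 17, 16, 7, 10, 12, 14, 16, 19, 12, 16]

modal_mark = statistics.mode(marks)     # most frequent mark
median_mark = statistics.median(marks)  # middle value once the marks are sorted
total_marks = sum(marks)
mean_mark = total_marks / len(marks)    # total divided by the number of tests

print(modal_mark, median_mark, total_marks, mean_mark)  # 16 15 154 14.0
```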

📌 Ex2. Calculate the mean, median and mode of the following data sets:

a) {2; 5; 8; 8; 11; 13; 22; 23; 27}

✍ Solution:

The data set is already ordered.

The mean is (2 + 5 + 8 + 8 + 11 + 13 + 22 + 23 + 27)/9 = 119/9 ≈ 13.2.

Since there is an odd number of values in this data set, the median lies at the fifth number: 11.

The mode is the value that occurs the most. In this data set the mode is 8.

The mean, median and mode are: mean: 13.2; median: 11; mode: 8.

b) {15; 17; 24; 24; 26; 28; 31; 43}

✍ Solution:

The data set is already ordered.

Since there is an even number of values in this data set, the median lies between the fourth and fifth numbers: (24 + 26)/2 = 25.

The mode is the value that occurs the most. In this data set the mode is 24.

The mean is (15 + 17 + 24 + 24 + 26 + 28 + 31 + 43)/8 = 208/8 = 26.

The mean, median and mode are: mean: 26; median: 25; mode: 24.

c) {4; 11; 3; 15; 11; 13; 25; 17; 2; 11}

✍ Solution:

We first need to order the data set: {2; 3; 4; 11; 11; 11; 13; 15; 17; 25}.

Since there is an even number of values in this data set, the median lies between the fifth and sixth numbers: (11 + 11)/2 = 11.

The mode is the value that occurs the most. In this data set the mode is 11.

The mean is 112/10 = 11.2.

Therefore the mean, median and mode are: mean: 11.2; median: 11; mode: 11.
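As a quick check of all three data sets at once, the sketch below wraps the three measures in one helper. The helper name `summarise` is our own, and it rounds the mean to one decimal place to match the answers quoted above.

```python
import statistics


def summarise(data):
    """Return (mean rounded to 1 decimal place, median, mode) for a data set."""
    return (
        round(sum(data) / len(data), 1),
        statistics.median(data),  # sorts internally, so the order does not matter
        statistics.mode(data),
    )


data_sets = {
    "a": [2, 5, 8, 8, 11, 13, 22, 23, 27],
    "b": [15, 17, 24, 24, 26, 28, 31, 43],
    "c": [4, 11, 3, 15, 11, 13, 25, 17, 2, 11],
}

for label, data in data_sets.items():
    print(label, summarise(data))
```

Data set c) is passed in unsorted on purpose, since `statistics.median` does the ordering step itself.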

📌 Ex3. The marks of 20 students in a test were as follows:

2, 6, 8, 9, 10, 11, 11, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16, 18, 19 and 20.

Calculate:

(i) the mean (ii) the median (iii) the mode

✍ Solution:

Arranging the terms in ascending order:

2, 6, 8, 9, 10, 11, 11, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16, 18, 19, 20

Number of terms=20

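(i) Mean = sum of all marks / number of marks = 257/20 = 12.85

(ii) Median = mean of 10th and 11th terms = (13 + 14)/2 = 13.5

(iii) Mode = 15, as it has the maximum frequency, i.e. 3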

📌 Ex4. Find the mean, median and mode of the following marks obtained by 16 students in a class test marked out of 10 marks.

0, 0, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 6, 6, 7 and 8.

✍ Solution:

(i) Mean = sum of all marks / number of marks = 64/16 = 4

(ii) Median = mean of 8th and 9th terms = (4 + 5)/2 = 4.5

(iii) Mode = 5, as it occurs the maximum number of times (4).
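When the same marks repeat this often, it can help to count each mark first and work from the counts. The following Python sketch builds such a frequency table with `collections.Counter`. The variable names are our own, and this is just one possible way to organise the working.

```python
from collections import Counter

marks = [0, 0, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 6, 6, 7, 8]

frequency = Counter(marks)  # maps each mark to how many students scored it

# Mean from the table: sum of (mark x frequency) divided by the number of students.
total_students = sum(frequency.values())
mean_mark = sum(mark * count for mark, count in frequency.items()) / total_students

# Mode: the mark with the highest frequency.
modal_mark, modal_count = frequency.most_common(1)[0]

for mark in sorted(frequency):
    print(mark, frequency[mark])
print(mean_mark, modal_mark, modal_count)  # 4.0 5 4
```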

📌 Ex5. A small firm employs 9 people. The annual salaries of the employees are:

a) Find the mean of these salaries.

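✍ Solution:

The nine salaries (listed in order in part c) add up to USD 164,000, so the mean is 164,000/9 ≈ USD 18,222.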

b) Find the mode.

✍ Solution:

The mode is USD 10,000 (this value occurs 3 times in the data set).

c) Find the median.

✍ Solution:

First order the data. To make the numbers easier to work with we will divide each one by 1,000.

The ordered set is {8; 9; 10; 10; 10; 12; 20; 25; 60} × USD 1,000.

The median is at position 5 and is USD 10,000.
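The salary data is a good place to see how one large value pulls the mean but not the median. The Python sketch below takes the salaries from the ordered set in part c). The variable names are our own, and leaving out the top salary is only an experiment, not part of the original question.

```python
import statistics

# Salaries in USD, taken from the ordered set in part c).
salaries = [8_000, 9_000, 10_000, 10_000, 10_000, 12_000, 20_000, 25_000, 60_000]

print(round(sum(salaries) / len(salaries)))  # 18222
print(statistics.median(salaries))           # 10000
print(statistics.mode(salaries))             # 10000

# Experiment: leave out the largest salary to see how much it pulls the mean.
without_top = salaries[:-1]
print(sum(without_top) / len(without_top))   # 13000.0
print(statistics.median(without_top))        # 10000.0
```

The mean drops by more than USD 5,000 when one salary is removed, while the median stays the same, which is why the median is often seen as the fairer "typical" salary here.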

💪 The Mean, Median and Mode (more examples) (Measures of Central Tendency)