**Drawing Box Plots**

Box plots are another way of representing the key information that can be read from a cumulative frequency graph: the minimum, the quartiles, the median and the maximum.

__Top tip__: If you have the chance, draw your box plot directly below your cumulative frequency graph, using the same scale on the x-axis. You can then extend the vertical lines used to read off the quartiles downwards and save yourself a lot of time!

Note: For grouped data, the minimum value is taken as the lowest possible value of the first group, and the maximum value as the highest possible value of the last group.

A **boxplot** or **box-and-whisker plot** is a visual display of some of the descriptive statistics of a data set. It shows:

● the minimum value

● the lower quartile (*Q*_{1})

● the median (*Q*_{2})

● the upper quartile (*Q*_{3})

● the maximum value

These five numbers form the five-number summary of the data set.

**Box-and-whisker plot**

The box-and-whisker plot is a graphical representation of the five-number summary.

● The left side of the box is the lower quartile.

● The vertical line inside the box is the median.

● The right side of the box is the upper quartile.

● The two lines on either side of the box extend out to the minimum and maximum values.

Every numerical data set can be summarised using five numbers. This is called the five-number summary, and it is useful for showing a data set in a more visual way.

A five-number summary of a set of data is made up of:

● The lowest value

● The lower quartile (*Q*_{1})

● The median

● The upper quartile (*Q*_{3})

● The highest value

Once these values have been found, they can be represented in a box and whisker plot. A basic box and whisker plot is shown in the picture below.

We can see where the five numbers from the five-number summary are represented: at the beginning of the line, the beginning of the box, the middle line of the box, the end of the box and the end of the line.

Although it may not show on this diagram, it is important that box and whisker plots are drawn to scale. This will become evident in the following worked example and the further examples:

**Worked example: Box and Whisker Plot**

QUESTION

Draw a box and whisker diagram for the following data set:

SOLUTION

**Step 1: Determine the minimum and maximum**

Since the data set is already sorted, we can read off the minimum as the first value (1.25) and the maximum as the last value (5.1).

**Step 2: Determine the quartiles**

There are 12 values in the data set. Using the percentile formula, we can determine that the median lies between the sixth and seventh values, making it:

The first quartile lies between the third and fourth values, making it:

The third quartile lies between the ninth and tenth values, making it:

This provides the five number summary of the data set and allows us to draw the following box-and-whisker plot.
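The positions quoted in Step 2 can be checked with a short sketch. It assumes the "median of each half" convention (for an odd number of values, the middle value is left out of both halves), which reproduces the positions given above for 12 values; other quartile conventions can give slightly different results for other data sizes. The function name `quartile_positions` is made up for this illustration.

```python
def quartile_positions(n):
    """Return the 1-based positions averaged to get Q1, the median and Q3.

    Assumption: 'median of each half' convention. For odd n the middle
    value is excluded from both halves.
    """
    def middle(start, count):
        # Middle position(s) of a run of `count` values starting at `start`.
        if count % 2 == 1:
            return (start + count // 2,)
        return (start + count // 2 - 1, start + count // 2)

    half = n // 2              # number of values in each half
    upper_start = n - half + 1  # first position of the upper half
    return {
        "Q1": middle(1, half),
        "median": middle(1, n),
        "Q3": middle(upper_start, half),
    }


print(quartile_positions(12))
# {'Q1': (3, 4), 'median': (6, 7), 'Q3': (9, 10)}
```

A pair of positions means the quartile is the mean of those two values; a single position means the quartile is that value itself.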

**More Examples about Boxplots**

Ex1. The five number summary of heights of trees three months after they were planted is (23; 42; 50; 53; 75). This information is shown in the box and whisker diagram below.

a) Determine the interquartile range.

b) What percentage of the trees have a height in excess of 53 cm?

c) Between which quartiles do the heights of the trees have the least variation? Explain.

solution:

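a) Interquartile range = *Q*_{3} – *Q*_{1} = 53 – 42 = 11

b) 25%, because 53 cm is the upper quartile and a quarter of the data lies above *Q*_{3}.

c) Between *Q*_{2} (50) and *Q*_{3} (53). The distance between these two quartiles is the smallest.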
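The answers to (a) and (c) can be checked by computing the width of each quarter directly from the five-number summary. This is only an illustrative sketch; the helper name `quarter_widths` and the section labels are made up here.

```python
# Five-number summary from Ex1: minimum, Q1, median (Q2), Q3, maximum
summary = (23, 42, 50, 53, 75)


def quarter_widths(five):
    """Width of each of the four sections of a box-and-whisker plot."""
    labels = ["min to Q1", "Q1 to Q2", "Q2 to Q3", "Q3 to max"]
    return {label: hi - lo for label, lo, hi in zip(labels, five, five[1:])}


widths = quarter_widths(summary)
iqr = summary[3] - summary[1]             # Q3 - Q1
narrowest = min(widths, key=widths.get)   # section with the least spread

print(widths)     # {'min to Q1': 19, 'Q1 to Q2': 8, 'Q2 to Q3': 3, 'Q3 to max': 22}
print(iqr)        # 11
print(narrowest)  # Q2 to Q3
```

Each section holds about a quarter of the data, so the narrowest section is where the values are packed most closely together.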

Ex2. Consider the data set: 8 2 3 9 6 5 3 2 2 6 2 5 4 5 5 6

a) Construct the five-number summary for this data.

b) Draw a boxplot.

c) Find the: i. range ii. interquartile range of the data.

d) Find the percentage of data values less than 3.

solution:

a) The ordered data set is: 2 2 2 2 3 3 4 5 5 5 5 6 6 6 8 9

So the 5-number summary is: minimum = 2; *Q*_{1} = 2.5; median = 5; *Q*_{3} = 6; maximum = 9

b)

c) i. Range = maximum – minimum = 9 – 2 = 7

ii. IQR = *Q*_{3} – *Q*_{1} = 6 – 2.5 = 3.5

d) We cannot use the boxplot to answer (d), because the boxplot does not tell us that all of the data values are integers. Using the ordered data set in (a), 4 out of 16 data values are less than 3. ∴ 25% of the data values are less than 3.
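The whole of Ex2 can be reproduced with a short Python sketch. It assumes the "median of each half" convention for the quartiles, which matches the values in the solution above; the function name `five_number_summary` is chosen for illustration and is not a standard library function.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    with the middle value left out of both halves when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]       # lower half of the ordered data
    upper = data[n - half:]   # upper half of the ordered data
    return (data[0], median(lower), median(data), median(upper), data[-1])


data = [8, 2, 3, 9, 6, 5, 3, 2, 2, 6, 2, 5, 4, 5, 5, 6]  # data set from Ex2
minimum, q1, q2, q3, maximum = five_number_summary(data)

data_range = maximum - minimum
iqr = q3 - q1
percent_below_3 = 100 * sum(1 for x in data if x < 3) / len(data)

print((minimum, q1, q2, q3, maximum))  # (2, 2.5, 5.0, 6.0, 9)
print(data_range, iqr)                 # 7 3.5
print(percent_below_3)                 # 25.0
```

The percentage in (d) is counted from the raw data rather than read from the summary, for the reason given in the solution.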

Ex3. Nineteen girls were required to complete a puzzle as quickly as possible. Their times (in seconds) were recorded and are shown in the table below.

19 20 21 21 22 23 24 24 29

a) Identify the median time taken by the girls to complete the puzzle.

b) Determine the lower and upper quartiles for the data.

c) Draw a box and whisker diagram to represent the data.

solution:

19 20 21 21 22 23 24 24 29

a) Because n = 19 is odd, the median is *x*_{½(n+1)} = *x*_{½(19+1)} = *x*_{10} = 19

b) The median cuts the data into two sections of nine values each. The lower quartile is the median of section I and the upper quartile is the median of section II.

Section I: lower quartile = *x*_{½(9+1)} = *x*_{5} = 17

Section II: upper quartile = *x*_{10+½(9+1)} = *x*_{15} = 22

c)

Ex4. Lisa is working in a computer store. She sells the following number of computers each month:

Give the five number summary and box-and-whisker plot of Lisa’s sales.

solution:

We first order the data set.

Now we can read off the minimum as the first value (3) and the maximum as the last value (65). Next we need to determine the quartiles.

There are 12 values in the data set. Using the percentile formula, we can determine that the median lies between the sixth and seventh values, making it:

The first quartile lies between the third and fourth values, making it:

The third quartile lies between the ninth and tenth values, making it:

This provides the five number summary of the data set and allows us to draw a box-and-whisker plot.

Five number summary:

Minimum: 3

*Q*_{1}: 17.5

Median: 27

*Q*_{3}: 44

Maximum: 65

Box-and-whisker plot:

Ex5. Paul works as a telesales person. He keeps a record of the number of sales he makes each month. The data below show how much he sells each month.

Give the five number summary and box-and-whisker plot of Paul’s sales.

solution:

We first order the data set.

Now we can read off the minimum as the first value (1) and the maximum as the last value (60).

Next we need to determine the quartiles.

There are 12 values in the data set. Using the percentile formula, we can determine that the median lies between the sixth and seventh values, making it:

The first quartile lies between the third and fourth values, making it:

The third quartile lies between the ninth and tenth values, making it:

The five number summary is:

Minimum: 1

*Q*_{1}: 12

Median: 28.5

*Q*_{3}: 46.5

Maximum: 60

The box and whisker plot is:

Ex6. Determine the five number summary for each of the box-and-whisker plots below.

a)

solution:

The box shows the interquartile range (the distance between Q1 and Q3). A line inside the box shows the median. The lines extending outside the box (the whiskers) show where the minimum and maximum values lie. Reading off the graph we obtain the following five number summary:

Minimum: 15; *Q*_{1}: 22; Median: 25; *Q*_{3}: 28; Maximum: 35

b)

solution:

The box shows the interquartile range (the distance between *Q*_{1} and *Q*_{3}). A line inside the box shows the median. The lines extending outside the box (the whiskers) show where the minimum and maximum values lie. Reading off the graph we obtain the following five number summary:

*Q*_{1}: 92; Median: 98; *Q*_{3}: 100; Maximum: 101

Ex7. The data below shows the number of laptops sold by 15 sales agents during the last financial year.

a) Determine the median number of laptops sold.

b) Calculate the range of data.

c) Calculate the interquartile range.

d) Draw a box and whisker diagram for the data above.

Teaching notes:

a) Remind learners that to find the median, the data needs to be ordered.

b) Range = largest value – smallest value.

c) Find the upper quartile and lower quartile and subtract the lower quartile from the upper quartile.

d) Use the five-number summary and remember to make sure the scale is accurate.

solution:

Rearrange the data:

a) The median is 54 (the eighth of the 15 ordered values).

b) Range = 90 – 34 = 56

c) *Q*_{1} = 46; *Q*_{3} = 73 → *Q*_{3} – *Q*_{1} = 73 – 46 = 27

d) Five-number summary: 34 46 54 73 90

(90-34=56. Suggested scale: 1 cm=5 units)
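To draw the plot accurately with the suggested scale, each value in the five-number summary can be converted to a distance along the ruler. The sketch below assumes the minimum is placed at 0 cm; the function name `positions_in_cm` is made up for this illustration.

```python
def positions_in_cm(five, units_per_cm):
    """Distance in cm of each summary value from the minimum.

    Assumption: the minimum is drawn at 0 cm and the scale is linear.
    """
    start = five[0]
    return [(value - start) / units_per_cm for value in five]


# Five-number summary from Ex7 with the suggested scale 1 cm = 5 units
print(positions_in_cm((34, 46, 54, 73, 90), 5))
# [0.0, 2.4, 4.0, 7.8, 11.2]
```

With this scale the whole plot is 11.2 cm long, so it fits comfortably across a page, and the box runs from 2.4 cm to 7.8 cm with the median line at 4 cm.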