Presenting Data in Histograms – Statistics

DISPLAY OF NUMERICAL DATA

From previous courses you should be familiar with column graphs used to display discrete numerical variables.

When data is recorded for a continuous variable there are likely to be many different values. We organise the data in a frequency table by grouping it into class intervals of equal width.

A special type of graph called a frequency histogram or just histogram is used to display the data. This is similar to a column graph but, to account for the continuous nature of the variable, a number line is used for the horizontal axis and the ‘columns’ are joined together.

Column graphs and frequency histograms both have the following features:
● The frequency of occurrence is on the vertical axis.
● The range of scores is on the horizontal axis.
● The column widths are equal and the column height represents frequency.
The modal class, or class of values that appears most often, is easy to identify from the highest column of the frequency histogram.


CASE STUDY

While attending a golf championship, I measured how far Ethan, a professional golfer, hit 30 drives on the practice fairway. The results are given below in metres:

To organise the data, we sort it into groups in a frequency table.

When forming groups, we find the lowest and highest values, and then choose a group width so that there are about 6 to 12 groups. In this case the lowest value is 244.6 m and the highest is 277.5 m. If we choose a group width of 5 m, we obtain eight groups of equal width between values 240 m and 280 m, which cover all of the data values.

Suppose d is the length of a drive. The first group 240 ≤ d < 245 includes any data value which is at least 240 m but less than 245 m. The group 260 ≤ d < 265 includes data which is at least 260 m but less than 265 m. We use this technique to create eight groups into which all data values will fall.
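The grouping rule above can be sketched in a few lines of Python. This is only an illustration: the function name class_intervals is made up for this sketch, and it assumes the first group should start at the multiple of the group width just below the lowest value.

```python
import math

def class_intervals(lowest, highest, width):
    """Return (lower, upper) pairs of equal width that cover lowest..highest.

    Each pair stands for the group lower <= d < upper.
    """
    # Assumption: start at the multiple of `width` at or below the lowest value.
    lower = math.floor(lowest / width) * width
    intervals = []
    while lower <= highest:
        intervals.append((lower, lower + width))
        lower += width
    return intervals

# Lowest and highest drives from the case study, with a group width of 5 m.
groups = class_intervals(244.6, 277.5, 5)
for lower, upper in groups:
    print(f"{lower} <= d < {upper}")
print(len(groups), "groups")
```

With the case study values this gives eight groups running from 240 m to 280 m, matching the choice described above.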


From this table we can draw both a frequency histogram and a relative frequency histogram:


We can see that the modal class is 260 ≤ d < 265. The advantage of the relative frequency histogram is that we can easily compare it with other distributions with different numbers of data values. Using percentages allows for a fair comparison. Each graph should have a title and its axes should be labelled.

Example 1
A sample of 20 juvenile lobsters is randomly selected from a tank containing several hundred. Each lobster is measured for length (in cm) and the results are:
4.9, 5.6, 7.2, 6.7, 3.1, 4.6, 6.0, 5.0, 3.7, 7.3,
6.0, 5.4, 4.2, 6.6, 4.7, 5.8, 4.4, 3.6, 4.2, 5.4

a) Organise the data using a frequency table, and hence graph the data.
b) State the modal class for the data.
c) Describe the distribution of the data.
Solutions:
a) The variable ‘the length of a lobster’ is continuous even though lengths have been rounded to the nearest mm. The shortest length is 3.1 cm and the longest is 7.3 cm, so we will use class intervals of width 1 cm.
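As a rough check on the tallying, the frequency table for the 20 lobster lengths can be built with a short Python sketch. The variable names are chosen for this illustration only, and the relative frequency column is an extra, added in the same spirit as the relative frequency histogram in the case study.

```python
import math
from collections import Counter

# The 20 lobster lengths (in cm) from Example 1.
lengths = [4.9, 5.6, 7.2, 6.7, 3.1, 4.6, 6.0, 5.0, 3.7, 7.3,
           6.0, 5.4, 4.2, 6.6, 4.7, 5.8, 4.4, 3.6, 4.2, 5.4]

# With a class width of 1 cm, the lower bound of each class is the
# whole-number part of the length, so 4.9 falls in 4 <= l < 5.
counts = Counter(math.floor(x) for x in lengths)

table = []
for lower in range(3, 8):
    freq = counts[lower]
    rel = freq / len(lengths)
    table.append((lower, lower + 1, freq, rel))
    print(f"{lower} <= l < {lower + 1}: frequency {freq}, relative frequency {rel:.0%}")

# The modal class is the row with the highest frequency.
modal = max(table, key=lambda row: row[2])
print(f"Modal class: {modal[0]} <= l < {modal[1]}")
```

The tallies come out as 3, 6, 5, 4 and 2 for the five classes from 3 cm to 8 cm, which add up to the 20 lobsters in the sample.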


b) The modal class is 4 ≤ l < 5 cm as this occurred most frequently.
c) The data is positively skewed.

Histograms

● Histograms and frequency polygons are graphs used to represent grouped and continuous data. They show the frequency and the distribution (spread) of the data.
● Continuous data is data that is not just measured in whole numbers. For example, length, mass, volume and time are measured in continuous amounts.
● The horizontal axis of a histogram or a frequency polygon has a continuous scale.
● The vertical axis shows the frequency, or number of times the data is listed.

1. Grouped data:
Instead of recording every piece of data separately, we can group the data to make it easier to read. Grouped data can be represented on a histogram or a frequency polygon.
Example 2
A grocer wants to record the mass of each packet of chicken pieces he sells. He groups the masses into intervals of 0.2 kg and makes a frequency table.


2. Drawing a Histogram
From the frequency table, he draws up a histogram.

Note: A histogram is a graphical display of data using bars of different heights. It is similar to a bar graph, but in a histogram there are no gaps between the bars.
x-axis: use the upper limit of each interval
y-axis: frequency

Big Example

To the right is a table showing the length of applause after Mr Osas announces that there will be no homework tonight. Construct a Bar Chart and a Histogram, and comment on the differences.

1. Drawing a Bar Chart (Frequency Diagram)
Note: Sometimes bar charts are called Frequency Diagrams!
1. Decide on an appropriate scale to fit the paper you are working with – as a general rule, the bar chart (or any statistical diagram, for that matter) should take up between half and three-quarters of the space you have to work with.
Crucial: Your numbers must go up in equal steps!… see Scatter Diagrams for examples of some very dodgy scales!
2. Label your axes. It is usual to put frequency (total) on the y axis, and whatever the data is along the x axis.
3. Carefully draw in your bars… and add a title!
Note: In examples like this, where the groups are numbers, it is usual to have the bars touching each other.

If the groups were categories (such as “colour of cars”), then you could have gaps between your bars if you like!
A Bar Chart to show the Length of Applause after Mr Osas says “no homework”


2. Drawing a Histogram
The major difference between a bar chart and a histogram is what goes on the y axis.
On a Bar Chart it is Frequency.
On a Histogram it is… Frequency Density!
1. Add two extra columns to your table… Group Width and Frequency Density
2. To work out Group Width, we just subtract the lower limit of each group from the upper limit
3. To work out Frequency Density, we use this lovely formula:

Frequency Density = Frequency ÷ Group Width

4. We then plot frequency density on the y axis, and our data on the x axis as before
Note: In Histograms you have no choice… The bars must always be touching!
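The two extra columns can be worked out with a short Python sketch. The groups and frequencies below are hypothetical, made up purely for illustration; they are NOT the figures from the applause table. The function name frequency_density is also just a label for this sketch.

```python
# Hypothetical grouped data, invented for illustration only (NOT the applause table).
# Each row is (lower limit, upper limit, frequency).
groups = [(0, 1, 4), (1, 2, 9), (2, 3, 12), (3, 5, 8), (5, 10, 5)]

def frequency_density(lower, upper, frequency):
    """Frequency density = frequency divided by group width."""
    group_width = upper - lower
    return frequency / group_width

# Build the extended table: limits, frequency, group width, frequency density.
rows = []
for lower, upper, freq in groups:
    width = upper - lower
    density = frequency_density(lower, upper, freq)
    rows.append((lower, upper, freq, width, density))
    print(f"{lower} to {upper}: frequency {freq}, width {width}, density {density}")
```

Notice that the wide groups end up with small frequency densities, so they no longer dominate the picture when the densities are plotted on the y axis.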

A Histogram to show the Length of Applause after Mr Osas says “no homework”


Note: Sometimes you get asked to draw a Frequency Polygon. Do not fear, this is just a Histogram, but with the top mid-points of each bar joined together with straight lines!
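To see which points get joined up, here is a small Python sketch that finds the mid-point of each class. It uses the lobster lengths from Example 1, grouped into classes of width 1 cm with frequencies tallied from the 20 listed values; the function name polygon_points is made up for this illustration.

```python
# Class intervals (in cm) and frequencies for the lobster lengths in Example 1,
# tallied from the 20 values listed there: (lower limit, upper limit, frequency).
classes = [(3, 4, 3), (4, 5, 6), (5, 6, 5), (6, 7, 4), (7, 8, 2)]

def polygon_points(classes):
    """Return one (mid-point, frequency) pair per class interval."""
    return [((lower + upper) / 2, freq) for lower, upper, freq in classes]

points = polygon_points(classes)
for midpoint, freq in points:
    print(f"plot ({midpoint}, {freq})")
```

Joining these points in order with straight lines gives the frequency polygon for that data set.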

3. What is the Point of Histograms?
● Have a quick look back at the two diagrams
● On the Bar Chart, which group looks like it has the most people in it?… maybe the last one?… because it takes up such a large area compared to the other groups… but if you look at the table of data, this group only had a total of 5!
So… Bar Charts can be deceptive
● But look at the Histogram… the group that looks the biggest now is the “between 2 and 3 minutes” group… and this is the group that has the highest frequency!
● The reason for this is that in a Histogram, the areas of the bars are proportional to the frequency, and not just the height like a bar chart.
Note: If all our groups had the same width, it wouldn’t matter, but often they do not, so that is why Histograms tend to be used more than Bar Charts!
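The point about areas can be checked numerically. The sketch below uses hypothetical groups with unequal widths (invented for illustration, NOT the applause data) and compares the area of each bar when the height is the frequency with the area when the height is the frequency density. The function name bar_areas is made up for this sketch.

```python
# Hypothetical grouped data with unequal widths, invented for illustration only.
# Each row is (lower limit, upper limit, frequency).
groups = [(0, 1, 4), (1, 2, 9), (2, 3, 12), (3, 5, 8), (5, 10, 5)]

def bar_areas(groups, use_density):
    """Area of each bar: width x height, where height is either the
    frequency (bar chart) or the frequency density (histogram)."""
    areas = {}
    for lower, upper, freq in groups:
        width = upper - lower
        height = freq / width if use_density else freq
        areas[(lower, upper)] = width * height
    return areas

bar_chart = bar_areas(groups, use_density=False)
histogram = bar_areas(groups, use_density=True)

# Which group has the largest area in each diagram?
largest_bar_chart = max(bar_chart, key=bar_chart.get)
largest_histogram = max(histogram, key=histogram.get)
print("Largest area on the bar chart:", largest_bar_chart)
print("Largest area on the histogram:", largest_histogram)
```

In this made-up data the widest group has the largest area on the bar chart even though its frequency is small, while on the histogram every area equals the frequency, so the group with the highest frequency looks the biggest.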
