● Data are the pieces of information that have been observed and recorded from an experiment or a survey.
● Quantitative data are data that can be written as numbers. Quantitative data can be discrete or continuous.
● Qualitative data are data that cannot be written as numbers. There are two common types of qualitative data: categorical and anecdotal data.
● Mean. The mean of a data set, denoted by x̄, is the average of the data values. It is calculated as the sum of the values divided by the number of values in the set: x̄ = (sum of data values) / (number of data values).
● Median. The median of a data set is the value in the central position when the data set has been arranged from the lowest to the highest value. If there is an odd number of values, the median is equal to one of the values in the data set. If there is an even number of values, the median lies halfway between the two middle values.
● Mode. The mode of a data set is the value that occurs most often in the set.
● An outlier is a value in the data set that is not typical of the rest of the set. It is usually a value that is much greater or much less than all the other values in the data set.
● Continuous quantitative data can be grouped by dividing the full range of values into a few sub-ranges. By assigning each continuous value to the sub-range or class within which it falls, the data set changes from continuous to discrete.
● Dispersion is a general term for different statistics that describe how values are distributed around the centre.
● The range of a data set is the difference between the maximum and minimum values in the set.
● The pth percentile is the value, v, that divides a data set into two parts, such that p% of the values in the data set are less than v and (100 − p)% of the values are greater than v. In an ordered data set of n values, the pth percentile is found from its rank, which is calculated from p and n.
For more information about percentiles, see the post Explaining Quartiles with Percentiles.
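The definitions of the mean, median, mode, range and percentile above can be sketched in a few lines of Python. This is only an illustration: the function names and the `sample` data are made up for the example, and the percentile rank formula r = p/100 × (n − 1) + 1 with linear interpolation is one common convention assumed here, not necessarily the one the original formula used.

```python
from collections import Counter


def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def median(values):
    """Central value of the ordered data set."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the median is one of the data values.
        return ordered[mid]
    # Even count: halfway between the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """Every value that occurs most often (there can be more than one)."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


def data_range(values):
    """Difference between the maximum and minimum values."""
    return max(values) - min(values)


def percentile(values, p):
    """pth percentile using the ASSUMED rank formula r = p/100 * (n - 1) + 1.

    A fractional rank is resolved by linear interpolation between the
    two neighbouring ordered values.
    """
    ordered = sorted(values)
    n = len(ordered)
    rank = p / 100 * (n - 1) + 1   # 1-based rank, may be fractional
    lower = int(rank)              # whole part of the rank
    fraction = rank - lower
    if lower >= n:                 # p = 100 lands on the last value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


sample = [2, 4, 4, 5, 7, 9, 11]    # hypothetical data set

print(mean(sample))            # 6.0
print(median(sample))          # 5
print(modes(sample))           # [4]
print(data_range(sample))      # 9
print(percentile(sample, 50))  # 5
```

Under this convention the 50th percentile coincides with the median. Other textbooks and software packages use slightly different rank formulas, so results for small data sets can differ.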
● The quartiles are the three data values that divide an ordered data set into four groups, where each group contains an equal number of data values. The lower quartile is denoted Q1, the median is Q2 and the upper quartile is Q3.
● The interquartile range is a measure of dispersion, which is calculated by subtracting the lower (first) quartile from the upper (third) quartile. This gives the range of the middle half of the data set.
● The semi-interquartile range is half of the interquartile range.
● The five-number summary consists of the minimum value, the maximum value and the three quartiles (Q1, Q2 and Q3).
● The box-and-whisker plot is a graphical representation of the five-number summary.
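As a rough illustration of the quartiles, interquartile range and five-number summary, the sketch below uses `statistics.quantiles` from the Python standard library. The `five_number_summary` helper and the `sample` data are hypothetical, and the `"inclusive"` method is an assumption: quartile conventions vary between textbooks, so other methods may give different Q1 and Q3 values for the same data.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) for a data set.

    Quartiles are computed with the "inclusive" method, which is an
    assumed convention; other conventions can give other cut points.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


sample = [2, 4, 4, 5, 7, 9, 11]    # hypothetical data set

minimum, q1, q2, q3, maximum = five_number_summary(sample)
iqr = q3 - q1        # interquartile range: spread of the middle half
semi_iqr = iqr / 2   # semi-interquartile range

print(minimum, q1, q2, q3, maximum)  # 2 4.0 5.0 8.0 11
print(iqr, semi_iqr)                 # 4.0 2.0
```

These five numbers are what a box-and-whisker plot draws: the box spans Q1 to Q3 with a line at Q2, and the whiskers extend to the minimum and maximum.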