Box plots show the distribution or variation of a measure across multiple categories, groups or time intervals.

Typical box plot

A box plot is typically a rectangle (like a bar) with a line inside it that marks the centre of the distribution of data values, usually the median. In this basic form, the length of the box represents the range of values in the distribution: the upper bound equals the highest value and the lower bound equals the lowest value. A long box therefore indicates a wide spread of data values for that measurement. Boxes can be drawn horizontally or vertically.
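
As a rough illustration, the 3 values that define the basic box described above (lowest value, median and highest value) could be calculated as follows. This is only a sketch in Python using the standard library; the function name `range_box`, the trial names and the data values are hypothetical and chosen for illustration.

```python
import statistics


def range_box(values):
    """Return the bounds, centre line and length of a basic range box.

    Hypothetical helper: the box runs from the lowest to the highest
    value, and the centre line sits at the median.
    """
    data = list(values)
    if not data:
        raise ValueError("at least 1 data value is required")
    low, high = min(data), max(data)
    return {
        "low": low,                          # lower bound of the box
        "centre": statistics.median(data),   # line drawn inside the box
        "high": high,                        # upper bound of the box
        "length": high - low,                # longer box = wider spread
    }


# Made-up measurements from 2 trials of the same experiment.
trials = {
    "trial A": [40, 42, 41, 43],
    "trial B": [20, 65, 42, 91],
}
boxes = {name: range_box(values) for name, values in trials.items()}

for name, box in boxes.items():
    print(name, box)
```

In this made-up example, trial B would be drawn with a much longer box than trial A, because its values are spread over a wider range.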

Box plots are used in many scientific fields to compare multiple trials of the same experiment (to show consistencies or inconsistencies) and multiple trials of different experiments (to show changes, trends or patterns).

Box plots may have lines, called whiskers, extending from the boxes. Box-and-whisker plots are a specific kind of box plot that shows the distribution of data values using the median (the middle point of the data) and the 25th and 75th percentiles (the middle points of the lower and upper halves of the data). They also show other values, such as the minimum and maximum of all the data:

A box-and-whisker plot showing the median, 25th and 75th percentiles and the minimum and maximum data values.
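
The 5 values named in this caption can be calculated before any plotting is done. The following is a minimal sketch in Python using the standard `statistics` module; the function name `five_number_summary` and the sample data are hypothetical. Percentile conventions differ between software packages, so the `"inclusive"` method used here is an assumption, and other tools may give slightly different 25th and 75th percentiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return the minimum, 25th percentile, median, 75th percentile and maximum.

    Hypothetical helper for illustration only.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("at least 2 data values are required")
    # quantiles(n=4) returns the 3 cut points that split the data into quarters.
    # method="inclusive" is an assumed convention; it treats the data as
    # including the lowest and highest possible values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,          # 25th percentile
        "median": median,  # 50th percentile
        "q3": q3,          # 75th percentile
        "max": data[-1],
    }


# Made-up data values for a single group.
summary = five_number_summary([2, 4, 4, 5, 7, 9, 10, 12, 15])
print(summary)
```

Keeping the calculation separate from the drawing makes it easier to check which percentile convention was used before the values are passed to a charting tool.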

Alternatives to box plots

If the distribution or variation of a measure is not a key message of a graph, and mean or median values will sufficiently and accurately convey the necessary information (e.g. change over time or differences across groups), consider presenting these summary values in ways that are more easily understandable to readers. Suitable alternatives are horizontal bar graphs for discrete groups, vertical bar graphs for small time series and line graphs for extended time series.

Box plot versus bar or line graph

Typically, box plots should only be considered when more than 1 data point is available for each group or time point, and displaying only summary data, such as mean or median values, would be misleading or inappropriate. Box plots can also be considered when the distribution or variation of data for a group or time point is of particular importance to the data message – for example, when there are differences in data variability over time or across groups.

Caution! Box plots are unfamiliar to most nonscientific readers, as well as to readers in many scientific disciplines that do not routinely use this graph type. When deciding whether to use box plots, weigh reader experience against the nature of the data message you want to convey.