The surging need for representing the variation in changing sets of data has paved the path for a wide range of graphical methods.
Although in most cases, an analysis carried out by histograms offers a sufficient display, a box and whisker plot is increasingly preferred to render additional insights for letting multiple data sets be shown in the same graph.
Box and whisker plots, also regarded as box plots may appear to be primitive when compared to histograms or density plots. These charts take up less space and are useful in instances when comparing distributions between multiple groups or datasets.
In addition, the box and whisker plots can be considered very effective as they are easy to read. They not only offer the ability to summarize data derived from multiple sources but also exhibit the results within a single graph.
Another aspect that makes these plots unique is that they ensure the comparison of data that spans across different categories to offer easier and more effective decision-making.
Box and Whisker Plots
Box and whisker plots can also be touted as a convenient method for visually depicting the data distribution through their quartiles. They are also best known for displaying the five-number summary, inclusive of the minimum, first and third quartile, median, and the maximum of a data set.
In this chart, a box is drawn starting from the first quartile towards the third quartile and then a vertical line traverses through the box at the median. The whiskers go from each of the quartile to the maximum or minimum.
These plots could either be drawn vertically or horizontally. To elaborate further, the lines that extend parallel from the boxes are addressed as whiskers, as they are incorporated to indicate the variability on the outside of the upper and lower quartiles. Also, outliers are at times plotted as individual dots in line with the whiskers.
Types of Box and Whisker Plots
Although the box and whisker plots are shaped to offer information on the shape, variability, and centers of several statistical data sets, they are also widely employed for explanatory data analysis. Mentioned below are the commonly used types of plots.
- Variable-width Box and Whisker Plots
- Notched Box and Whisker Plots
Steps involved in creating a Box and Whisker Plot in Microsoft Excel
The MS Excel application is extensively deployed for the creation and designing of a box and whisker plot, on account of its simplicity and user-friendly interface.
The generation of box and whisker plots in the MS Excel application deals with the execution of a few miniature steps that simply navigate through the workspace with the click of a mouse. The procedure revolves around the calculation of quartile values and the quartile differences from the source data set.
To generate a box and whisker plot in the application, let us take into account a data set comprising scores of students listed in the 2nd column of the Excel sheet as shown below.
The foremost step is to find the values of the minimum and maximum value, the median, the first quarter (Q1) and the third quarter (Q3) by using the inbuilt formulas. The value for the minimum and maximum scores as well as the Median is calculated with the help of the MIN, MAX and Median functions respectively.
While the value for Q1 and Q3 is calculated with the Quartile Function as shown below.
The next step is to calculate the difference between each of the phases. To elaborate this, the new column will enlist the minimum score, the difference between the first quartile and minimum value, the difference between the median and first quartile, the difference between the third quartile and median, and the difference between the maximum value and third quartile. This is as shown below.
Now selecting all the values in the third column, go to the menu bar, click on Insert and from the chart options, choose Stacked Bar Chart to pop up the graph on the screen.
Now, select the graph and navigate to the Switch Row/Column option on the status bar as shown below.
The chart will now appear as follows:
As we do not need the minimum value in making the plot, select it from the graph, and then right click and change its color to No Fill.
The next step is to add the whiskers on the far left and far right. For this, click on the first box and then go to Add Chart Element, then choose Error Bars and then click on More Error Bar Options as shown below.
Once done, in the Horizontal Error Bar options, alter the direction value to Minus, and change the Fixed Value to 133 (the difference between the first quartile and the minimum value) as shown.
Now select the yellow box and repeat the same step as above to add the error bar. However, here in the Horizontal Error Bar options, change the direction to Plus, and make the Fixed Value 256 (i.e. the difference of the maximum value and third quartile) as seen.
Now, similarly, alter the color of the last box to No Fill as pictured below.
The chart will now appear as follows.
The next step is to customize the chart with regards to titles and text fonts as well as highlighted values.
Then the final and most important part in the whole procedure is to change the color of one of the boxes to match the other.
Finally, add borders to both the boxes to differentiate them. For this, select one of the box, right click and then go to Format Data Series. In the Border options, choose Solid Line, increase the line width and change the line color to Black.
Repeat the same process for the other box. The final chart will appear as shown in the below picture.