Good Taste in Dataviz



How do you make a data visualization not only good-looking but also easy to understand?

Many rules and recommendations have been formulated. One of the most important can be stated as follows:

The number of information-carrying dimensions displayed should not exceed the number of dimensions in the data.

The following figure shows the number of votes for A and B. The data has only one dimension, the number of votes, and the visualization has only one information-carrying dimension as well: the height of the columns.

Figure 1. The number of votes. Column chart.

If we add one more dimension to the chart, the same data may look like this:

Figure 2. The number of votes. Pie chart.

It is much harder to judge the difference between values represented this way. In this particular example it is still possible, but what about the next figure?

Figure 3. Cities of Europe that are or have been capitals of empires, kingdoms, or republics, represented in order of their population by William Playfair.

Not so easy to read accurately, is it?

In his book A Handbook For Data Driven Design, Andy Kirk gives the following example:

Figure 4. Comparison of Judging Line Size vs Area Size.

Guess how big B is if A is 10. In both cases the answer is 5, although this is far less obvious with the circles than with the bars. Our visual system makes better relative judgments of line length than of area.
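The arithmetic behind this effect can be sketched in a few lines. The sketch assumes that the circles in Figure 4 encode values by area, which is the usual convention but is not stated in the text; the function name and the reference diameter of 1.0 are hypothetical.

```python
import math

def diameter_for_value(value, reference_value, reference_diameter):
    """Diameter of a circle whose AREA is proportional to `value`.

    Area grows with the square of the diameter, so the diameter
    grows only with the square root of the value.
    """
    return reference_diameter * math.sqrt(value / reference_value)

a, b = 10, 5          # values from the example in the text
diameter_a = 1.0      # hypothetical drawing size for circle A
diameter_b = diameter_for_value(b, a, diameter_a)

bar_ratio = b / a                       # what the eye compares in the bars
circle_ratio = diameter_b / diameter_a  # what the eye tends to compare in the circles
print(f"bar length ratio B/A:      {bar_ratio:.2f}")
print(f"circle diameter ratio B/A: {circle_ratio:.2f}")
```

Under this assumption, halving the value halves the bar but shrinks the circle's diameter by only about 29 percent, so B looks much closer to A than it really is.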

What about donut charts? Also a big no-no! A donut chart forces the audience to compare one arc length with another. Do you really think that is easy?

Figure 5. The number of votes. Donut chart.

Use a bar chart instead.

Figure 6. The number of votes. Bar chart.

It is hard for humans to attribute values to two-dimensional areas. Whenever possible, avoid charts that encode values as areas.

When should we use a two-dimensional space? When the data is two-dimensional.
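As a minimal sketch of a one-dimensional encoding, the snippet below draws a text bar chart in which bar length is the only thing that varies with the value. The function name `text_bar_chart` and the chart width are hypothetical, and the vote counts 44 and 35 are borrowed from the values this article quotes for A and B in its 3D donut example; using them here is an assumption.

```python
def text_bar_chart(data, width=40):
    """Return one text line per category; bar length is proportional to the value."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # The only information-carrying dimension is the number of '#' characters.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines

votes = {"A": 44, "B": 35}  # assumed values, see the note above
chart = text_bar_chart(votes)
for line in chart:
    print(line)
```

Because both bars start at the same baseline, the difference between A and B can be read off directly from where the bars end.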

Figure 7. The number of votes. Line chart.

Figure 7 shows how the number of votes changes over time, from Q1 to Q4. The data has two dimensions: the number of votes and time. Two information-carrying dimensions in the chart are therefore entirely appropriate.

What can be worse than a two-dimensional space illustrating one-dimensional data? A three-dimensional space! Using 3D when the data has only one dimension undermines the viewer’s ability to judge values with acceptable accuracy.

Figure 8. The number of votes. 3D donut chart.

As you can see in Figure 8, 3D makes it much harder to form accurate judgments. The tilt of the isometric plane enlarges the front part of the chart and shrinks the back. The actual value of A is 44, and B is 35. But can you confirm visually that A is greater than B?

In her book Storytelling with Data, Cole Nussbaumer Knaflic gives the following example of the issue:
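One part of this distortion can be sketched numerically. The snippet below assumes a simple parallel projection in which tilting the chart only squashes the circle vertically; the function name and the squash factor of 0.5 are hypothetical. It does not model perspective or the visible side wall of a 3D chart, which are what favor the front over the back, so treat it as an illustration of the angular distortion only.

```python
import math

def apparent_width_deg(start_deg, end_deg, squash):
    """Apparent angular width of a slice after the circle is squashed vertically.

    `squash` is 1.0 for a chart viewed head-on and smaller for a stronger tilt.
    """
    def project(theta_deg):
        t = math.radians(theta_deg)
        # Only the vertical coordinate shrinks when the chart is tilted.
        return math.atan2(squash * math.sin(t), math.cos(t))

    width = math.degrees(project(end_deg) - project(start_deg))
    return width % 360  # keep the width positive when the slice crosses 180 degrees

squash = 0.5  # hypothetical tilt: the circle is drawn half as tall as it is wide
side = apparent_width_deg(-30, 30, squash)    # 60-degree slice pointing sideways
front = apparent_width_deg(240, 300, squash)  # 60-degree slice pointing at the viewer
print(f"side slice appears {side:.0f} degrees wide, front slice {front:.0f} degrees wide")
```

Two slices that carry exactly the same value end up with very different apparent angles depending only on where they sit on the tilted chart, which is why a visual comparison of A and B becomes unreliable.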

Figure 9. 3D Column chart.

In Excel's 3D column chart (Figure 9), the bar height is determined by an invisible tangent plane that intersects the corresponding height on the y-axis. What are the values for January and February? If you compare the bar height with the grid lines and follow it leftward to the y-axis, you might estimate a value of about 0.8. Hard to believe, but both plotted values are 1.

Try to read the values in the following figure. Is it possible?

Figure 10. 3D bar chart.

A very bad visualization. 

3D charts are not an effective choice if you want readers to trust what they see. Do not use 3D!

One legitimate application of 3D visualization is the piece shown in Figure 11. It plots the trajectory of every home run hit by Kris Bryant of the Chicago Cubs during 2017, including the height, distance, and landing position of each shot. Here the data itself has three dimensions.

Figure 11. Three dimensions of data (baseball home run trajectories) in a 3D space. Source: Andy Kirk, A Handbook For Data Driven Design.

Conclusion: using more than one dimension to show one-dimensional data is ineffective. It might work on very small data sets, but it often introduces errors or perceptual ambiguity and should therefore be avoided.