Hey folks, I’d love to hear your experiences trying to recreate the figures I’ve been discussing in recent newsletters. Does a “verbal” description of my thought process for each figure help? Can you pick a figure and do it yourself? What are the biggest obstacles to translating between the verbal description and actual code? Feel free to reply to this email to let me know how you like this approach. Also, if you have a figure you’d like me to walk through, I’d love that too!

This week I want you to look at Figures 3E and 3F from “Preexisting cell state rather than stochastic noise confers high or low infection susceptibility of human lung epithelial cells to adenovirus” by Anthony Petkidis and colleagues, which was recently published in mSphere. There isn’t anything super special about this set of figures. It’s the general style of the figures that I want to draw your attention to. You’ll notice that the x-axis in both figures is broken. As I go looking for papers, I find broken axes like these in a lot of them. Often they are broken y-axes, but as this case shows, people break the x-axis too.

Why do people break an axis?
In this case, we see that there was a jump in the data between weeks 4 and 8. Instead of having 3 empty positions in their figures, they cut that out. When people break the y-axis, they often have a big difference between the values for different treatments. Perhaps A is super high and B and C are much lower, but C is greater than B. The author wants to call attention to the relationship between all three treatments.

Why shouldn’t you break an axis?
The honest truth is that breaking axes is generally considered a poor data visualization practice. That’s because although the axis is clearly indicated as broken, the human eye and mind will quickly forget that and make comparisons between points based on their distance from each other.

How would you get around the need for a break?
One thing these authors could have done would have been to make facets of the early and later time points and draw boxes around both groups. They’d share a y-axis, but there’d be a stronger indication that there’s a jump in the data. For a break in the y-axis, you might consider a log-scaled axis or some other transformation that compresses the difference in the data. Alternatively, you might set the limit on the y-axis to highlight the difference between B and C and let the value for A either be hidden or, for a line or bar plot, extend outside the plot with an annotation indicating the value of A. You might also ask: if A is so much larger than B and C, does the difference between B and C really matter? But what do I know? :)

How would I go about creating a break in an axis?
First of all, I’m so sure there’s a package out there to do this for you that I’m not going to bother with the Google search. Again, my goal with these discussions isn’t to solve specific problems, but to help you think more generally about how to solve problems with R. Let’s start by assuming we have a data frame that looks something like this…
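The original table didn’t make it into this copy, so here is a made-up stand-in rather than the data from the paper. It assumes a hypothetical data frame called `infection` with a `day` column (1 through 4, then 8 and 9) and an invented `value` column. The names and numbers are mine, chosen only so there is a jump on the x-axis to work with.

```r
# Hypothetical stand-in data: the column names and values are invented for
# illustration and are not taken from the paper.
infection <- data.frame(
  day   = c(1, 2, 3, 4, 8, 9),
  value = c(5, 12, 20, 26, 41, 44)
)

# Print the data frame to see what we are working with
infection
```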
Of course, we would plot this with…

My general idea is that I need to pull the data together on the x-axis with space for one piece of missing data. I can pull everything together by recasting the…

Now we have pulled days 8 and 9 back towards day 4 with a gap in between. We’d like to get rid of the 5 on the x-axis. We can probably do that with…

Next, I’d apply…

If you get that to work right, you’ll notice a problem. Do you see the problem? It appears that the data are plotted under the axis rather than on top. What I wanted to do would be to make that line white so that it masks the axis. But because it is under the axis, that strategy won’t work. Now we need another solution.

That other solution would be to remove the x-axis entirely and draw a new one made up of two segments. To remove the x-axis you can use the…

Got it? To be honest, before last week I didn’t know about…
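To make the steps above concrete, here is a minimal sketch of one way they could fit together in ggplot2. It is not necessarily the set of functions I had in mind above, since those were lost from this copy. The assumptions are the invented `infection` data, a hypothetical `position` column that holds the shifted x values, `annotate("segment", ...)` for the two hand-drawn pieces of the axis, and `theme(axis.line.x = element_blank())` for dropping the built-in axis line.

```r
library(ggplot2)

# Hypothetical data (invented values) with a jump from day 4 to day 8
infection <- data.frame(
  day   = c(1, 2, 3, 4, 8, 9),
  value = c(5, 12, 20, 26, 41, 44)
)

# Pull days 8 and 9 back towards day 4, leaving position 5 empty for the break
infection$position <- ifelse(infection$day > 4, infection$day - 2, infection$day)

# Where the hand-drawn x-axis sits; an arbitrary choice for this sketch
y_floor <- 0

broken_axis_plot <- ggplot(infection, aes(x = position, y = value)) +
  geom_point() +
  # Label each plotted position with its true day, so no tick appears at 5
  scale_x_continuous(breaks = infection$position, labels = infection$day) +
  # Pin the bottom of the panel to y_floor so the drawn axis sits on its edge
  scale_y_continuous(
    limits = c(y_floor, NA),
    expand = expansion(mult = c(0, 0.05))
  ) +
  # Draw the x-axis by hand as two segments with a gap around position 5
  annotate("segment", x = 0.5, xend = 4.6, y = y_floor, yend = y_floor) +
  annotate("segment", x = 5.4, xend = 7.5, y = y_floor, yend = y_floor) +
  # Keep the segments from being clipped at the panel edge
  coord_cartesian(clip = "off") +
  theme_classic() +
  # Remove the built-in x-axis line; the tick marks and labels stay
  theme(axis.line.x = element_blank())

broken_axis_plot
```

Because the axis line is removed rather than masked, it doesn’t matter whether annotations are drawn over or under the axis. The gap between 4.6 and 5.4 is only a visual choice, so widen or narrow it to taste.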