|
Hey folks, One of the benefits of sending out these newsletters and making my YouTube videos is that I get a ton of practice. I can’t emphasize how much practice has paid off in learning to use dplyr, ggplot2, and other packages. Reproducing published figures has really helped me to dive into parts of ggplot2 that I wouldn’t normally use because I make plots that use the features of ggplot2 that I know. By expanding my knowledge of ggplot2, I’m finding that the plots I make from scratch are more varied and sophisticated than they would normally be. I hope I’m not bragging here - my point is that there’s no reason you couldn’t be doing the same thing. Grab your favorite journal, find a figure, think through what’s standard, what’s novel, and how you would approach the figure. Then go through the process of creating it yourself. Finally, change what you don’t like in a new version of the plot. I’d love it if you shared with me any of your (re)creations! This week, a figure by Alessandro Lo Sciuto and colleagues caught my eye in their paper, “A molecular comparison of [Fe-S] cluster-based homeostasis in Escherichia coli and Pseudomonas aeruginosa”. The paper was published in mBio. The paper looks at the iron limitation physiology of E. coli and P. aeruginosa and how they use iron-sulfur clusters. The figure that I was interested in was Figure 2C This set of figures describes the sensitivity of P. aeruginosa to stress, specifically antibiotic stress. The wild type strain (PAO1) was sensitive to these drugs. They created a mutant that had the iscU gene deleted but added back with an arabinose-inducible promoter. When there was no arabinose in the media (ΔiscU ParaiscU) the strain was also sensitive. But when they added arabinose to the media, the strain was resistant (ΔiscU ParaiscU+ (+0.5% ARA)). That’s enough biology :) What type of figure is this? What do you think is interesting about how it was created? Is there any overlap with what these authors did with what you try to do with your figures? How much of this figure do you think you could figure out on your own? What would you be interested in learning to do? For discussion, let’s assume the data comes as a data frame with a column for the name of the First off, these four plots are all line plots with time in hours across the x-axis and cell density (CFU/mL) on the y-axis. They tested four antibiotics including gentamicin, ofloxacin, meropenem, and colistin. The legend tells us that the experiments were done in triplicate and that the mean is presented with error bars representing the standard deviation. For each plot in this panel, I’d create the general appearance using Second, I notice that the y-axis is on a log-10 scale. We can get a log scale by using Third, normally I would draw a line to indicate where the limit of detection was using Fourth, sticking with that y-axis, I noticed that the y-axis labels are 5 times ten to a power (e.g., 5x106) rather than the typical 1 times ten to a power (e.g., 106). They also have minor tick marks. The minor tick marks are a common feature in log scaled axes, but I realized I rarely see these in R plots. They certainly haven’t ever been on any of my plots. We can adjust both of these in Finally, they appear to have generated these plots as four separate plots. This gets us redundant x and y-axis titles and text. Regardless, we can use our friend the patchwork package to assemble the four plots into a single figure. It is interesting that they have a common legend for the four plots at the bottom of the figure. I know there’s a vignette on the patchwork website showing how to share legends across figures. I’d check that out to duplicate their legend. Of course, there’s a number of small details that I’m skipping over here. Things like how they have a greek letter, subscripts, italics in their strain names or how they title of each plot is set off to the right side of the plot. All of these things can be modified adjusting the
|
Hey folks, Earlier this week, those of us in the US celebrated Memorial Day. For many, this marks the unofficial start of summer. I suppose the clock is now ticking until Labor Day, which is the unofficial end of summer. Let me be the jerk to tell you that you have 100 days left to accomplish all of your summer goals. I suspect that for many of you writing papers and putting together conference posters and talks are on your list of goals. Generating attractive visualizations of your data is...
Hey folks, I’ve been getting asked to give more talks about data visualization and my experiences critiquing visualization. It’s been a lot of fun to engage with live audiences. I enjoy learning about their experiences, motivations, and limitations. As much as I love this newsletter and the content I post to YouTube, it’s clear that it isn’t a substitute to talking to people without the filter of email or a chat box. So, if you’re interested in working with me on an individual or group level...
Hey folks, The more I peruse the literature, the more I see that researchers need help designing figures to help tell their stories. I don’t just mean the mechanics of creating a figure in R, Python, Prism, or Excel. Rather, if someone had a box of dry erase markers of various colors and they had to give a talk without any slides, what would they draw to tell their story? I don’t mean to trivialize the difficulties. It’s hard! There are many figures I’ve published that I wish I could have a...