Hey folks, One of the benefits of sending out these newsletters and making my YouTube videos is that I get a ton of practice. I can’t emphasize how much practice has paid off in learning to use dplyr, ggplot2, and other packages. Reproducing published figures has really helped me to dive into parts of ggplot2 that I wouldn’t normally use because I make plots that use the features of ggplot2 that I know. By expanding my knowledge of ggplot2, I’m finding that the plots I make from scratch are more varied and sophisticated than they would normally be. I hope I’m not bragging here - my point is that there’s no reason you couldn’t be doing the same thing. Grab your favorite journal, find a figure, think through what’s standard, what’s novel, and how you would approach the figure. Then go through the process of creating it yourself. Finally, change what you don’t like in a new version of the plot. I’d love it if you shared with me any of your (re)creations! This week, a figure by Alessandro Lo Sciuto and colleagues caught my eye in their paper, “A molecular comparison of [Fe-S] cluster-based homeostasis in Escherichia coli and Pseudomonas aeruginosa”. The paper was published in mBio. The paper looks at the iron limitation physiology of E. coli and P. aeruginosa and how they use iron-sulfur clusters. The figure that I was interested in was Figure 2C This set of figures describes the sensitivity of P. aeruginosa to stress, specifically antibiotic stress. The wild type strain (PAO1) was sensitive to these drugs. They created a mutant that had the iscU gene deleted but added back with an arabinose-inducible promoter. When there was no arabinose in the media (ΔiscU ParaiscU) the strain was also sensitive. But when they added arabinose to the media, the strain was resistant (ΔiscU ParaiscU+ (+0.5% ARA)). That’s enough biology :) What type of figure is this? What do you think is interesting about how it was created? Is there any overlap with what these authors did with what you try to do with your figures? How much of this figure do you think you could figure out on your own? What would you be interested in learning to do? For discussion, let’s assume the data comes as a data frame with a column for the name of the First off, these four plots are all line plots with time in hours across the x-axis and cell density (CFU/mL) on the y-axis. They tested four antibiotics including gentamicin, ofloxacin, meropenem, and colistin. The legend tells us that the experiments were done in triplicate and that the mean is presented with error bars representing the standard deviation. For each plot in this panel, I’d create the general appearance using Second, I notice that the y-axis is on a log-10 scale. We can get a log scale by using Third, normally I would draw a line to indicate where the limit of detection was using Fourth, sticking with that y-axis, I noticed that the y-axis labels are 5 times ten to a power (e.g., 5x106) rather than the typical 1 times ten to a power (e.g., 106). They also have minor tick marks. The minor tick marks are a common feature in log scaled axes, but I realized I rarely see these in R plots. They certainly haven’t ever been on any of my plots. We can adjust both of these in Finally, they appear to have generated these plots as four separate plots. This gets us redundant x and y-axis titles and text. Regardless, we can use our friend the patchwork package to assemble the four plots into a single figure. It is interesting that they have a common legend for the four plots at the bottom of the figure. I know there’s a vignette on the patchwork website showing how to share legends across figures. I’d check that out to duplicate their legend. Of course, there’s a number of small details that I’m skipping over here. Things like how they have a greek letter, subscripts, italics in their strain names or how they title of each plot is set off to the right side of the plot. All of these things can be modified adjusting the
|
Hey folks, I’ve now produced three livestream videos. What do you think? Do you watch them live or watch them later? Or are they too long? I’m looking for honest feedback! I have to admit that if I hadn’t livestreamed these videos, they would not have been produced. It’s nice that I can more or less record and post without any editing. This is still a bit of an experiment. I think fewer people are watching the episodes which makes me worry that this might be an overall step backwards for you...
Hey folks! Do you ever get that feeling where you’re scared to try something? But then you do it anyway… and it turns out way better than you expected? Well that was me on Wednesday morning. I ran my first livestream on YouTube recreating a ridgeline plot from Our World in Data showing the US baby boom. I wrote about it here in the newsletter back in May. The full session was about 2.5 hours. YouTube tells me that 272 people popped in at some point during the session. To be honest, I really...
Hey folks, I need your feedback on an idea! Don’t worry, there’s some visualization stuff at the bottom. I had a video nearly ready to post this week using a ridgeline plot to show the baby boom. I think I did a great job of recreating the plot. But through a series of unfortunate events, I lost the video. I actually recorded the video three times because my computer kept crashing as I was recording it. This was on top of increasing busyness on my part with teaching, proposal writing,...