Hey folks,

This week I hosted the first live ensemble programming session. It went really well. We had fun and learned a lot. If you’d like to get in on these types of sessions, let me know and I’ll make sure you get a special invitation for the next series. I really believe that this form of instruction is critical to making the material learned in compact workshops stick for the long term.

I hope you had fun working with the broken axis chart last week! This week I want you to look at Figure 5 of “Strategies for effective high pressure germination or inactivation of Bacillus spores involving nisin” by Rosa Heydenreich and colleagues, which was recently published in Applied and Environmental Microbiology.

You probably would like a little context. This is from a paper looking at using pressure to get bacteria to form spores or leave the spore state. The analysis was done before and after a heat treatment (as indicated in the legend) using four different methods (across the x-axis). They measured the number of spores observed for each condition and expressed it as the log fraction of the number of spores put into the experiment (N0 = 10^9). The error bars indicate the standard deviation across at least three independent experiments.

What type of plot is this? What stands out to you about this figure? What do you like about it? What don’t you like about it? Can you outline the steps you would take to generate the figure? What are some of the steps you aren’t sure about and would like to learn? These are questions that I’d strongly encourage you to ask about any visual you are looking at because I think they’ll help you to develop your “taste” in data visualizations and strengthen your skills in generating those visualizations.

This is a bar plot. Here are five things that caught my eye. First, this bar plot has its x-axis at the top and descends into negative log values. Second, they have hatching in the bars for the “after heat” category.
Third, their legend is below the plot, has italics, and has a box around it. Fourth, they only have horizontal grid lines with a thicker, dashed grid line to indicate the limit of detection at -8. Finally, I noticed that the tick marks point into the plot rather than out of it, which is the usual default.

Here’s some data for you to experiment with:
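To make the y-axis concrete, here is a minimal sketch in R of how a spore count becomes the log fraction shown in the figure. It assumes the log is base 10, which fits N0 = 10^9 and a detection limit at -8. The counts in `n_observed` are placeholders I made up for illustration, not values from the paper.

```r
# N0 comes from the text: 10^9 spores put into each experiment.
n0 <- 1e9

# Hypothetical surviving-spore counts, invented purely for illustration.
n_observed <- c(1e9, 1e6, 1e3, 10)

# Log fraction of the starting spores, assuming a base-10 logarithm.
log_fraction <- log10(n_observed / n0)
log_fraction
```

Under that assumption, every tenfold drop in spores is one unit down the axis, and the limit of detection at -8 would correspond to roughly 10 spores remaining out of 10^9.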
First, let’s talk about the bar plot. You may be tempted to use …

The second eye catcher is that they have diagonal lines for the bars representing what happened after the heat treatment. I think this general look comes to us from many years of using M$Excel. My personal preference would be to leave out the diagonal hatching since I think it unnecessarily clutters the bars. Why not use the two shades of blue and call it a day? Anyway, there is a cool looking …

Third, they were able to format their treatment categories so that they could nicely tuck the legend on the left side of the axis. How’d they do that? I’d likely use …

Fourth, they have done some interesting things with their grid lines. If you use the …

Finally, the plot is doing interesting things with the x-axis ticks by having them go into the plot and by removing them from the y-axis. How would you do that? If your mind went to …

There’s a lot of cool stuff going on in a relatively simple plot! I’m not sure what software they used to make this plot, but it has some really nice points. The more I looked at this figure, the more things I noticed are different from the default …

As always, if you have a cool plot you’d like to share with me for a future newsletter, feel free to reply to this email. Oh yeah, that …
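Pulling the five features together, here is one rough sketch of how they might be approximated in R with ggplot2. Treat it as a starting point under stated assumptions rather than the way the original figure was made: the data frame `spores`, its column names, the "Method A" to "Method D" labels, and every number in it are hypothetical placeholders, and `linewidth` assumes ggplot2 3.4.0 or newer. I left out the hatching and used two shades of blue instead, since base ggplot2 has no built-in pattern fill.

```r
library(ggplot2)

# Placeholder values invented for illustration; these are NOT the paper's data.
spores <- data.frame(
  method       = rep(c("Method A", "Method B", "Method C", "Method D"), each = 2),
  heat         = rep(c("before heat", "after heat"), times = 4),
  log_fraction = c(-1.0, -3.5, -2.0, -5.0, -0.5, -6.5, -3.0, -8.0),
  sd           = c(0.3, 0.5, 0.4, 0.6, 0.2, 0.7, 0.5, 0.0)
)
spores$heat <- factor(spores$heat, levels = c("before heat", "after heat"))

p <- ggplot(spores, aes(x = method, y = log_fraction, fill = heat)) +
  # Side-by-side bars that descend from zero into negative log values
  geom_col(position = position_dodge(width = 0.9)) +
  # Standard deviation error bars, dodged to line up with their bars
  geom_errorbar(
    aes(ymin = log_fraction - sd, ymax = log_fraction + sd),
    position = position_dodge(width = 0.9), width = 0.3
  ) +
  # Thicker dashed line marking the limit of detection at -8
  geom_hline(yintercept = -8, linetype = "dashed", linewidth = 0.8) +
  # Put the x-axis at the top of the panel
  scale_x_discrete(position = "top") +
  # No padding above zero so the bars hang from the top axis
  scale_y_continuous(expand = expansion(mult = c(0.05, 0))) +
  scale_fill_manual(
    values = c("before heat" = "#1f78b4", "after heat" = "#a6cee3")
  ) +
  labs(x = NULL, y = "log10(N / N0)", fill = NULL) +
  theme_bw() +
  theme(
    # Horizontal grid lines only
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    # A negative length makes the x-axis ticks point into the plot
    axis.ticks.length.x = unit(-0.15, "cm"),
    # No tick marks on the y-axis
    axis.ticks.y = element_blank(),
    # Legend below the plot, in italics, with a box around it
    legend.position = "bottom",
    legend.text = element_text(face = "italic"),
    legend.background = element_rect(colour = "black")
  )

p
```

Most of the work happens in `theme()`, so that is a good place to start tweaking if you want to get closer to the published figure. The tick length and the dodge width are guesses and will likely need adjusting.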