Hey folks,

Earlier this week I sent out an email announcing a new interactive training opportunity. The goal is to give you more chances to hone your skills in a social setting. My experience with leading this approach has been excellent. I can’t wait to have you give it a try with me. Please let me know if you have any questions.

Let’s continue with our efforts to develop intuition about how to recreate plots that we see out in the wild! This week, I found an interesting box and whisker plot in the paper, “Unveiling the importance of heterotrophy for coral symbiosis under heat stress”, published in the journal mBio by Stephane Martinez and colleagues. Their Figures 1 and 2 are the same type of figure. Let’s look at Figure 1 together. I’ll let you wrestle with Figure 2 on your own.

Here’s Figure 1:

What’s going on in this plot? As we can see, the figure has two panels, A and B. These panels are analogous - they’re both box and whisker plots. These plots are great for displaying data that are not normally distributed. For those of you unfamiliar with this type of plot, the black horizontal line across each rectangle (i.e., the “box”) represents the median (i.e., the 50th percentile), and the bottom and top edges of each box represent the 25th and 75th percentiles. The difference between the 25th and 75th percentiles is the inter-quartile range (IQR). The lines extending upwards and downwards from the boxes (i.e., the “whiskers”) reach to the most extreme observed point that is no more than 1.5 times the IQR away from the edge of the box.

In the bottom panel, the “32 light” data has a point above the very short whiskers. The upper whisker is short because it ends at an observed point just above the 75th percentile. The other point, at about 5 on the y-axis, sits more than 1.5 times the IQR above the box, so it is drawn on its own as an outlier. The stars between pairs of treatments tell us that there was a statistically significant difference between the treatments indicated by the brackets under each star.
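If it helps to see the arithmetic behind a box and whisker plot, here is a minimal sketch in base R. The values in `x` are made up for illustration (they are not from the paper), and the variable names are my own. It assumes the 1.5 times IQR whisker rule described above and R's default method for computing quantiles.

```r
# Made-up measurements for a single treatment (NOT the paper's data)
x <- c(1.0, 1.2, 1.3, 1.4, 1.5, 1.6, 5.0)

# 25th, 50th (median), and 75th percentiles using R's default quantile method
q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)

# Inter-quartile range: distance between the 25th and 75th percentiles
iqr <- q[3] - q[1]

# "Fences" sit 1.5 * IQR beyond the edges of the box
upper_fence <- q[3] + 1.5 * iqr
lower_fence <- q[1] - 1.5 * iqr

# Whiskers stop at the most extreme observed values inside the fences
upper_whisker <- max(x[x <= upper_fence])
lower_whisker <- min(x[x >= lower_fence])

# Anything beyond the fences is drawn as an individual outlier point
outliers <- x[x > upper_fence | x < lower_fence]
```

With these toy values the upper whisker ends at 1.6, just above the 75th percentile of 1.55, and the value of 5 falls outside the fence. That gives a very short whisker with a lone point above it, much like the “32 light” box.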
My suspicion is that the researchers started with a data frame, How would we go about taking this data to generate the two panels? Let’s make them as two separate figures.

The easiest way to make the box and whisker plot - or just “boxplot” - is to use One thing to note is that the default fill for the boxes will be white. To get them to be gray, we need to use Of course you can set the labels on the x and y-axes using the There are a couple of other

To complete each of the panels, we now need to put the comparison brackets and stars on each comparison. Most people would hunt for a package to do this for them. Because I’m stubborn and like to practice using {ggplot2}, I’d draw them myself. I’d create the brackets using

Finally, most people would stop here and assemble the two figures in PowerPoint or some other monstrosity to reproducible research. But we are not most people. Are we?! After saving each figure to its own variable name (e.g., Here’s some code to generate
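As a rough sketch of the steps described above, here is one way a single panel might be built with {ggplot2}. Everything specific in it is an assumption: the `toy` data frame is randomly generated, the column names `treatment` and `value` are hypothetical, all treatment labels other than “32 light” are placeholders, and the star is placed by hand rather than coming from a statistical test. Drawing the bracket with `annotate()` is just one option and may not be the approach originally intended.

```r
library(ggplot2)

# Randomly generated stand-in data (NOT the paper's data)
set.seed(1)
toy <- data.frame(
  treatment = rep(c("25 light", "25 dark", "32 light", "32 dark"), each = 8),
  value = c(rnorm(8, 3, 0.4), rnorm(8, 2, 0.4),
            rnorm(8, 1.5, 0.4), rnorm(8, 1, 0.4))
)
# Keep the treatments in the order they were listed, not alphabetical order
toy$treatment <- factor(toy$treatment, levels = unique(toy$treatment))

# Height of a hand-placed bracket, a little above the tallest observation
bracket_y <- max(toy$value) + 0.3

panel_a <- ggplot(toy, aes(x = treatment, y = value)) +
  # Boxes are white by default; fill = "gray" shades them
  geom_boxplot(fill = "gray") +
  # Bracket between the 1st and 2nd treatments: one horizontal bar...
  annotate("segment", x = 1, xend = 2, y = bracket_y, yend = bracket_y) +
  # ...plus two short vertical ticks at its ends
  annotate("segment", x = c(1, 2), xend = c(1, 2),
           y = bracket_y, yend = bracket_y - 0.1) +
  # Star centered above the bracket (placed by hand, not from a test)
  annotate("text", x = 1.5, y = bracket_y + 0.1, label = "*", size = 6) +
  # Placeholder axis labels
  labs(x = "Treatment", y = "Measured value (placeholder units)") +
  theme_classic()
```

Printing `panel_a` draws the plot. A second panel could be built the same way and saved to its own variable, so the two can be combined in R rather than in PowerPoint.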
Hey folks, Did you know that March is Women’s History Month? Each year The Economist updates what they call the “Glass Ceiling Index”. This is a measure of “the role and influence of women in the workforce”. It’s an aggregate of ten factors including the gender gap in wages, work force participation, and higher education. Sadly, the article is behind a paywall. They also haven’t made their data publicly available. Regardless, you can get a static copy of the article through archiv.is. Here’s...
Hey folks, This has been a busy week! I’ve been on campus teaching a 3-day, all-day R class. It’s been a while since I’ve done one of these live workshops off campus. If you’re interested in me coming to your campus, you coming to Michigan, or being in a Zoom-based workshop, please let me know! I really love being able to interact with you all in workshops. If your experience has been at all like my own the past month or so, your conversations have all had a tinge of anxiety about the...
Hey folks, I really hope you enjoyed the series of newsletters and videos of me recreating the visualizations presented by W.E.B. DuBois at the 1900 Paris Exposition. I can’t express how much I enjoyed making them. Some of them were pretty tricky and required a lot of work. But I think it was worth it! It definitely forced me to use some new-to-me tools like geom_polygon() and geom_sf(). Please let me know what you thought of the series! I wonder if there’d be any interest in a companion to...