Leverage your experimental design to improve the visualization of your data

Hey folks,

I’ve been getting asked to give more talks about data visualization and my experiences critiquing visualizations. It’s been a lot of fun to engage with live audiences. I enjoy learning about their experiences, motivations, and limitations. As much as I love this newsletter and the content I post to YouTube, it’s clear that it isn’t a substitute for talking to people without the filter of email or a chat box. So, if you’re interested in working with me on an individual or group level to improve your data visualizations, let me know. I provide free 30-minute exploratory meetings to discuss how we might work together to design the figures for your next paper, talk, or poster. You can sign up by clicking the button below!

The talks I’ve given have forced me to synthesize my observations about the common challenges people seem to face when visualizing their data. I don’t just mean how to do A with tool X (or R!). Rather, I mean how to visually translate what they are trying to say with words. Surprisingly, many people find it challenging to arrange treatment groups in a way that facilitates the comparison they want to make. With last week’s panels I showed how the authors wanted us to compare panels i, k, and m to each other. Why not make those panels i, j, and k? Why not put the data from those three panels into one panel? Doing so would have made it so much easier to compare the data in those three panels.

This week I have a panel with a similar problem. The authors actually had a “built in” way to link their data but didn’t take advantage of it. Here’s panel d from Figure 3 of the paper “Acarbose redirects gut microbiome utilization of dietary carbohydrates to suppress anaphylaxis in mice” which was recently published in Nature Microbiology.

In this experiment the authors sensitized four groups of 8 mice to ovalbumin for 21 days. For the next 7 days each group was either (i) left untreated (i.e., Ctrl), (ii) treated with acarbose (i.e., Acr), (iii) treated with antibiotics (i.e., Abx), or (iv) treated with acarbose and antibiotics (i.e., Abx + Acr). They obtained fecal pellets from the 32 mice at 21 days (i.e., Before) and 28 days (i.e., After).

As you can see from the comparison bars in this figure, they don’t really know what they should be comparing. They should be comparing the same group (e.g., Abx + Acr) between the two time points. The only group they do this for is the acarbose group. The three other comparisons they made were between the after control group and the after acarbose, antibiotics, and acarbose + antibiotics groups.

I’m not sure why they did it this way. Their layout makes it harder to compare the same group across time points. Also, I’m positive they used the incorrect test. The caption indicates they used a “one-way ANOVA with Dunn’s multiple-comparison post hoc test”. They should have used a paired t-test. This likely would have given them smaller p-values. Let me explain each of these points.

First, the best control for each of the treatments in the after group was the same treatment in the before group. The samples were collected from the same mice. If I were in a study looking at the impact of antibiotics in humans, the best control for me after antibiotics would be me before antibiotics - not a separate group of individuals not receiving antibiotics. Sometimes a separate control group is necessary, as in a retrospective study. But not here, where they collected before and after samples from each animal. So put those data next to each other. It helps the audience, but it also does a better job of reflecting how the experiment was performed. If you are drawing comparison bars across multiple groups to indicate the comparison you want to show, consider whether those groups could actually be next to each other.

Second, a paired test is preferred to a one-way ANOVA. Again, because we have before and after data we can test the change in the Shannon index, which is what the authors are interested in. It’s subtle, but the test they performed tested for a difference in the Shannon index between groups rather than a change within each animal. The paired test would be more powerful because it effectively controls for the initial variation in Shannon indices across animals. Also, they don’t seem interested in the comparison across the four groups, so I’m not sure the post hoc test is necessary. The comparisons they made could be done with four paired t-tests. Alternatively, if they were interested in comparing the change in the Shannon index between the four treatment groups, then they could have done the one-way ANOVA with the test for multiple comparisons using the before and after differences in Shannon indices.
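To make the point about power concrete, here is a minimal sketch in Python using only the standard library. The Shannon values are invented for illustration and are not the paper's data, and the function names `paired_t` and `welch_t` are hypothetical helpers. The unpaired Welch statistic is only a stand-in for "a test that ignores the pairing"; it is not the ANOVA with Dunn's test that the authors reported.

```python
import math
import statistics


def paired_t(before, after):
    """Return the paired t statistic and degrees of freedom for after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Each animal serves as its own control: analyze the per-animal change
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def welch_t(x, y):
    """Return the unpaired (Welch) t statistic for mean(y) - mean(x)."""
    # Animal-to-animal variation stays in the denominator here
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / se


# Hypothetical Shannon indices for 8 mice (invented, NOT from the paper)
before = [3.1, 3.6, 2.8, 4.0, 3.3, 3.9, 2.9, 3.5]
after = [2.7, 3.3, 2.5, 3.5, 3.0, 3.4, 2.6, 3.0]

t_paired, df = paired_t(before, after)
t_unpaired = welch_t(before, after)

print(f"paired t = {t_paired:.2f} on {df} df")
print(f"unpaired t = {t_unpaired:.2f}")
```

With these made-up numbers every mouse drops by a similar amount even though the mice start at quite different values. The paired statistic only sees the consistent drops, so it is far larger in magnitude than the unpaired one, which has to contend with the spread between animals.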

My suggestion for refactoring this panel would include several steps. First, I would put the four treatment groups across the x-axis. Within each treatment group I would dodge the before and after points and likely jitter the individual points. Second, I would use a different shape or color for the before and the after points. Third, I would draw a segment connecting the before and after points for each animal. Aside from making it easier to see the comparisons, this layout would better reflect the experimental design. Furthermore, by connecting the points, it would become much easier to see whether there was a downward trend in the Shannon index.

Dang. I think that's a much better figure. Wouldn't you know it, the p-values are smaller as well!
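For anyone who wants to try the layout described above (groups on the x-axis, dodged and jittered before and after points, one segment per animal), here is a rough sketch in Python. It is not the code behind my figure. The Shannon values are invented, only three mice per group are shown to keep it short, and `paired_segments` and `plot_paired` are hypothetical helper names. The plotting function assumes matplotlib is installed.

```python
import random

GROUPS = ["Ctrl", "Acr", "Abx", "Abx + Acr"]

# Hypothetical (before, after) Shannon indices per mouse; invented, NOT from the paper
shannon = {
    "Ctrl": [(3.4, 3.5), (3.1, 3.0), (3.8, 3.7)],
    "Acr": [(3.5, 3.0), (3.2, 2.8), (3.9, 3.3)],
    "Abx": [(3.3, 1.9), (3.6, 2.2), (3.0, 1.6)],
    "Abx + Acr": [(3.4, 1.7), (3.7, 2.0), (3.1, 1.5)],
}


def paired_segments(data, groups, dodge=0.15, jitter=0.04, seed=1):
    """Return one ((x_before, y_before), (x_after, y_after)) pair per animal."""
    rng = random.Random(seed)  # fixed seed so the jitter is reproducible
    segments = []
    for i, group in enumerate(groups):
        for before, after in data[group]:
            # Dodge: before sits left of the group position, after sits right
            x0 = i - dodge + rng.uniform(-jitter, jitter)
            x1 = i + dodge + rng.uniform(-jitter, jitter)
            segments.append(((x0, before), (x1, after)))
    return segments


def plot_paired(segments, groups, path="shannon_paired.png"):
    """Draw the segments and points with matplotlib and save to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # One gray segment per animal links its before and after samples
    for (x0, y0), (x1, y1) in segments:
        ax.plot([x0, x1], [y0, y1], color="gray", linewidth=0.8, zorder=1)
    # Different shapes distinguish the two time points
    ax.scatter([s[0][0] for s in segments], [s[0][1] for s in segments],
               marker="o", label="Before", zorder=2)
    ax.scatter([s[1][0] for s in segments], [s[1][1] for s in segments],
               marker="^", label="After", zorder=2)
    ax.set_xticks(list(range(len(groups))))
    ax.set_xticklabels(groups)
    ax.set_ylabel("Shannon index")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


segments = paired_segments(shannon, GROUPS)
print(f"{len(segments)} animals to draw")
# plot_paired(segments, GROUPS)  # uncomment to write the figure (needs matplotlib)
```

Keeping the coordinate calculation separate from the drawing means the jitter applied to each point is also the jitter applied to the end of its segment, so the lines always land on the points.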

I’ve shared panel d from this figure with you to illustrate these points. But, there are several other panels consisting of an ordination, a stacked bar plot, and several box plots that all suffer from similar issues. I encourage you to check out the other panels and see if you can’t draw what these plots would look like by arranging the data to more easily compare the before and after points.

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

minimalR Workshop

generalR Workshop

mothur Workshop

In case you missed it…

Here is a livestream that I published this week that relates to previous content from these newsletters. Enjoy!

Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development

Keep your writing and your visuals simple

What is causing the large number of figures in modern papers?

Creating a dendrogram that spans multiple facets
