Do you want to up your data visualization designs in 2026?

Hey folks,

As 2025 is winding down, I want to encourage you to think about your goals for 2026! For many people, designing an effective visualization and then implementing it with the tool of their choice is too much to take on at once. I think this is why many researchers recycle approaches that they see in the literature or that their mentors insist they use. Of course, this perpetuates problematic design practices. What if you could break out of these practices? What if you could tell your mentors, colleagues, reviewers, and anyone else what the strengths and weaknesses are of what you are trying to do versus what they are advising you to do?

I have spent a lot of time creating my own plots, critiquing those of others, and reading the ideas of leaders in the field of data visualization. As you know, I’ve shared many of these ideas in this newsletter and in my YouTube videos. I’m excited to work with you more directly. On January 9th (1-4 PM Eastern), I will be offering a 3-hour Zoom workshop introducing you to the principles that drive effective data visualizations in science. There will be no coding in this workshop. Aside from Zoom to watch along, all you’ll need is some paper and a pen - if you have different colored pens you’ll be in even better shape.

What will I talk about? I’ll tell you the importance of aligning your audience and the format with your data visualization. I’ll give you fancy language like pre-attentive attributes to help you talk with your colleagues about your visualizations. You’ll be (re)introduced to the grammar of graphics framework, which will enable you to dissect any data visualization. Finally, I’ll describe strategies to align the form and function of your visualizations.

Data visualization is hard! This interactive workshop will give you greater confidence to design your own visualizations that effectively convey your science to your audience. I’ll lead you through the material by sharing numerous examples from the popular media and scientific literature. You’re also encouraged to bring your favorite visualization to share with other participants and any visualizations you are already working on.

If this sounds like how you want to start your 2026, click the button below to learn more.

Learn more!

Because a single workshop isn’t enough to put the ideas into practice, I will also be making myself available for one-on-one and group coaching sessions. If you are interested in these sessions, please reply to this email.

Last week I introduced you to a cool microbial ecology paper recently published in Nature Microbiology by Bakkeren and colleagues, “Strain displacement in microbiomes via ecological competition”. On Monday I provided a critique of Figure 2 from this paper. As you may recall, in last week’s newsletter I discussed panels f and g from the figure. I also recreated these panels in Wednesday’s livestream. Today I want to talk about how I’d make panels b through e:

They’re all the same basic panel, each describing a different type of competition. What stands out to me about these panels is that they have a “cartoon” embedded in them to explain the panel’s experiment. I really thought this was slick. I especially liked how they used the same colors in the cartoon that they use for the points and the lines. It’s crystal clear to me that the red data is from the invading strain and the blue data is from the resident strain. How would we make this in R? For the sake of conversation, let’s just think about panel c.

This is actually a scatter plot. We could use geom_point() to plot the data. The time point being sampled would be mapped to the x-axis and the density to the y-axis. Whether the strain was the invader or the resident would be mapped to the color of the circle edge (shape = 21), and whether the strain was the wild type (WT) or the mutant (ΔsrlAEB) would be mapped to the fill color. As I mentioned in the critique video, I would actually prefer to use geom_jitter() to separate the points. It’s hard to tell, but some of the time points have 9 points and others have 5. Jittering the data would make it easier to see the number of points. Another issue with the original plot is that the four time points are evenly spaced even though the number of hours between 8 and 24 is not the same as between 24 and 48 or 48 and 72 hours. You could get the original appearance by treating the time point as a factor and using scale_x_discrete(). I’d prefer to use scale_x_continuous(). Of course, because the y-axis is on a log10 scale, we need to use scale_y_log10().
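Here is a minimal sketch of that scatter plot. I don't have the paper's data in hand, so the toy data frame below is simulated; its column names (hours, role, genotype, density), the number of replicates, and the jitter width are all assumptions made for illustration.

```r
library(ggplot2)

# Simulated stand-in for panel c. The column names and values are
# assumptions for illustration, not the data from the paper.
set.seed(1)
toy <- expand.grid(hours = c(8, 24, 48, 72),
                   role = c("invader", "resident"),
                   replicate = 1:5)
toy$genotype <- ifelse(toy$role == "invader", "WT", "mutant")
toy$density <- 10^runif(nrow(toy), min = 4, max = 9)

p <- ggplot(toy, aes(x = hours, y = density, color = role, fill = genotype)) +
  # shape 21 is a circle with a separate edge color and fill color;
  # height = 0 jitters the points horizontally only
  geom_jitter(shape = 21, width = 2, height = 0) +
  # a continuous x-axis keeps the true spacing between 8, 24, 48, and 72 hours
  scale_x_continuous(breaks = c(8, 24, 48, 72)) +
  scale_y_log10()
```

Printing p draws the plot. Swapping geom_jitter() for geom_point() would stack the replicates at each time point on top of each other, which is closer to the original.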

How about the line through the points? One approach would be to create a separate data frame that has the median density at each time point and then use that as the data for a call to geom_line(). But that’s tedious. Instead, we need to learn about stat_summary(). You give stat_summary() a fun argument that indicates the statistical summary to apply to the data; median will work in our case. Then we need a geometry to represent the summary on the plot. We’ll use geom = "line". This will get us a line connecting the medians at the various time points. We actually want to call stat_summary() before geom_point()/geom_jitter() so the line goes behind the points.
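As a rough sketch of the stat_summary() approach, again on simulated data (the column names and values are assumptions, not the paper's data):

```r
library(ggplot2)

# Simulated stand-in data; column names and values are assumptions.
set.seed(1)
toy <- expand.grid(hours = c(8, 24, 48, 72),
                   role = c("invader", "resident"),
                   replicate = 1:5)
toy$density <- 10^runif(nrow(toy), min = 4, max = 9)

p <- ggplot(toy, aes(x = hours, y = density, color = role)) +
  # added first so the median line is drawn underneath the points
  stat_summary(fun = median, geom = "line") +
  geom_jitter(shape = 21, width = 2, height = 0) +
  scale_x_continuous(breaks = c(8, 24, 48, 72)) +
  scale_y_log10()
```

Because role is mapped to color, ggplot2 groups the data by role, so we get one median line for the invader and one for the resident without building a second data frame.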

Looking at the y-axis you might notice some interesting numbers. The title and the text both have numbers in superscripts. We can get that using the sup HTML tag. For example, 10<sup>6</sup> will render as 10⁶ when we use element_markdown() from {ggtext}. You can use the labels argument in scale_y_log10() to change 1e6 to 10<sup>6</sup>. Cool, eh?
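One way this could look is sketched below. The helper sup_label() is a name I made up for illustration, it assumes the breaks are exact powers of ten, and the four data points are invented.

```r
library(ggplot2)
library(ggtext)

# Hypothetical helper (not part of ggplot2 or ggtext): turns 1e6 into
# "10<sup>6</sup>". Assumes the breaks are exact powers of ten.
sup_label <- function(x) {
  ifelse(is.na(x), NA_character_,
         paste0("10<sup>", round(log10(x)), "</sup>"))
}

# Invented values for illustration only
toy <- data.frame(hours = c(8, 24, 48, 72),
                  density = c(1e4, 1e6, 1e8, 1e9))

p <- ggplot(toy, aes(x = hours, y = density)) +
  geom_point() +
  scale_y_log10(breaks = 10^(4:9), labels = sup_label) +
  # without element_markdown() the tags would show up as literal text
  theme(axis.text.y = element_markdown())
```

The same idea should work for the axis title: put the sup tag in the string you give to labs(y = ...) and set axis.title.y = element_markdown() in theme().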

Now that I’ve mentioned {ggtext}, let’s turn to the title of each panel. There are several packages that are useful for inserting images into {ggplot2} figures. The easiest approach I’ve settled upon is to use {ggtext} with the <img> tag. In this approach, you could use labs() and set the title= argument to be the string you want with the <img> tag. Assuming we have the image stored as cartoon_c.png, I could use the following string to set the title for panel c:

labs(title = "Invader private nutrient No toxins ")

The height value will scale the height of the image in pixels, so finding the right size will require some fiddling. Of course, we’ll need to use plot.title = element_markdown() in the theme() function to render the HTML. Cool, eh?
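To make the pieces concrete, here is a sketch of a title string with the <img> tag written out in full. The height of 40 and the placement of the <br> line break are guesses on my part that you would need to adjust, and cartoon_c.png has to exist in your working directory before the plot will render.

```r
library(ggplot2)
library(ggtext)

# The height of 40 and the <br> placement are assumptions for illustration.
panel_title <- paste0(
  "Invader private nutrient<br>No toxins ",
  "<img src='cartoon_c.png' height='40'>"
)

p <- ggplot() +
  labs(title = panel_title) +
  # element_markdown() tells ggplot2 to interpret the HTML in the title
  theme(plot.title = element_markdown())
```

Building p does not read the image file; that only happens when the plot is printed or saved.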

I’m planning on building this out in Wednesday’s livestream (9 AM Eastern), so be on the lookout for that video. While I’m talking about livestreams… Can I tell you how much I’m learning by doing these? In each of these a viewer will make a comment like, “Why don’t you do it this way?” or “If you do it this way, then you can do this”. In each of these cases, “this way” never occurred to me. I never would have tried it “this way” if I had been recording and editing videos like I was a year ago. If I’m learning, then I’m sure others are too!

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

minimalR Workshop

generalR Workshop

mothur Workshop

In case you missed it…

Here is a livestream that I published this week that relates to previous content from these newsletters. Enjoy!

Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
