What is the value of recreating visuals you don't like?

Hey folks!

Before launching into this week’s visualization, I’m looking for a bit of feedback. Since November, I’ve settled into a new routine with this newsletter and the YouTube channel. Each week this newsletter introduces a visualization at a 30,000 ft view or discusses a specific topic in some depth (example). The following Monday I post a video critiquing the visualization (example). Then on Wednesday (or Tuesday like this past week), I livestream a video where I recreate the visualization and refactor or modify the visualization based on Monday’s critique (example). This is working well for me. I’m honestly surprised that I’m able to find new things each week in Nature and its upper echelon journals without too much effort. How is this flow working for you? Could you reply to this email and let me know?

I’m curious what you think of the livestreams. The metrics YouTube gives me for livestreams are not as strong as they were for my recorded and edited videos. But, I honestly don’t have the time to produce the more polished videos. What could I do to make the livestreams more effective and engaging? A question I asked in the community forum was whether people wanted me to only do the refactoring or if it was ok to do both the recreation and refactoring even if it was clear I didn’t like the plot. The feedback I received was 3 to 1 in favor of doing both rather than just the refactoring. What do you think?

Here’s my take. I think recreating the plot has value even if I don’t like it. There are two main reasons. As an example, consider the stacked bar plot I discussed last week.

First, recreating a visualization forces me to do things that I wouldn’t normally do. For example, in last week’s visualization, the authors created what were effectively section headings within the legend for taxa that stained Gram-positive or Gram-negative. If I had gone directly to the line plots without recreating the stacked bar plot, I wouldn’t have had the chance to think about how to recreate that look. I think this blew people’s minds when I did it on the livestream.

The second reason is that recreating a visualization tells me a lot about the data and the approach the investigators took to creating the plot. For example, with the stacked bar plot, it was only by digging into the actual data that I could see there were other minor populations not shown in the legend. It also helped me notice they were doing weird things with pooling time points in Figure 5. Even if I never publish a stacked bar plot of my own data, I was able to learn a lot by recreating this one. Do you find my logic compelling? Let me know!

This leads in nicely to this week’s visualization. I’ll show my cards right away and tell you I’m not a fan of this set of panels from the paper, “Specialized RNA decay fine-tunes monogenic antigen expression in Trypanosoma brucei” published earlier this week in Nature Microbiology. I’ll say more on Monday, but it’s effectively a volcano plot in a polar coordinate system (WHY!?!?).

Even if I never use coord_radial() to circularize my data visualization, there are a number of things in this set of panels I’m curious how to achieve. For example, can I use geom_hline() to draw the dashed circle that indicates the significance threshold? Or could the arrow in the lower right corner of the panels be drawn using geom_segment() with an arrowhead?
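Here is a rough sketch of how those two ideas might be tried out. It is not the code behind the published figure. The data are simulated, the column names, the 0.05 threshold, and the arrow position are all made up for illustration, and it assumes ggplot2 3.5.0 or later so that coord_radial() is available.

```r
library(ggplot2)

# Simulated stand-in data; column names and values are hypothetical
set.seed(1)
volcano <- data.frame(
  log2_fc = rnorm(200),
  neg_log10_p = rexp(200, rate = 0.7)
)

# Hypothetical significance threshold (P = 0.05)
threshold <- -log10(0.05)

radial_volcano <- ggplot(volcano, aes(x = log2_fc, y = neg_log10_p)) +
  geom_point(alpha = 0.5) +
  # A line at constant y is drawn at constant radius, i.e. as a circle
  geom_hline(yintercept = threshold, linetype = "dashed") +
  # Holding x constant and varying y gives an arrow that points outward
  geom_segment(
    data = data.frame(x = 2, xend = 2, y = 3, yend = 5),
    aes(x = x, xend = xend, y = y, yend = yend),
    arrow = arrow(length = unit(0.1, "inches")),
    inherit.aes = FALSE
  ) +
  # y becomes the radius and x becomes the angle
  coord_radial()
```

Note that the segment lives in data coordinates, so a segment that changes x would bend into an arc. An arrow that sits in the corner of the panel, outside the circle, may need a different approach.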

The thing I’m most excited to figure out is the legend. You’ll notice that they have three colored circles under the title of “VEX known interactors” and the two other categories have a single circle. How would we recreate this look?

I think it’s actually a lot like the legend in last week’s visualization. As a reminder, here it is:

I recreated it by mapping a second aesthetic (e.g., alpha) in addition to fill. By using guide_legend() and its override.aes argument within scale_fill_manual() and scale_alpha_manual(), I could create two legends for the two types of Gram staining.
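A minimal sketch of that two-legend trick follows. It is one way the idea could be written, not necessarily the code from the livestream. The taxa, colors, and abundances are invented for illustration.

```r
library(ggplot2)

# Hypothetical taxa, grouped by Gram stain, with hypothetical colors
gram_pos <- c("Bacillus", "Clostridium")
gram_neg <- c("Escherichia", "Bacteroides")
taxon_colors <- c(
  Bacillus = "#1b9e77", Clostridium = "#d95f02",
  Escherichia = "#7570b3", Bacteroides = "#e7298a"
)

# Made-up relative abundances for two samples
abund <- data.frame(
  sample = rep(c("A", "B"), each = 4),
  taxon = rep(names(taxon_colors), times = 2),
  rel_abund = c(0.4, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3, 0.4)
)

stacked <- ggplot(abund, aes(x = sample, y = rel_abund,
                             fill = taxon, alpha = taxon)) +
  geom_col() +
  # The fill legend only lists the Gram-positive taxa
  scale_fill_manual(
    name = "Gram-positive",
    values = taxon_colors,
    breaks = gram_pos,
    guide = guide_legend(order = 1)
  ) +
  # Alpha is 1 for every taxon, so it changes nothing in the bars; it only
  # exists to produce a second legend, whose keys are recolored by override.aes
  scale_alpha_manual(
    name = "Gram-negative",
    values = setNames(rep(1, length(taxon_colors)), names(taxon_colors)),
    breaks = gram_neg,
    guide = guide_legend(
      order = 2,
      override.aes = list(fill = unname(taxon_colors[gram_neg]))
    )
  )
```

Because the two scales have different names, their legends stay separate, and each name reads like a section heading.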

For this week’s legend, imagine if we had taken the same legend we had last week and replaced the labels argument in scale_fill_manual() and scale_alpha_manual() with a blank string "". That would remove the labels. Now if we used legend.position = "bottom" within theme(), the legend would lie horizontally along the bottom of the plot, with each title on the left. To get the title on the right of each set of symbols, we could use legend.title.position = "right" within theme(). Cool, eh?

The other added wrinkle here is that we effectively have three legends. For the radial volcano plot, I’ll likely use the color, fill, and alpha aesthetics. What do you think?
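Here is a hedged sketch of how the three-legend idea could look. Only "VEX known interactors" comes from the figure. The other two legend titles, the group names, the colors, and the data are placeholders. To blank the labels I use a small helper, blank_labels(), that returns one empty string per break. That is my own assumption about a safe way to do it when breaks are also set. The legend.title.position setting assumes ggplot2 3.5.0 or later.

```r
library(ggplot2)

# Hypothetical groups and colors
groups <- c("vex_a", "vex_b", "vex_c", "enriched", "background")
group_colors <- c(
  vex_a = "#e41a1c", vex_b = "#377eb8", vex_c = "#4daf4a",
  enriched = "#984ea3", background = "grey60"
)

# Simulated data with 20 points per group
set.seed(2)
hits <- data.frame(
  log2_fc = rnorm(100),
  neg_log10_p = rexp(100),
  group = rep(groups, each = 20)
)

# Hypothetical helper: one empty label per legend key
blank_labels <- function(breaks) rep("", length(breaks))

legend_demo <- ggplot(hits, aes(x = log2_fc, y = neg_log10_p,
                                color = group, fill = group, alpha = group)) +
  geom_point() +
  # Legend 1: three colored circles
  scale_color_manual(
    name = "VEX known interactors", values = group_colors,
    breaks = c("vex_a", "vex_b", "vex_c"), labels = blank_labels,
    guide = guide_legend(order = 1)
  ) +
  # Legend 2: a single circle, recolored through override.aes
  scale_fill_manual(
    name = "Enriched", values = group_colors,
    breaks = "enriched", labels = blank_labels,
    guide = guide_legend(
      order = 2,
      override.aes = list(color = group_colors[["enriched"]])
    )
  ) +
  # Legend 3: alpha is 1 everywhere and only serves to add a third legend
  scale_alpha_manual(
    name = "Background", values = setNames(rep(1, length(groups)), groups),
    breaks = "background", labels = blank_labels,
    guide = guide_legend(
      order = 3,
      override.aes = list(color = group_colors[["background"]])
    )
  ) +
  theme(legend.position = "bottom", legend.title.position = "right")
```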

I’m not sure when I’d want to add this type of structure to my legends. But these types of recreation exercises allow me to do two things. First, they force me to do something that I wouldn’t naturally think of doing. If I jumped directly to the refactoring, this design element wouldn’t be necessary and I’d easily blow off doing it. Second, they force me to do some problem-solving using the powerful tools I have at my disposal. This reminds me of how I was bothered by a grid line bleeding through the gaps of a dashed line in a recent visualization. My solution was to put down a thick white line followed by a thin dashed line. I then noticed that the dashed line still bled through my jittered points because I was using alpha = 0.5. An insightful viewer suggested I repeat the approach I used with the line, this time using a layer of white points above the background lines but below the data I was trying to show. Exactly.
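The layering described above can be sketched roughly like this. The data, the line position, and the theme are assumptions for illustration. The one detail that matters is giving position_jitter() a seed so the white points and the translucent points land in exactly the same places.

```r
library(ggplot2)

# Simulated data; group names and values are hypothetical
set.seed(3)
obs <- data.frame(
  group = rep(c("control", "treated"), each = 30),
  value = c(rnorm(30, mean = 1), rnorm(30, mean = 2))
)

# A seeded jitter shared by both point layers keeps their positions identical
jitter <- position_jitter(width = 0.2, height = 0, seed = 42)

layered <- ggplot(obs, aes(x = group, y = value)) +
  # 1. Thick white line hides the grid line underneath
  geom_hline(yintercept = 1.5, color = "white", linewidth = 2) +
  # 2. Thin dashed line drawn on top of the white one
  geom_hline(yintercept = 1.5, linetype = "dashed", linewidth = 0.5) +
  # 3. Opaque white points mask the lines behind each data point
  geom_point(position = jitter, color = "white") +
  # 4. Translucent data points sit on top of the white mask
  geom_point(position = jitter, alpha = 0.5) +
  theme_bw()
```

Layers are drawn in the order they are added, so the order of the four geoms is what makes this work.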

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

minimalR Workshop

generalR Workshop

mothur Workshop

In case you missed it…

Here is a livestream that I published this week that relates to previous content from these newsletters. Enjoy!

Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
