Pseudo-waffle plots from LA from the Washington Post


Hey folks!

Do you ever get that feeling where you’re scared to try something? But then you do it anyway… and it turns out way better than you expected? Well that was me on Wednesday morning. I ran my first livestream on YouTube recreating a ridgeline plot from Our World in Data showing the US baby boom. I wrote about it here in the newsletter back in May. The full session was about 2.5 hours. YouTube tells me that 272 people popped in at some point during the session. To be honest, I really only expected 2 or 3 people and that there would be times when no one would be watching me. Thanks to all who tuned in!

I would love to get feedback from anyone who was watching. Honestly, I’ve never watched a livestream before. If you know of any great livestreams, please send them my way so I can learn what makes them more effective. Something I already noticed was that I got a question about something I wasn’t planning on discussing - how to put a logo in the visual - and we spent some time doing that. It was scary to go off the script at the end there, but fun!

The next one will be on Wednesday, June 18th at 9:00 am (Eastern US). I plan on doing a makeover of this plot as a heatmap based on another heatmap I made showing deaths from drug overdoses.


It should surprise no one that Americans are divided on nearly every issue. Recently, protests in Los Angeles and the response from the Trump administration have been dominating the news. The Washington Post surveyed 1,000 people to gauge their opinion of the protests and the response (free version). In my opinion, the results were fairly predictable. Republicans support Trump and oppose the protestors. Democrats oppose Trump and support the protestors.

This article shows two types of plots: horizontal stacked bar charts and something like a waffle plot. They also share the free text responses from survey participants. Let’s start with the horizontal stacked bar charts.

I am sharing this plot because I want to highlight something good about it. They have three response categories: support, unsure, and oppose. They put the unsure category in the middle and the other categories on the left and right. This is ideal. Why? Well, this layout makes it much easier to compare the level of support because the support segments for all five groups are anchored on the left side of the plot. You don’t need to read the numbers to see that people in California support Trump less than those in other states. Similarly, the layout also makes it easy to see the level of opposition because the orange rectangles are anchored at the right side of the plot. The only category that’s hard to interpret is the unsure because it has no anchor point. At the same time, that category isn’t all that interesting.

Quickly, I have a few ideas of how I’d make this plot. I would use geom_col() with the percent on the x-axis and the category on the y-axis. Next, I’d remove all the axis information. Then I would embed the percentages with geom_text(). I’d create the spacing between the categories and label each category using facet_wrap() and left-justify the title for each facet. Finally, the legend would be moved to the top left using the legend position arguments in theme(). It would be fun to play with having different border and fill colors for each segment of the bars.
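Here is a minimal sketch of how those steps might fit together. The group names and percentages are made up for illustration (they are not the Post's survey results), and theme_void() is just one assumed way to strip the axis information.

```r
library(ggplot2)

# Hypothetical data for illustration only; not the Washington Post's numbers
survey <- data.frame(
  group = rep(c("All adults", "California", "Other states"), each = 3),
  response = rep(c("Support", "Unsure", "Oppose"), times = 3),
  percent = c(40, 15, 45, 30, 12, 58, 42, 16, 42)
)

# Fix the order of the facets and of the segments within each bar
survey$group <- factor(survey$group, levels = unique(survey$group))
survey$response <- factor(survey$response,
                          levels = c("Support", "Unsure", "Oppose"))

p <- ggplot(survey, aes(x = percent, y = group, fill = response)) +
  # reverse = TRUE puts the first level (Support) against the left edge
  geom_col(position = position_stack(reverse = TRUE), color = "white") +
  # Center each percentage label inside its segment
  geom_text(aes(label = paste0(percent, "%")),
            position = position_stack(vjust = 0.5, reverse = TRUE)) +
  # One facet per group gives the spacing and a title above each bar
  facet_wrap(vars(group), ncol = 1, scales = "free_y") +
  theme_void() +
  theme(
    strip.text = element_text(hjust = 0, face = "bold"),
    legend.position = "top",
    legend.justification = "left",
    legend.title = element_blank()
  )

p
```

Changing the color argument in geom_col(), or mapping it to response, would be one place to start playing with different border and fill colors.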

The waffle plot was what really caught my eye in this article.

It’s not quite a waffle plot because they pull apart the three categories rather than putting them together in a single grid. Perhaps we can think of it as three waffle plots. With that perspective, it’s an interesting challenge to think about how we’d create the final row for each category, which doesn’t always vertically line up with the rest of the points in the grid. I’d likely make each grid using geom_point() after figuring out how to create the x and y-axis positions for each point. Then the grids could be arrayed by again using facet_wrap(). I’d embed the category and the percent in the facet title. Because the category name is bolded and the percent is in a regular font, I’d likely use element_markdown() from the ggtext package to implement that.
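A rough sketch of that idea follows. The counts are hypothetical, waffle_positions() is a helper name invented for this example, and centering the partial final row is only a guess at how the Post handled it.

```r
library(ggplot2)
library(ggtext)

# Hypothetical counts out of 100 respondents; not the Post's numbers
shares <- data.frame(
  category = c("Support", "Unsure", "Oppose"),
  n = c(37, 19, 44)
)

# Hypothetical helper: x/y positions for n dots in a grid n_col wide,
# filled row by row from the top. A partial final row is optionally
# centered, so it does not line up with the columns above it.
waffle_positions <- function(n, n_col = 10, center_last = TRUE) {
  stopifnot(n >= 1)
  i <- seq_len(n) - 1
  row <- i %/% n_col
  col <- i %% n_col
  last <- row == max(row)
  n_last <- sum(last)
  if (center_last && n_last < n_col) {
    col[last] <- col[last] + (n_col - n_last) / 2
  }
  data.frame(x = col + 1, y = -row)
}

# One grid per category, with a markdown label: bold name, regular percent
dots <- do.call(rbind, lapply(seq_len(nrow(shares)), function(k) {
  cbind(
    waffle_positions(shares$n[k]),
    label = sprintf("**%s** %d%%", shares$category[k], shares$n[k])
  )
}))
dots$label <- factor(dots$label, levels = unique(dots$label))

p <- ggplot(dots, aes(x = x, y = y)) +
  geom_point(size = 3) +
  facet_wrap(vars(label), nrow = 1) +
  coord_equal() +
  theme_void() +
  # element_markdown() renders the ** ** in the facet titles as bold
  theme(strip.text = element_markdown(hjust = 0))

p
```

Setting center_last = FALSE left-aligns the final row instead, which makes it easy to compare the two looks.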

There are a few things I’m less sure of about this plot. First, I don’t know that I can recreate the bubble around the title. The element_textbox() function from ggtext allows you to have rounded corners around text. But this one has a 90 degree corner in the lower left corner of the title. I’ll have to think about how to implement that look. Second, the plotting symbols for both plots aren’t exactly circles. They look like hand drawn dots. I’m sure there’s a way to do something like this. After all, there are packages for adding emojis, cats, and even Bernie Sanders to scatter plots. Finally, the Washington Post included their logo in the lower left corner of the title. I did a bit of a hack in the livestream showing one way to add a logo using functions from the cowplot package. I didn’t really like that. Instead, I think I’d try again using functions from the ggimage package. Naturally, looking briefly at that package’s documentation, I see that it will let me import my own PNG files to make plotting symbols too. Looks like it’s time to dig into ggimage!
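As a starting point for the custom plotting symbols, here is a small sketch using geom_image() from ggimage. The PNG is a placeholder dot generated on the fly so the example is self-contained; a scan of a hand drawn dot would take its place. The grid and the size value are assumptions for illustration.

```r
library(ggplot2)
library(ggimage)

# Create a placeholder symbol as a PNG file. In practice this would be
# your own image, such as a hand drawn dot.
dot_png <- tempfile(fileext = ".png")
png(dot_png, width = 100, height = 100, bg = "transparent")
par(mar = c(0, 0, 0, 0))
plot.new()
points(0.5, 0.5, pch = 16, cex = 12, col = "darkorange")
dev.off()

# A small grid of points, each pointing at the same image file
grid_df <- expand.grid(x = 1:10, y = 1:3)
grid_df$image <- dot_png

p <- ggplot(grid_df, aes(x = x, y = y)) +
  # geom_image() draws the file named in the image aesthetic at each point
  geom_image(aes(image = image), size = 0.06) +
  coord_equal() +
  theme_void()

p
```

Because the image aesthetic is a column of file paths, different categories could point at different PNG files.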

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more

In case you missed it…

Here is a livestream that I published this week that relates to previous content from these newsletters. Enjoy!


Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
