Flooding you with opportunities to practice your data viz skills in R


Hey folks,

I’m really enjoying sharing with you my 30,000-foot view of how I would go about making figures that I find in the “wild”. Following up on these emails with a couple of related YouTube videos has been a lot of fun for me. Of course, if you find any figures you like, send them my way - I love seeing what interests you all.

I was reminded recently though that not everyone feels enough confidence with their R and tidyverse skills to keep up. Sorry! Towards the bottom of this email I always include links to information about my workshops where I do deep dives on the tidyverse. These newsletters and videos are geared for people who have taken a workshop from me or someone else and want to take the next steps in learning R. You can also find the materials I teach from for free here and here. As you likely know by now, in these newsletters and videos, I talk about obscure arguments and try to demonstrate the more fundamental functions in different contexts.

I hope you’re enjoying these as much as I am!

~~~

Earlier this fall there were a number of hurricanes and storms that whipped through the eastern United States causing pretty historic levels of flooding. Towards the end of October, Christopher Flavelle of the New York Times posted a daily newsletter briefing where he talked about “America’s Flooding Problem”. The newsletter had a number of amazing pictures. It also had this figure showing the rise in flooding events over the past 25 years.

What does this figure make you think about? What elements helped Flavelle tell his story about the rise in flooding? If you were to try to recreate this in R, how would you get started? What elements stick out to you as being atypical for R plots? What would you struggle to reimplement? Think about these questions before you read further.

I really like the simple design of this figure. They’ve done a good job of stripping out a lot of unnecessary distractions. I like how the 2024 bar is a more saturated blue color than the preceding years and that they annotate the bar with colored text that tells the story. I started wondering what happened in 2019 to have more than 50 flooding-related disaster declarations. I wondered what the current number is for 2024 two months later. I think an effective visualization does a good job of answering a single question and causes the viewer to ask more questions. This plot puts you in the story they are trying to tell.

OK, how would we make this in R?

First off, it’s a bar plot. I’ll put some “guesstimate” data below that should allow us to roughly reproduce their figure. The data frame has columns year and declarations, which will be mapped to the x and y aesthetics, respectively. I’d use geom_col() to generate the bar plot. It also has a title and caption that we could set with labs().
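Here’s a minimal sketch of that first step, assuming the guesstimate data from the end of this newsletter. The title and caption strings are placeholders that I made up, so swap in the wording from the original figure.

```r
library(tidyverse)

# Guesstimate data, copied from the end of this newsletter
flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
)

bar_plot <- ggplot(flooding, aes(x = year, y = declarations)) +
  geom_col() +
  labs(
    title = "Placeholder title",      # placeholder: use the figure's real title
    caption = "Placeholder caption",  # placeholder: use the figure's source note
    x = NULL,                         # drop the axis titles
    y = NULL
  )

bar_plot
```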

Second, the use of color is pretty effective. But we don’t have a column to map to the fill aesthetic. To pull this off, I’d create a highlight column that would be TRUE for 2024 and FALSE for every other year. Then I’d map highlight to the fill aesthetic, since fill (not color) controls the inside of the bars. Of course, the default colors won’t match so I’d need to use scale_fill_manual() to alter the colors. I’d use a color dropper tool to figure out the precise RGB codes for the two colors. I won’t need the legend so I’d drop the legend using show.legend = FALSE in geom_col().
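A sketch of the highlight idea might look like this. Note that the column is mapped to fill here, because scale_fill_manual() acts on the fill of the bars. The two hex codes are placeholders I picked, not the values from the NYT figure, so you’d replace them with whatever the color dropper gives you.

```r
library(tidyverse)

flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
) %>%
  mutate(highlight = year == 2024)  # TRUE for 2024, FALSE for every other year

highlight_plot <- ggplot(flooding,
                         aes(x = year, y = declarations, fill = highlight)) +
  geom_col(show.legend = FALSE) +   # no legend needed for the highlight
  scale_fill_manual(
    # placeholder colors: a pale blue and a more saturated blue
    values = c("FALSE" = "#A9C4DE", "TRUE" = "#2B6CB0")
  )

highlight_plot
```

Naming the values "FALSE" and "TRUE" ties each color to a level explicitly, so the order of the vector doesn’t matter.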

Third, the text annotation is pretty slick. I’d likely use geom_richtext() or geom_textbox() from the awesome {ggtext} package. These are a lot like geom_text() and geom_label() but they allow you to include HTML and Markdown to alter the appearance of the text. The “66 declarations” would need to be bolded and turned to the same shade of blue as the bar while the “this year so far” would be black and in a plain font face. I’m not sure which function I’d prefer, so that might require some experimentation. I’d add a short line segment connecting the text to the bar using geom_segment().
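Here’s one way the annotation could be sketched with geom_richtext(). The label position, the segment endpoints, and the hex codes are all guesses on my part, so expect to nudge them. geom_textbox() would be worth trying in its place.

```r
library(tidyverse)
library(ggtext)

highlight_blue <- "#2B6CB0"  # placeholder: match this to the 2024 bar

flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
) %>%
  mutate(highlight = year == 2024)

# One-row data frame for the annotation; the x/y position is a guess
note <- tibble(
  year = 2021.5,
  declarations = 66,
  label = paste0(
    "<span style='color:", highlight_blue, "'>**66 declarations**</span>",
    "<br>this year so far"
  )
)

annotated_plot <- ggplot(flooding, aes(x = year, y = declarations)) +
  geom_col(aes(fill = highlight), show.legend = FALSE) +
  scale_fill_manual(values = c("FALSE" = "#A9C4DE", "TRUE" = highlight_blue)) +
  geom_richtext(
    data = note, aes(label = label),
    hjust = 1, vjust = 1,         # anchor the text by its top-right corner
    fill = NA, label.color = NA   # no background box or outline
  ) +
  geom_segment(
    # short connector from the text toward the 2024 bar; endpoints are guesses
    data = tibble(x = 2021.7, xend = 2023.5, y = 64),
    aes(x = x, xend = xend, y = y, yend = y),
    inherit.aes = FALSE, color = highlight_blue
  )

annotated_plot
```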

Fourth, we’ve seen those y-axis labels in a previous figure. The plot includes the major grid lines with the value of the grid line sitting on the line. The top grid line also indicates what’s being measured on the y-axis. As I’ve done in the past, I could remove all of the y-axis ornamentation and add the values to the grid lines with annotate(). Using annotate() would probably be easier than using scale_y_continuous() because I’d have to mess with the margin, hjust, and vjust parameters in axis.text.y for the theme() function and I’d have to remove the label for the 0 value in scale_y_continuous(). Yeah, annotate() is looking pretty good here.
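The annotate() approach could be sketched like this. I’m assuming grid lines at 20, 40, and 60 and a top label that reads “60 declarations”. Both are guesses, so check them against the original figure.

```r
library(tidyverse)

flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
)

grid_values <- c(20, 40, 60)                     # assumed grid line positions
grid_labels <- c("20", "40", "60 declarations")  # assumed wording of the top label

axis_plot <- ggplot(flooding, aes(x = year, y = declarations)) +
  geom_col() +
  annotate(
    "text",
    x = 1999.5, y = grid_values, label = grid_labels,
    hjust = 0, vjust = -0.4,  # left-aligned and nudged up to sit on the line
    size = 3, color = "gray40"
  ) +
  scale_y_continuous(breaks = grid_values) +  # grid lines only where we label them
  labs(x = NULL, y = NULL) +
  theme(
    axis.text.y = element_blank(),   # remove the default y-axis ornamentation
    axis.ticks.y = element_blank(),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank()
  )

axis_plot
```

Because the breaks leave out 0, there’s no label for the 0 value to remove.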

Fifth, all of the lines in the figure - the grid lines, x-axis line, x-axis tick marks - are a subtle light gray. They’re also all the same thickness. We would be able to modify their appearance using element_line() in the theme() function. Because all of the line elements look the same, I wonder if I could just use line = element_line(color = "lightgray") rather than drilling down to the individual line elements in the theme.
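Here’s a sketch for testing that hunch, assuming theme_minimal() as the starting point (my choice, not something from the figure). Line elements that already carry their own color in the base theme may not pick up a color set on the root line element, so calc_element() is handy for seeing what each approach resolves to.

```r
library(tidyverse)

flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
)

base_plot <- ggplot(flooding, aes(x = year, y = declarations)) +
  geom_col()

# Option 1: set the root `line` element and let the others inherit from it
root_theme <- theme_minimal() +
  theme(line = element_line(color = "lightgray"))

# Option 2: drill down to the individual line elements
specific_theme <- theme_minimal() +
  theme(
    panel.grid.major.y = element_line(color = "lightgray"),
    axis.line.x = element_line(color = "lightgray"),
    axis.ticks.x = element_line(color = "lightgray")
  )

# calc_element() resolves theme inheritance; print both to compare the results
calc_element("panel.grid.major.y", root_theme)
calc_element("panel.grid.major.y", specific_theme)

base_plot + specific_theme
```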

Finally, because I can’t help myself, I’d want to play with matching the fonts. The title is a serif font and the other text is sans serif. By highlighting the text in the article, I see that the serif font is NYT’s own Cheltenham font whereas the sans serif font is their version of Franklin. Doing some sleuthing in Google Fonts, it looks like Domine and Libre Franklin would be good stand-ins. We can implement these fonts with tools from the {showtext} package.
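A sketch of the font step with {showtext} might look like this. The family nicknames ("domine" and "franklin") are names I chose, the title and caption are placeholders, and font_add_google() needs an internet connection to download the fonts.

```r
library(tidyverse)
library(showtext)

# Download the stand-in fonts from Google Fonts (requires internet access)
font_add_google("Domine", family = "domine")
font_add_google("Libre Franklin", family = "franklin")
showtext_auto()  # have showtext render text in plots from here on

flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
)

font_plot <- ggplot(flooding, aes(x = year, y = declarations)) +
  geom_col() +
  labs(
    title = "Placeholder title",      # placeholder text
    caption = "Placeholder caption",  # placeholder text
    x = NULL, y = NULL
  ) +
  theme(
    text = element_text(family = "franklin"),      # sans serif for most text
    plot.title = element_text(family = "domine")   # serif for the title
  )

font_plot
```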

Here’s some data to play with. I’ve done my best to match the values from the figure, but it probably isn’t perfect…


library(tidyverse)

# Approximate values read off the NYT figure
flooding <- tibble(
  year = 2000:2024,
  declarations = c(2, 5, 4, 0, 1, 2, 2, 1, 3, 3, 9,
                   25, 3, 17, 10, 9, 18, 12, 17, 53,
                   21, 10, 14, 42, 66)
)

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

In case you missed it…

Here are some videos that I published this week that relate to previous content from these newsletters. Enjoy!


Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
