Data visualization as a vaccination against ignorance


Hey folks,

I hope you’ve noticed that this newsletter and the YouTube channel have nearly caught up. At this point there’s a 10-day lag between when I post a newsletter describing a data visualization and when I post the recreation video. I could possibly push that to a 3-day lag, but I’d like people to have a chance to work through the code on their own before I give my solution. After having existential dread last week that I’d never find another good plot to share, it appears my cup runneth over :) I’m pretty excited to share what I’ve been collecting!

As I mentioned about a month ago, many in the US are bracing themselves for the prospect of Robert Kennedy serving as Secretary of Health and Human Services. He’s been an outspoken opponent of vaccines. Combined with what many feel is an increase in “anti-vax” sentiment, there are fears that old diseases like measles, polio, or whooping cough might make a comeback.

This week, Francesca Paris, a data reporter at the New York Times, published an article about vaccination rates across the US and how they’ve been falling since the “COVID-times”. Here’s the first graph in her article.

The take-home message from this figure is that vaccination levels for measles, polio, and whooping cough hovered around 95% until the pandemic started. Then the rates declined. I’m sure there are many reasons why this happened and the article shares a few. For my family, I know it was nearly impossible to get our kids into the doctor’s office for well-child visits. Then no one alerted us that it was OK to come back. Then our providers dropped us because they hadn’t seen us in too long. Our story wasn’t unique. Combined with anti-science political winds around the COVID vaccine, it was a perfect storm for declining rates of vaccination.

Line plots are common in many fields and I was excited to see an attractive line plot that told an interesting story. What do you think of the plot? I’m repeating myself, but I really like the aesthetics of the NYT data journalism products. They’re very easy on the eyes. They use color for emphasis. The landmarks for the figure often fade into the background. There are good annotations telling you what is going on. This one is no exception. If you want to try to recreate this figure or any of the figures in the article, you can get the raw data from the CDC as an MS Excel spreadsheet through their website. A few things stood out to me about this figure.

First, it’s a line plot. We could draw the lines by mapping the school year to the x-axis, the national vaccination rate to the y-axis, and the disease being vaccinated against to color, and then create the lines with geom_line(). What’s interesting to me about this particular line plot is that they have indicated where they have data by including a circle plotting symbol. If you look closely, there is a space between each line segment and the plotting symbol. When I see these types of plots, I rarely see that type of spacing. I would create this effect by using plotting symbol 21, which is a filled circle with a border. Instead of using a black border, I’d use a white border to give the effect of the spacing. The points could be added by running geom_point() after geom_line().
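Here’s a minimal sketch of how the white-bordered points might work. The numbers in `vax` are made-up placeholders, not the CDC values, and the column names, sizes, and stroke width are my own guesses rather than anything taken from the NYT figure.

```r
library(ggplot2)

# Placeholder values for illustration only; these are NOT the CDC numbers
vax <- data.frame(
  school_year = rep(2018:2023, times = 3),
  disease = rep(c("Measles", "Polio", "Whooping cough"), each = 6),
  rate = c(94.9, 95.0, 94.2, 93.5, 93.1, 92.8,
           95.0, 95.1, 94.1, 93.4, 93.2, 92.7,
           94.8, 94.9, 93.9, 93.1, 92.9, 92.4)
)

p <- ggplot(vax, aes(x = school_year, y = rate, color = disease)) +
  # linewidth needs ggplot2 3.4.0 or later (older versions use size)
  geom_line(linewidth = 1) +
  # Shape 21 has an interior (fill) and a border (color). A thick white
  # border hides the line right around each point, which fakes the gap.
  geom_point(aes(fill = disease), shape = 21, color = "white",
             size = 3, stroke = 1.5, show.legend = FALSE)

p
```

The order matters here: because geom_point() comes after geom_line(), the white border is drawn on top of the line. The gap only looks like a gap if the border matches the panel background, so this trick assumes a white background.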

Second, instead of a separate legend, they label the lines directly. The labels for each of the diseases being vaccinated against are near the ends of the lines and are the same color as the line. Because the ends of the lines for measles and polio overlap, they moved the measles label up and connected it to its line with a line that has a 90-degree turn in it. I think all of this is pretty slick. I’d likely create the labels using geom_text(). The line for measles could probably be created using annotate() with geom = "segment". We might also be able to create it using a function from {ggrepel}. Of course, I’m more familiar with annotate() and so getting something to work might be easier than figuring out how to use something from {ggrepel}.
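A rough sketch of the direct labels and the elbow connector might look like this. Again, `vax` holds made-up placeholder values rather than the CDC data, and the colors, the label offsets, and the connector coordinates are all hand-picked assumptions for the sake of illustration.

```r
library(ggplot2)

# Placeholder values for illustration only; these are NOT the CDC numbers
vax <- data.frame(
  school_year = rep(2018:2023, times = 3),
  disease = rep(c("Measles", "Polio", "Whooping cough"), each = 6),
  rate = c(94.9, 95.0, 94.2, 93.5, 93.1, 92.8,
           95.0, 95.1, 94.1, 93.4, 93.2, 92.7,
           94.8, 94.9, 93.9, 93.1, 92.9, 92.4)
)

# Hypothetical colors; the NYT palette would need to be matched by eye
disease_colors <- c("Measles" = "#d95f02", "Polio" = "#7570b3",
                    "Whooping cough" = "#1b9e77")

# One row per disease at the final school year, used to place the labels
labels_df <- vax[vax$school_year == max(vax$school_year), ]
labels_df$label_y <- labels_df$rate
# Move the measles label up by hand so it clears the polio label
labels_df$label_y[labels_df$disease == "Measles"] <- 93.4

p <- ggplot(vax, aes(x = school_year, y = rate, color = disease)) +
  geom_line(linewidth = 1) +
  # Left-justified labels just past the end of each line
  geom_text(data = labels_df,
            aes(x = school_year + 0.3, y = label_y, label = disease),
            hjust = 0) +
  # Two segments make the 90-degree connector: up from the end of the
  # measles line, then across to its label
  annotate("segment", x = 2023, xend = 2023, y = 92.85, yend = 93.4,
           color = disease_colors[["Measles"]]) +
  annotate("segment", x = 2023, xend = 2023.25, y = 93.4, yend = 93.4,
           color = disease_colors[["Measles"]]) +
  scale_color_manual(values = disease_colors) +
  # Extra room on the right so the labels are not cut off
  scale_x_continuous(expand = expansion(mult = c(0.05, 0.35))) +
  theme(legend.position = "none")

p
```

Because geom_text() inherits the color mapping, each label picks up the color of its line without any extra work. The connector is drawn with annotate(), so its color has to be set by hand.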

Third, in typical NYT fashion, they don’t have a y-axis line and they have the y-axis values on the corresponding grid line with the unit for the axis next to the top number (e.g., 95%). Something we could debate is that the range goes from 91 to 95% rather than from 0 to 95 or 100%. By narrowing the range, we see the important small changes in the data. The downside is that we lose the sense that the changes are quite small albeit in the same direction. An alternative would be to instead have a y-axis indicating the difference from the “Federal Measles Target”. Then we’d have an axis from about -3.0 to 0.5. That might be too abstract for a lay audience. But if you’re a “must include zero purist”, that’s how I would do it. We’ve seen how to create our own y-axis using annotate() in recent videos.
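To make the alternative axis concrete, here’s a small sketch that plots the difference from the 95% target instead of the raw rate. The values in `vax` are made-up placeholders, not the CDC numbers, and the `diff_from_target` column name and the "0 pts" label are my own choices.

```r
library(ggplot2)

# Placeholder values for illustration only; these are NOT the CDC numbers
vax <- data.frame(
  school_year = rep(2018:2023, times = 3),
  disease = rep(c("Measles", "Polio", "Whooping cough"), each = 6),
  rate = c(94.9, 95.0, 94.2, 93.5, 93.1, 92.8,
           95.0, 95.1, 94.1, 93.4, 93.2, 92.7,
           94.8, 94.9, 93.9, 93.1, 92.9, 92.4)
)

target <- 95  # the 95% target level shown in the figure
vax$diff_from_target <- vax$rate - target

p <- ggplot(vax, aes(x = school_year, y = diff_from_target,
                     color = disease)) +
  # The target itself now sits at zero
  geom_hline(yintercept = 0) +
  geom_line(linewidth = 1) +
  # Unit only on the top label, mimicking the NYT axis style
  scale_y_continuous(limits = c(-3, 0.5),
                     breaks = seq(-3, 0, by = 1),
                     labels = c("-3", "-2", "-1", "0 pts"))

p
```

With this version the axis includes zero, but a reader has to do the arithmetic to get back to the actual vaccination rate, which is the trade-off I mentioned above.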

Fourth, I really like how they have indicated the target level of vaccination. Again, all the grid lines are a light gray color that really blends into the background. But the line at 95% is solid and black. That the label is in all capital letters drives home the message that this is the desired threshold. Because one line is different from the others, I’d likely create the black line using geom_hline() and the other grid lines using theme(). The text labelling the line is right justified (hjust = 1) and could be included using annotate() with geom = "text".
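Here’s how the target line, the faint grid lines, and the label might fit together. This is a sketch under assumptions: `vax` holds made-up placeholder values rather than the CDC data, and the gray shade, text size, and exact label wording are guesses on my part.

```r
library(ggplot2)

# Placeholder values for illustration only; these are NOT the CDC numbers
vax <- data.frame(
  school_year = rep(2018:2023, times = 3),
  disease = rep(c("Measles", "Polio", "Whooping cough"), each = 6),
  rate = c(94.9, 95.0, 94.2, 93.5, 93.1, 92.8,
           95.0, 95.1, 94.1, 93.4, 93.2, 92.7,
           94.8, 94.9, 93.9, 93.1, 92.9, 92.4)
)

p <- ggplot(vax, aes(x = school_year, y = rate, color = disease)) +
  # Solid black line at the target, drawn first so the data sit on top
  geom_hline(yintercept = 95, color = "black") +
  geom_line(linewidth = 1) +
  # Right-justified, all-caps label sitting just above the target line
  annotate("text", x = 2023, y = 95, label = "FEDERAL MEASLES TARGET",
           hjust = 1, vjust = -0.5, size = 3) +
  # Unit only next to the top number, as in the NYT figure
  scale_y_continuous(limits = c(91, 95.3), breaks = 91:95,
                     labels = c("91", "92", "93", "94", "95%")) +
  theme(
    panel.background = element_blank(),
    # Light gray horizontal grid lines that fade into the background
    panel.grid.major.y = element_line(color = "gray90"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    axis.line = element_blank(),
    axis.ticks.y = element_blank()
  )

p
```

The gray grid line at 95 is still there, but geom_hline() draws the black line right on top of it.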

Finally, earlier this week I posted a video recreating a bar plot from another NYT article. The “Libre Franklin” sans serif font is still on my mind and I’d like another try at using {showtext} to use that font from Google Fonts. As far as I can tell, all the text in this figure uses the Franklin font.
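For reference, this is roughly how I’d expect the {showtext} setup to go. The family nickname "franklin" is arbitrary, the fallback to the default sans family when the download fails is my own addition, and the data are made-up placeholders rather than the CDC numbers.

```r
library(ggplot2)
library(showtext)

# Try to download Libre Franklin from Google Fonts. If that fails (for
# example, no internet connection), fall back to the default sans family.
plot_family <- tryCatch(
  {
    font_add_google(name = "Libre Franklin", family = "franklin")
    "franklin"
  },
  error = function(e) "sans"
)

# Tell graphics devices to render text with showtext from here on
showtext_auto()

# Placeholder values for illustration only; these are NOT the CDC numbers
vax <- data.frame(
  school_year = 2018:2023,
  rate = c(94.9, 95.0, 94.2, 93.5, 93.1, 92.8)
)

p <- ggplot(vax, aes(x = school_year, y = rate)) +
  geom_line() +
  # Text drawn by geoms does not inherit the theme font, so set it here too
  annotate("text", x = 2023, y = 93.2, label = "Measles",
           hjust = 1, family = plot_family) +
  # Axis text, titles, and other theme text
  theme(text = element_text(family = plot_family))

p
```

The thing that trips me up most often is that theme() only controls the axis and title text. Anything added with geom_text() or annotate() needs its own family argument.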

If you scroll further down the article, you’ll see a few other plots are included. Which is your favorite? Which would you most like me to make a video of? Reply to this email and let me know!

For now, I’m leaning towards recreating this figure. We’ve done several of this type of plot in the past, but this has some unique features. See if you can think through how to recreate this figure on your own!

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

In case you missed it…

Here are some videos that I published this week that relate to previous content from these newsletters. Enjoy!


Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
