em-BRACING ggplot2 to recreate historic data visualizations


Hey folks,

I can’t tell you how much I’ve enjoyed recreating the “data portraits” from the collection of visualizations that W.E.B. DuBois and his colleagues presented at the 1900 Paris Exposition. You can find the entire collection of “data portraits” in a book assembled by Whitney Battle-Baptiste and Britt Rusert (here) or as a collection of plates through the Library of Congress (here).

Perhaps this isn’t so obvious to my non-US readers and viewers, but February is Black History Month. In December or January, I had the idea to do a couple of visuals for February to honor DuBois, his colleagues, and other great Black scientists of yesterday and today. When Executive Orders from the Trump Administration started going off the rails, I doubled down on the DuBois recreation videos. When all is said and done, I’ll have recreated 8 of the ~60 visuals on YouTube. I’m grateful to Battle-Baptiste and Rusert, Anthony Starks, and Jason Forrest, who have helped popularize efforts to recreate these visuals with modern tooling. I really hope I’ve done the visualizations justice. Please make sure you watch the great presentation by Starks and Forrest that was posted to YouTube in 2021.

Frankly, I’m pretty amazed that I’ve been able to recreate these visuals using only the functions loaded with the tidyverse metapackage and the showtext package for loading fonts. Just because I’ve been able to recreate the visuals using two library() function calls doesn’t mean that they’ve been easy to recreate! These have been some of the more challenging data visualizations I’ve made. Although the visuals are quite different from what I make for my research, this process has really helped me hone my problem-solving skills when it comes to making these tools do things they weren’t designed for. Sadly, in this newsletter I’ll be describing the final data visualization that I’ll be including in this series.


Recreating fans, bullseyes, spirals, and other odd shapes in R has really taken a lot out of me! This week, I wanted to cover something I thought would be a little “simpler”. Check out this bar plot, which is Plate 9 from the collection.

Part of the goal of DuBois and his colleagues in going to Paris was to give their European audience context for the situation of Black Georgians and Americans in general. This visual shows the age distribution among Black Georgians relative to the French population. The French population was older than the Black Georgian population. Beyond the story, there are a few interesting things about this plot.

First, this is clearly a bar plot with the categories on the y-axis, the percent of the population on the x-axis, and the race/nationality used to set the color of the bars. This bar plot can be created using geom_col(). By default, geom_col() will create a stacked bar plot. So, using position = position_dodge() will set the two categories side by side for the same value on the y-axis. We can adjust the separation between the sets of bars using the width argument in geom_col().
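Here is a rough sketch of that first step, using the practice data from this newsletter. The object names (ages, bars, p), the width of 0.7, and the choice to reverse the category order so the youngest group lands at the top are my own assumptions for illustration, not a finished recreation of the plate.

```r
library(tidyverse)

# Practice data from this newsletter: percent of each population by age group
ages <- tribble(
  ~category,         ~negroes, ~france,
  "AGES,\nUNDER 10", 30.1,     17.5,
  "10-20",           26.1,     17.4,
  "20-30",           17.3,     16.3,
  "30-40",           10.6,     13.8,
  "40-50",            6.8,     12.3,
  "50-60",            4.6,     10.1,
  "60-70",            2.9,      7.6,
  "70 AND\nOVER",     1.6,      5
)

# One row per bar. Reversing the levels is an assumption so that the
# first age group is drawn at the top of the y-axis.
bars <- ages %>%
  pivot_longer(-category, names_to = "population", values_to = "percent") %>%
  mutate(category = factor(category, levels = rev(ages$category)))

# position_dodge() puts the two bars for an age group side by side;
# a width below 1 leaves a gap between neighboring age groups.
p <- ggplot(bars, aes(x = percent, y = category, fill = population)) +
  geom_col(position = position_dodge(), width = 0.7)

p
```

Try a few values of width to see how the gap between the pairs of bars changes.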

Second, instead of including an x-axis, the percentages are embedded in the bars. This can be done with geom_text(). The x-axis position can be set by taking half of the total percentage. Also, the y-axis position can be set by re-using position = position_dodge(). One quirky thing about the text in this plot is that the black bars have white text and the gold bars have black text. Of course, this can be set by mapping the race/nationality to the color aesthetic. We can still have a thin black line around the golden bars by using color = "black" in geom_col().
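A minimal sketch of the label step might look like the following. To keep it short I've used only three of the age groups from the practice data, already in long form. Which population gets the black bars and which gets the gold is an assumption here (check it against the plate), as are the object names and the dodge width of 0.8. Text has no width of its own, so I give position_dodge() the same width as the bars to keep the labels lined up with them.

```r
library(tidyverse)

# Three age groups from this newsletter's practice data, in long form
bars <- tribble(
  ~category, ~population, ~percent,
  "10-20",   "negroes",   26.1,
  "10-20",   "france",    17.4,
  "20-30",   "negroes",   17.3,
  "20-30",   "france",    16.3,
  "30-40",   "negroes",   10.6,
  "30-40",   "france",    13.8
)

# Share one dodge between the bars and the text so they stay aligned
dodge <- position_dodge(width = 0.8)

p <- ggplot(bars, aes(x = percent, y = category,
                      fill = population, group = population)) +
  # color = "black" outside aes() outlines every bar in black
  geom_col(position = dodge, width = 0.8, color = "black") +
  # Halving the percentage centers each label inside its bar
  geom_text(aes(x = percent / 2, label = percent, color = population),
            position = dodge, show.legend = FALSE) +
  # Assumed color assignment: swap these if the plate is the other way around
  scale_fill_manual(values = c(negroes = "black", france = "gold")) +
  scale_color_manual(values = c(negroes = "white", france = "black"))

p
```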

Third, instead of having the legend on the right as we are accustomed to with ggplot2, this legend is directly below the title. We can pull this off with the legend.position argument in theme(). I suspect we’ll have to use a number of legend-related arguments to get the legend key to be a rectangle rather than a square and to get the separation between the two categories.
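Here is one possible starting point for the legend, again with a small slice of the practice data. The title text is a placeholder rather than the plate's real title, and the key sizes and label margin are guesses that would need tuning by eye. I've used a right margin on the legend text to push the two entries apart; that is just one way to get the separation.

```r
library(tidyverse)

# Two age groups from this newsletter's practice data, in long form
bars <- tribble(
  ~category, ~population, ~percent,
  "10-20",   "negroes",   26.1,
  "10-20",   "france",    17.4,
  "20-30",   "negroes",   17.3,
  "20-30",   "france",    16.3
)

# Legend settings collected in one place; all sizes are assumptions
legend_theme <- theme(
  plot.title = element_text(hjust = 0.5),
  legend.position = "top",                 # sits between the title and the panel
  legend.title = element_blank(),
  legend.key.width = unit(1.5, "cm"),      # wider than tall gives a rectangle
  legend.key.height = unit(0.4, "cm"),
  legend.text = element_text(margin = margin(r = 20))  # space after each label
)

p <- ggplot(bars, aes(x = percent, y = category, fill = population)) +
  geom_col(position = position_dodge(), color = "black") +
  labs(title = "PLACEHOLDER TITLE") +      # hypothetical, not the plate's title
  legend_theme

p
```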

Finally, the hard part of this figure is the inclusion of the “{” to group the pairs of bars for each age group. We might be tempted to use geom_text() with a large size. But that will just make a big, thick “{” rather than a big, thin “{”. Because I can’t seem to go a week without converting between polar and Cartesian coordinates, can you see the circles - or parts of circles - in that character? The top part of the brace is the top left quadrant of a circle, the top middle part of the brace (or the “nose”) is the bottom right quadrant of a circle, the bottom middle part of the brace is the top right quadrant of a circle, and the bottom part of the brace is the bottom left quadrant of a circle. Seeing the brace as portions of a circle, we should be able to generate a function that will draw the brace for us. Then we can include the braces for each category. Clearly, I’m leaving out a lot of the details on this step! Think you can figure it out? :)
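If you'd like a nudge before trying it yourself, here is one way the four quadrants could be stitched together. The helper names arc() and make_brace() and their arguments are hypothetical, made up for this sketch. I've allowed separate x and y radii because the two axes of the bar plot are on very different scales, so in practice the "circles" may need to be ellipses. The sketch also assumes the brace is at least four y radii tall.

```r
library(tidyverse)

# Points along part of an ellipse centered at (cx, cy); angles in degrees
arc <- function(cx, cy, rx, ry, from, to, n = 25) {
  theta <- seq(from, to, length.out = n) * pi / 180
  tibble(x = cx + rx * cos(theta), y = cy + ry * sin(theta))
}

# Trace a "{" from its top end to its bottom end. The straight stretches
# of the stem appear when geom_path() connects one arc to the next.
make_brace <- function(x_stem, y_bottom, y_top, rx, ry) {
  y_mid <- (y_bottom + y_top) / 2
  bind_rows(
    arc(x_stem + rx, y_top - ry,    rx, ry,  90, 180),  # top: top left quadrant
    arc(x_stem - rx, y_mid + ry,    rx, ry,   0, -90),  # above nose: bottom right
    arc(x_stem - rx, y_mid - ry,    rx, ry,  90,   0),  # below nose: top right
    arc(x_stem + rx, y_bottom + ry, rx, ry, 180, 270)   # bottom: bottom left
  )
}

brace <- make_brace(x_stem = 0, y_bottom = 0, y_top = 8, rx = 1, ry = 1)

# coord_equal() keeps the quarter circles round in this standalone demo
ggplot(brace, aes(x = x, y = y)) +
  geom_path() +
  coord_equal()
```

To use it on the bar plot, you could call make_brace() once per age group, bind the results together with a grouping column, and add them with geom_path(). I'll leave the positioning to you.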

A number of DuBois’s other visualizations also use these braces, so I think it is worth learning how to use them. Of course, there’s a package that will do this for us, but where’s the adventure in that!? If you want some data to practice with here you go…


library(tidyverse)

# Percent of each population by age group
tribble(
  ~category,         ~negroes, ~france,
  "AGES,\nUNDER 10", 30.1,     17.5,
  "10-20",           26.1,     17.4,
  "20-30",           17.3,     16.3,
  "30-40",           10.6,     13.8,
  "40-50",            6.8,     12.3,
  "50-60",            4.6,     10.1,
  "60-70",            2.9,      7.6,
  "70 AND\nOVER",     1.6,      5
)

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

In case you missed it…

Here are some videos that I published this week that relate to previous content from these newsletters. Enjoy!


Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
