Hey folks,
I’m at the end of a day that followed an all-nighter spent trying to hit a grant proposal deadline. I don’t recall ever doing this in college, but I seem to pull an all-nighter every five years or so. I’m too old for this! Anyway, the proposal is in and now I’m ready to move on to fun things… like talking to you about visualizing data!
A few years back Whitney Battle-Baptiste and Britt Rusert put together an amazing collection of the visualizations that W.E.B. Du Bois presented at the 1900 Paris Exposition. The book is called “W.E.B. Du Bois’s Data Portraits: Visualizing Black America”. It’s probably the most enlightening $20 you’ll spend. You can find the originals here at the Library of Congress’s website. If you aren’t convinced, I’d encourage you to check out this video on the collection and an effort to recreate the figures using modern tools. I think this is the slide deck. Here’s the GitHub repository of Anthony Starks’s effort to recreate the visuals using a tool called …
You might look at the visuals and think… whaaa? But really, spend some time with them and learn about them. One of the things that impresses me about this collection is that the visuals were handmade. No R. No Python. No Excel. No Tableau. No fancy d3.js package. The artistry of these images and their unconventional approach to visualizing data add an intriguing layer to the story Du Bois was telling his French audience about the plight of African Americans in the US in 1900.
Consider this figure, which is plate 12 from the collection. If you’re like me, it might take you a minute to wrap your head around this simple but profound figure. The plot shows that between 0.7 and 1.7% of African Americans living in the US from 1790 to 1860 were free; the rest were enslaved. By 1870 they were all freed. I like how the red representing the free individuals overtakes the darkness of slavery. Perhaps we could read more into the symbolism, but I’ll hold off for now.
Let’s think about how we might make this in R!
First, this is an area plot. It also appears to have a white line separating the black and red areas. We could likely generate the basics of the plot using the …
Second, I notice that the left side of the x-axis is not so much broken as “ripped”. Doing some digging into …
Third, I notice the x-axis labels are at the top of the plot, running from 3% on the left to 0% on the right. We saw in a recent video that …
Fourth, the years are listed on the left y-axis and the “Percent of free negroes” is listed on the right axis. I can think of two ways to do this. First, the years can be included using …
Finally, Anthony Starks developed a Du Bois-ian style guide. If you download that PDF and look at the end, you’ll find a set of suggested fonts and hex codes for the colors that Du Bois used. Du Bois of course used his own hand to write the text in this figure, but a special DU BOIS font has been created by Vocal Type. It’s a bit pricey to download a desktop version of the font, but maybe they won’t mind me using the trial version to make some figures for you all? Alternatively, Google Fonts doesn’t have a good match, but perhaps “Roboto Mono” would be OK.
Try your hand at generating this figure on your own in R. If you’re feeling adventurous and want to represent the same data differently, check out Panel 51. Here’s some data…
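
If you want a starting point, here is a rough sketch of the basic layout described above: two filled areas, a white dividing line, percentages along the top running from 3% down to 0%, years on the left, and the percent free on the right. Everything in it is an assumption on my part rather than a finished recreation. I’m assuming ggplot2, the data frame and column names (`free`, `pct_free`, `pct_shown`) are made up for illustration, the percentages are placeholders that merely sit in the 0.7 to 1.7% range mentioned above (they are not Du Bois’s numbers), and the colors are stand-ins for the style guide’s hex codes. It doesn’t attempt the “ripped” axis or the fonts.

```r
library(ggplot2)

# PLACEHOLDER percentages for illustration only; these are NOT Du Bois's values.
# They stay within the 0.7-1.7% range described in the text, with 100% in 1870.
free <- data.frame(
  year     = seq(1790, 1870, by = 10),
  pct_free = c(1.2, 1.7, 1.5, 1.2, 0.9, 0.7, 1.0, 1.1, 100)
)

# The percent axis stops at 3%, so cap the 1870 value at the axis maximum.
axis_max <- 3
free$pct_shown <- pmin(free$pct_free, axis_max)

p <- ggplot(free, aes(y = year)) +
  # Enslaved share: from the percent free out to the 3% edge of the axis
  geom_ribbon(aes(xmin = pct_shown, xmax = axis_max),
              orientation = "y", fill = "black") +
  # Free share: from 0% up to the percent free ("firebrick" is a stand-in color)
  geom_ribbon(aes(xmin = 0, xmax = pct_shown),
              orientation = "y", fill = "firebrick") +
  # White line separating the two areas, drawn in the order of the rows (by year)
  geom_path(aes(x = pct_shown), color = "white") +
  # Percent axis on top, reversed so 3% is on the left and 0% on the right
  scale_x_reverse(
    position = "top",
    breaks = 0:3,
    labels = function(x) paste0(x, "%"),
    expand = c(0, 0),
    name = NULL
  ) +
  # Years run down the left side; a duplicated axis on the right carries
  # the percent free for each year
  scale_y_reverse(
    breaks = free$year,
    expand = c(0, 0),
    name = NULL,
    sec.axis = dup_axis(
      labels = paste0(free$pct_free, "%"),
      name = "Percent of free negroes"
    )
  ) +
  theme_minimal()

p
```

Capping the 1870 value at 3% is one way to let the red area sweep across the full width of the panel in the final year without stretching the axis to 100%. The duplicated axis is only one of the possible ways to get labels on both sides.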