Tracking trends in poverty with ggplot2


Hey folks,

I hope you all are doing well as we settle into 2025. Here in southeastern Michigan, winter has arrived in earnest. Our high temperatures have been below freezing for the past week and it looks like they will be for at least the next 10 days. Brrrr. Going outside reminds me to be grateful that our furnace works and that we’re able to keep it running. It’s easy to take that for granted. Perhaps this is what people had in mind when they made January “National Poverty in America Awareness Month”.

Looking through the US Census Bureau’s poverty report from last January, I was struck by their Figure 1. The data has a column for the year, the number of people in poverty, and the percentage of people in poverty.

We can see that the “War on Poverty”, which started in 1964, continued the downward trend in the rate of poverty. Unfortunately, the rate has oscillated between 10 and 15% for the past 50 years without a further reduction. Beyond the stories the figure tells, I thought there were a number of interesting components to this figure that would be fun to consider in R.

First, this is clearly a pair of line plots with the top panel showing the number of people and the bottom panel showing the fraction of people living in poverty by year. We could do this with geom_line() and facet_grid() or facet_wrap(). I commend the developers for not trying to represent this as a single plot with a double y-axis. Those are generally considered confusing, which makes them a bad practice.

Second, of course, the top and bottom panels are different colors. But if we’re going to use facet_grid() or facet_wrap() and map a single column to the y-axis then we would need to use pivot_longer() to get the number and rate of poverty to be in the same column. We’d then have a column indicating the “type” of poverty. Something I noticed when I looked closer at the plots was that the lines between 2013 and 2017 are a different shade of the same color. Looking at the footnotes we see that three different methodologies were used to measure the level of poverty. To get the different colors we could modify the type of poverty column to indicate the methodology and then use scale_color_manual() to group and color the lines according to those types and methods. Hopefully, that makes sense. We could also use patchwork, which might make things a bit simpler. Based on my exploration of both approaches this week I am more of a fan of using facet_*() when the two plots share an axis (see the links to the videos below).
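Here is a rough sketch of the facet approach, just to make the idea concrete. The numbers are synthetic (not the Census values), and the `method` column, the `series` column, and the hex colors are all names and choices I made up for illustration.

```r
library(ggplot2)
library(tidyr)

# Synthetic values for illustration only; these are NOT Census Bureau figures.
# `method` is a hypothetical column marking which methodology produced each row.
poverty <- data.frame(
  year   = 2010:2021,
  number = seq(46, 35, length.out = 12),    # millions of people (made up)
  rate   = seq(15, 11.7, length.out = 12),  # percent (made up)
  method = rep(c("original", "interim", "updated"), each = 4)
)

# Put the number and rate into a single `value` column, with `type`
# recording which of the two each row holds.
poverty_long <- pivot_longer(
  poverty,
  cols = c(number, rate),
  names_to = "type",
  values_to = "value"
)

# Combine type and methodology so each line segment gets its own group and color.
poverty_long$series <- paste(poverty_long$type, poverty_long$method, sep = "_")

# One hue per panel, with a lighter shade for the middle methodology.
series_colors <- c(
  number_original = "#1f4e79", number_interim = "#6fa8dc", number_updated = "#1f4e79",
  rate_original   = "#7f3b08", rate_interim   = "#f1a340", rate_updated   = "#7f3b08"
)

p <- ggplot(poverty_long, aes(x = year, y = value, color = series, group = series)) +
  geom_line() +
  facet_wrap(~type, ncol = 1, scales = "free_y") +
  scale_color_manual(values = series_colors, guide = "none")
```

Because each series is its own group, the lines in this toy version break where the methodology changes. If the real data report the transition years under both methods, those overlapping rows would let the segments meet.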

Third, I noticed that they highlighted when recessions have occurred for the past 65 years. I’m not sure what I’m supposed to see linking recessions to poverty, but they included the information. We can get those dates from Wikipedia. I’d likely generate those light blue rectangles using geom_polygon() or geom_area() - two geoms I’m not very familiar with! A subtle point I noticed was that the grid lines actually go over the top of the recession rectangles. This tells me that I’d likely need to use geom_hline() rather than the panel.grid.major.y argument from theme() to draw those; traditional grid lines controlled with theme() go behind the geoms.
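To show the layering idea, here is a minimal sketch. I've swapped in geom_rect() for the rectangles because it takes start and end values directly; geom_polygon() or geom_area() could be made to work too. The poverty rates are synthetic, the recession spans are rough hand-entered approximations that should be checked against a real source, and the data frame and column names are my own.

```r
library(ggplot2)

# Synthetic poverty rates for illustration only; NOT Census Bureau figures.
rates <- data.frame(year = 2005:2022)
rates$rate <- 12 + 2 * sin((rates$year - 2005) / 3)

# Approximate spans (in decimal years) for two recessions, entered by hand.
# Verify the exact dates before using them in a real figure.
recessions <- data.frame(
  start = c(2007.9, 2020.1),
  end   = c(2009.5, 2020.3)
)

p <- ggplot(rates, aes(x = year, y = rate)) +
  # Layer 1: rectangles spanning the full height of the panel
  geom_rect(
    data = recessions,
    aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf),
    inherit.aes = FALSE,
    fill = "lightblue"
  ) +
  # Layer 2: hand-drawn "grid lines" that sit on top of the rectangles
  geom_hline(yintercept = seq(10, 14, by = 2), color = "gray70") +
  # Layer 3: the data
  geom_line() +
  # Turn off the theme's own grid lines, which would be drawn behind everything
  theme(panel.grid = element_blank())
```

Layers are drawn in the order they are added, so putting geom_hline() after geom_rect() is what lets the lines cross over the shaded bands.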

Fourth, I like how the authors put the number and rate for 2022 in the right-hand margin of the plot. We could pull this off with geom_text() or annotate() along with coord_cartesian(clip = "off") to prevent the clipping of the text outside the plotting window. I notice they tell us what is being plotted in the middle of the panels and the units are on the top left of each panel. I probably would have combined those annotations to simplify things. If I recreated this figure with facets, I’d need to left justify the strip text in the theme function.
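A small sketch of the margin labels and left-justified strip text might look like this. The values are synthetic, the facet labels are my own wording, and the 40 pt right margin is a guess that would need tuning. I used geom_text() with a filtered data frame rather than annotate() so that each panel only gets its own label.

```r
library(ggplot2)

# Synthetic values for illustration only; NOT Census Bureau figures.
toy <- data.frame(
  year  = rep(2013:2022, times = 2),
  type  = rep(c("Number in poverty (millions)", "Poverty rate (percent)"), each = 10),
  value = c(seq(46, 37, length.out = 10), seq(14.5, 11.5, length.out = 10))
)

# One row per panel, holding the final year's value
last_year <- toy[toy$year == max(toy$year), ]

p <- ggplot(toy, aes(x = year, y = value)) +
  geom_line() +
  # x = Inf pins the label to the panel's right edge; a negative hjust
  # pushes the text out into the margin
  geom_text(data = last_year, aes(x = Inf, label = value), hjust = -0.2) +
  facet_wrap(~type, ncol = 1, scales = "free_y") +
  # Without clip = "off" the text outside the panel would be cut off
  coord_cartesian(clip = "off") +
  theme(
    plot.margin = margin(5.5, 40, 5.5, 5.5),  # extra room on the right for the labels
    strip.text  = element_text(hjust = 0)     # left-justify the facet strip text
  )
```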

Fifth, there are several things that we’ve seen recently that we could reimplement here. A few weeks ago we saw a legend with a single variable in it. This figure also has a single variable legend. I think it is useful here. Having the glyph for the fill color to indicate that the color represents when recessions occurred is pretty nice. Also, I’m pretty sure they’re using Libre Franklin as their sans serif font. We could use that font with help from the {showtext} package. Of course, we could also use {ggtext}’s element_textbox_simple() function to create a two line title that has two font faces and two colors. I love {ggtext}!

What don’t I like about this figure? Well, they have a broken y-axis for the number of people in poverty. I think that’s deceiving and not necessary. I’d either leave it “unbroken” or start the y-axis with 20 million. The only time you really need to include zero on the y-axis is for a bar plot. I also find it really challenging to vertically line things up between the two panels. I think vertical grid lines would make things busy. But what if we put major tick marks every five years and shorter tick marks on the individual years? Or what if we dropped them entirely? Finally, I notice that the last year on the x-axis is 2022, which makes sense because it’s the final year. But that creates a weird gap between 2015 and 2022. With longer tick marks every five years, I think it would then become obvious from the smaller tick marks that the plot ends in 2022. Plus it would be in the title.
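For the tick mark idea, one possible sketch uses the minor tick support in guide_axis(), which I believe requires {ggplot2} 3.5.0 or later. The data are synthetic and the 1959 to 2022 range is my assumption about the span of the x-axis.

```r
library(ggplot2)

# Synthetic values for illustration only; NOT Census Bureau figures.
# The 1959-2022 range is an assumption about what the x-axis covers.
toy <- data.frame(year = 1959:2022)
toy$rate <- 12 + 2 * sin((toy$year - 1959) / 5)

p <- ggplot(toy, aes(x = year, y = rate)) +
  geom_line() +
  scale_x_continuous(
    breaks       = seq(1960, 2020, by = 5),  # labeled major ticks every five years
    minor_breaks = 1959:2022,                # a minor tick for every individual year
    guide        = guide_axis(minor.ticks = TRUE)
  ) +
  # Drop the vertical grid lines so only the tick marks carry the year information
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  )
```

By default the minor ticks are drawn a bit shorter than the major ones. The last labeled tick lands on 2020, and the two minor ticks after it make it easier to see that the data run through 2022.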

Let me know what you like or don’t like about this figure! Looking at the other figures in the report, which would you like to see me recreate with {ggplot2}?

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials.

In case you missed it…

Here are some videos that I published this week that relate to previous content from these newsletters. Enjoy!


Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
