Plotting the US job creation numbers (and revisions) with ggplot2


Hey folks,

Are you interested in upping your data visualization skills? I’m rolling out a new program to help you improve the design of your data visualizations. This program will last 5 weeks starting at the beginning of September. Each session will be two hours long and include a discussion of data visualization principles followed by an opportunity to apply these ideas to your own visualizations. There will be no coding in this program so you can focus more on concepts than implementation. I believe that once you understand the concepts, you can use any tool - even a pencil and piece of paper - to implement your design.


In recent weeks there’s been a kerfuffle about data coming out of the government. Specifically, the US Bureau of Labor Statistics (BLS) revised its estimates of the number of jobs created in May and June 2025 downward by about 125,000 jobs per month. This led President Trump to fire the Commissioner of the BLS. Obviously, it’s critical to have trustworthy data from the government to understand the state of the economy, assess the effectiveness of programs, and track the overall health of society. Political firings of people who are following standard operating procedures are jarring.

Beyond the politics, the NY Times again had a visualization that caught my attention. There are a few things in this figure that I am curious to try out in R with {ggplot2} and the rest of the tidyverse. You can get the data from the first link on this page from the BLS.

First, this plot is a stacked bar chart. I really like the use of negative space for the number of jobs that were overprojected in May and June. I’d start by making sure the data are in a tidy format with a column for the date (the first of each month between July 2024 and July 2025), a column for the number of jobs, and a column to indicate whether the number of jobs is from the firm estimate or the overprojection. With this structure, we could use ggplot() with geom_col() and map the date to the x-axis, the number of jobs to the y-axis, and the indicator column to the fill aesthetic. We might need to play with factors or the position argument to get the ordering of the indicator column correct for May and June 2025.
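Here’s a minimal sketch of that starting point. The numbers are made-up placeholders (in thousands of jobs), not the real BLS values, and the column names date, type, and jobs are my own assumptions. I’ve only included a few months to keep it short.

```r
library(tibble)
library(ggplot2)

# Placeholder values in thousands of jobs; NOT the real BLS figures
jobs <- tribble(
  ~date,        ~type,            ~jobs,
  "2025-04-01", "firm",           150,
  "2025-05-01", "firm",            20,
  "2025-05-01", "overprojection", 125,
  "2025-06-01", "firm",            15,
  "2025-06-01", "overprojection", 130,
  "2025-07-01", "firm",            70
)

jobs$date <- as.Date(jobs$date)

# By default the first factor level ends up on top of the stack, so
# listing "overprojection" first puts it above the firm estimate
jobs$type <- factor(jobs$type, levels = c("overprojection", "firm"))

p <- ggplot(jobs, aes(x = date, y = jobs, fill = type)) +
  geom_col()

# print(p) to draw the figure
```

If the overprojection ends up underneath the firm estimate, swapping the order of the factor levels should flip the stack.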

Second, I really liked the use of color. For the preceding months and years the firm numbers are in gray while the projected extra numbers are white. But for July 2025 the number is in orange. This would mean that we need three values in the indicator column. We could use scale_fill_manual() to map the specific colors we want to those indicator values. Alternatively, instead of using indicator values, we could specify the fill color directly in the tibble and use scale_fill_identity(). To make it clear that the extra numbers for May and June are white, they use a gray border around all of the data preceding July 2025. Here again, I might add a column to my tibble for the border color to be gray for months before July 2025 and orange for July 2025. Then we could use scale_color_identity() to use those colors.
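A rough sketch of the identity-scale approach might look like the following. Again, the job numbers are placeholders rather than real BLS values, and the fill and border column names and the specific color names are assumptions I picked for illustration.

```r
library(tibble)
library(dplyr)
library(ggplot2)

# Placeholder values in thousands of jobs; NOT the real BLS figures
jobs <- tribble(
  ~date,        ~type,            ~jobs,
  "2025-04-01", "firm",           150,
  "2025-05-01", "firm",            20,
  "2025-05-01", "overprojection", 125,
  "2025-06-01", "firm",            15,
  "2025-06-01", "overprojection", 130,
  "2025-07-01", "firm",            70
) %>%
  mutate(
    date = as.Date(date),
    type = factor(type, levels = c("overprojection", "firm")),
    # Store the actual colors in the tibble (hypothetical color choices)
    fill = case_when(
      type == "overprojection"      ~ "white",
      date == as.Date("2025-07-01") ~ "orange",
      TRUE                          ~ "gray80"
    ),
    border = if_else(date == as.Date("2025-07-01"), "orange", "gray60")
  )

# group = type keeps the stacking order tied to the indicator column
# rather than to the color columns
p <- ggplot(jobs, aes(x = date, y = jobs, fill = fill,
                      color = border, group = type)) +
  geom_col() +
  scale_fill_identity() +
  scale_color_identity()
```

The identity scales use the values in the tibble as-is, so there is no legend by default, which seems to match the original figure.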

Third, there are a couple of instances of annotation in the plot. The first that catches my eye is the italicized “REVISED DOWN” over the bars for May and June 2025. I’d likely do this with geom_label(), making the background slightly transparent, removing the border, and making the text italicized and gray. The second that catches my eye is the black text that describes the numbers. Here again I’d use geom_label() but with regular black font. Another part of the annotation is the line and curve that connect these text elements to the data. I’d make the lines using geom_curve() to get the rounded appearance for the line going to May and June.
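Here’s how I might sketch those annotations. I’m using annotate("label") and annotate("curve"), which draw the same geoms as geom_label() and geom_curve() without needing a separate tibble. The data, the text positions, the description text, and the curvature value are all placeholders that I’d expect to tweak by eye.

```r
library(tibble)
library(ggplot2)

# Placeholder values in thousands of jobs; NOT the real BLS figures
jobs <- tribble(
  ~date,        ~type,            ~jobs,
  "2025-03-01", "firm",           120,
  "2025-04-01", "firm",           150,
  "2025-05-01", "firm",            20,
  "2025-05-01", "overprojection", 125,
  "2025-06-01", "firm",            15,
  "2025-06-01", "overprojection", 130,
  "2025-07-01", "firm",            70
)
jobs$date <- as.Date(jobs$date)
jobs$type <- factor(jobs$type, levels = c("overprojection", "firm"))

p <- ggplot(jobs, aes(x = date, y = jobs, fill = type)) +
  geom_col(color = "gray60") +
  scale_fill_manual(
    values = c(overprojection = "white", firm = "gray80"),
    guide = "none"
  ) +
  # Italic gray label with a semi-transparent background.
  # label.size = 0 drops the border in ggplot2 3.x; newer releases
  # may flag this argument as deprecated, so adjust for your version
  annotate("label", x = as.Date("2025-05-16"), y = 170,
           label = "REVISED DOWN", fontface = "italic",
           color = "gray50", fill = "white", alpha = 0.7,
           label.size = 0) +
  # Regular black text; the wording here is a placeholder
  annotate("label", x = as.Date("2025-03-15"), y = 230,
           label = "Placeholder description of the numbers",
           color = "black", fill = "white", label.size = 0) +
  # Curved connector from the description to the revised bars
  annotate("curve", x = as.Date("2025-04-15"), y = 215,
           xend = as.Date("2025-05-16"), yend = 185,
           curvature = -0.3, color = "black")
```

Changing the sign of curvature flips which way the line bends, so that’s the first thing I’d play with if the connector arcs the wrong way.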

Finally, in classic New York Times fashion the y-axis labels are on top of the horizontal grid lines. It feels like it’s been a while since I’ve done one of these. I suspect I’d use geom_text() to place the text on the grid lines. I’d use the theme() function to remove the typical y-axis text, titles, and ticks. Although we could make use of scale_x_date() for the x-axis, I think it might be easier to treat the x-axis as categorical data and use scale_x_discrete() to put specific labels on every other month.
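A sketch of the axis treatment could look like this. The job values are an arbitrary placeholder sequence, not BLS data, and the grid line positions, label text, and nudging values are assumptions I’d adjust once the real data are in place.

```r
library(tibble)
library(ggplot2)

# Thirteen months from July 2024 to July 2025, treated as categories
months <- seq(as.Date("2024-07-01"), as.Date("2025-07-01"), by = "month")
month_levels <- format(months, "%Y-%m")

jobs <- tibble(
  month = factor(month_levels, levels = month_levels),
  jobs  = seq(60, 180, by = 10)  # placeholder values, NOT real BLS figures
)

# Hypothetical grid line positions and the text to sit on top of them
grid_labels <- tibble(
  y     = c(50, 100, 150),
  label = paste0("+", y, ",000")
)

# Keep every other level so only alternating months get a label
every_other <- levels(jobs$month)[c(TRUE, FALSE)]

p <- ggplot(jobs, aes(x = month, y = jobs)) +
  geom_col(fill = "gray80") +
  # x = 0.5 sits just left of the first bar; vjust lifts the text
  # so it rests on top of the grid line
  geom_text(data = grid_labels, aes(x = 0.5, y = y, label = label),
            hjust = 0, vjust = -0.3, color = "gray40", size = 3,
            inherit.aes = FALSE) +
  scale_x_discrete(
    breaks = every_other,
    labels = function(b) format(as.Date(paste0(b, "-01")), "%b")
  ) +
  scale_y_continuous(breaks = grid_labels$y) +
  theme_minimal() +
  theme(
    axis.text.y = element_blank(),
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank()
  )
```

Setting the y-axis breaks from the same tibble as the text keeps the grid lines and their labels in sync if I change the values later.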

What do you think of this visualization? Did you notice anything that I’ve glossed over? How would you go about implementing that flourish?

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials.

In case you missed it…

Here is a livestream that I published this week that relates to previous content from these newsletters. Enjoy!


Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat

Riffomonas Professional Development
