Hey folks,
Next week is Thanksgiving here in the US and I’ll skip sending you another newsletter. In exchange, you’ll get three videos on YouTube inspired by a newsletter post from October talking about a descending bar plot with a pattern in one of the bars. Before you thank me, you might want to check out today’s newsletter🤣!
I’ve always enjoyed the old 538 articles and appreciated the data-centric point of view of its founder, Nate Silver. He has a Substack newsletter, “Silver Bulletin”, that is very good. I’m too cheap to pay for a subscription, so I settle for the bread crumbs he includes with the free subscription. Last night I received his latest article, “Hopium comes at a high price”. The article is part of a debrief on the election and the state of polling and predictive models like his. His contention is that polls continue to underestimate Trump’s numbers, but within the margin of error of those polls. Regardless of what you think of Trump or Silver’s analysis, I was captivated by the visual that he included in the newsletter.
As always, I encourage you to ask some questions about any plot you find to help you develop your taste and think through how you would recreate elements of a plot.
What type of plot is this?
Aside from the data story, what is interesting about this figure?
What do you like about it? What don’t you like about it?
Can you outline the steps you would take to generate the figure?
What are some of the steps you aren’t sure about and would like to learn?
This plot was eerily reminiscent of a plot that I made back in 2021 showing the likelihood of people getting the COVID-19 vaccine at different times by country. I called this plot a “dumbbell” or “barbell” plot because for each entity (e.g., state or country) there are two balls connected by a line, so it looks like a dumbbell.
You might recall another set of videos I made recently based on paired data where I made a scatter plot and a slope plot inspired by sentiments of farmers and non-farmers in Sweden. A dumbbell plot is another way to show paired data for a handful of entities. If I were asked to recreate Silver’s figure, I’d expect to get a data frame with three columns -
At a basic level, a dumbbell plot can be made with a combination of
Let’s start with the handles. Using my
What about the “bells”? For those, I need all of the polling data in a single column. I’d need to generate a second data frame using
The labels are a bit more tricky. I’d use
I was also struck by the “legend” across the top indicating that the white point is the polling average margin and the green point is the actual margin. I’d probably use a few
The axis labels also have some cool things going on. The x-axis text is a pretty slick way of embedding who was favored to the left and right of the black line at zero. I’d use
Finally, the plot has vertical grid lines that are grey. There’s also one that is black at zero. We could do one or the other in the
This figure shows the 7 “battleground” states from the election. Because of how our elections work, it’s the states one wins that matter, not the number of votes they get overall. So, although Harris won California by 21%, she still got the same number of electoral votes as if she had only won it by 1%. Ditto for Trump and Texas. Regardless, it would be interesting to see these types of data for the 43 other states. Beyond being more complete, I’m interested in seeing whether the same ~2.5 percentage point difference holds up regardless of the state. Maybe I’ll see if I can track that data down between now and when I produce the remake video, likely in January. I’ll award bonus points if anyone does that for me :)
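
To make the handles-and-bells idea above concrete, here is a rough sketch in Python with matplotlib. It is not the code from the remake video, and the tools named in the text above did not survive in this copy, so treat every name here as an assumption: the column names `state`, `poll`, and `actual`, the helper functions `to_long` and `draw`, and the output file name are all hypothetical. The numbers are made up purely to have something to draw; they are not real polling or election results.

```python
# Made-up margins for illustration only; NOT real polling or election results.
# Assumed layout: one row per state, one column per margin (the "wide" form).
wide = [
    {"state": "State A", "poll": -1.0, "actual": 1.5},
    {"state": "State B", "poll": 0.5, "actual": 3.0},
    {"state": "State C", "poll": 2.0, "actual": 4.5},
]


def to_long(rows):
    """Stack both margin columns into a single column (the "long" form).

    Each output row holds one point: its state, which margin it is, and the value.
    """
    return [
        {"state": r["state"], "kind": kind, "margin": r[kind]}
        for r in rows
        for kind in ("poll", "actual")
    ]


def draw(rows, path="dumbbell.png"):
    """Draw a dumbbell plot: handles from the wide rows, bells from the long rows."""
    # Imported here so the reshaping helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    y = {r["state"]: i for i, r in enumerate(rows)}  # one y position per state

    # Handles: a horizontal segment from the polling margin to the actual margin.
    ax.hlines(
        [y[r["state"]] for r in rows],
        [r["poll"] for r in rows],
        [r["actual"] for r in rows],
        colors="grey",
        zorder=2,
    )

    # Bells: one point per long-form row, colored by which margin it shows.
    fill = {"poll": "white", "actual": "green"}
    for p in to_long(rows):
        ax.scatter(
            p["margin"], y[p["state"]],
            color=fill[p["kind"]], edgecolors="black", zorder=3,
        )

    # Grey vertical grid lines, plus a black reference line at zero.
    ax.set_axisbelow(True)
    ax.grid(axis="x", color="lightgrey")
    ax.axvline(0, color="black", linewidth=1, zorder=1)

    ax.set_yticks(list(y.values()))
    ax.set_yticklabels(list(y.keys()))
    fig.savefig(path)
    plt.close(fig)
```

Calling `draw(wide)` should write the figure to `dumbbell.png`. The sketch keeps two shapes of the same data on purpose: the wide rows give each handle its two endpoints, while the long rows put every margin in one column so the points can be colored by type. The labels, the legend across the top, and the x-axis text are left out.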