Hey folks, if you’re interested in participating in a one-day (six-hour) data visualization workshop, you’re running out of time to register. I’ll be teaching this workshop on May 9th. It will be an introduction to the ggplot2 package and will assume no prior R knowledge. My goal is to help you understand the ggplot2 framework and begin applying it to make some interesting and compelling visualizations. After this workshop, you should be able to learn more advanced topics on your own. You can learn more and register by clicking the button below. Feel free to email me if you have any questions.
I recently got an interesting question on one of my videos: “Another great video, thank you! I’m curious: do you think it’s more effective to recreate existing plots or to design one’s own in order to learn?”
For the past 9 months or so I’ve been taking the “recreating existing plots” approach. My pitch to you all was along the lines of “recreating masters”. Artists learn a lot by recreating original paintings to hone their technique and learn from past masters. There’s a lot of this type of content on YouTube, whether it’s recreating music, movie scenes, or writing styles.
What have I learned by doing these? First, there have been a number of commands I’ve learned that have been in
Looking forward, I really need to push myself to recreate plots using visualizations I’m not so familiar with. Maps are a big deal. If you look back through my recent videos, you’ll find only one map. Part of the problem is that I’ve become more pressed for time and can’t invest as much time in learning how to remake plots. But I really want to do this.
One thing I thought I’d be able to do is show you all how to recreate scientific plots that we find in papers. The reality is that (in my opinion) most of these plots are pretty dreadful or boring. How many times can I bring myself to recreate a plot originally made in Prism about something that only a handful of people care about? It gets stale. Recreating scientific plots also becomes utilitarian for my audience. People start asking for very specific types of visualizations that, again, only a small fraction of the audience cares about. Everyone is generally interested in scatter plots, line plots, bar plots, heatmaps, and maps depicting data from politics, the economy, sports, or entertainment. I hate the utilitarian mindset. If you can’t see how to take a heatmap depicting overdose deaths by age and year and adapt it to gene expression data, then I’m afraid I can’t help you.
Earlier in this journey, I would recreate a plot for the first video of the week based on the logic I’d lay out in these newsletter postings. Then the second video of the week would either show a different way of making the same plot or do a makeover on the plot. This second video gets to the second option posed in the question above. Again, I’ve needed to spend time working on other things and now even a video a week is challenging. Needless to say, it’s become hard to do makeovers.
I’m afraid that when I make plots “from scratch” they all kind of look alike. Perhaps I’m creative, but I’m creative within the boundaries that I’m used to working in. It’s hard to try new things when I don’t know what new things to try. I get stuck in a rut of the types of visuals I make. I make a lot of jitter plots and box plots; rarely do I make any line plots. I rarely manipulate the theming; perhaps I shouldn’t need to? It’s hard to practice and expand your skills when you’re always making the same type of plot.
One benefit of recreating visuals is that I can store ideas away in my brain for future visuals I make from scratch. For example, I now find axis tick marks to be extraneous. I also have warmed up to horizontal grid lines with the y-axis text on the grid line. I’ve fallen in love with “Libre Franklin”. Don’t worry, I still love “dodgerblue” as a color.
But the question he posed is an interesting one: how could I take what I’ve been learning and use it to represent data? Last week you may recall that I talked about an interesting visualization that I found in an article about the baby boom on the “Our World in Data” website. Back in February, I recreated a heatmap depicting drug overdose deaths. I’ve started wondering whether I could take the ideas I learned making the heatmap and apply them to the baby boom data.
With the baby boom data, the y-axis showed the birth year of the women, the x-axis showed their age, and both the height of the histogram and its fill color showed the average fertility rate. I thought it was hard to see exactly where the baby boom occurred, the relative height of the histograms, and the broader distribution of fertility rate across age in more modern times. I’m pretty confident all of this can be resolved with a heatmap.
How would we overcome these problems with a heatmap? I’d make the x-axis the year: not the birth year, but the actual calendar year. The y-axis would be the age of the women and the fill color would be the fertility rate. Then, like the drug overdose heatmap, I’d overlay parallel lines showing the cohort of women who were part of the baby boom.
I’d love to hear how you’ve been taking what you’ve learned from this series of videos and applying it to your own visualizations!
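If you want to try the heatmap idea described above, here is a minimal sketch of how it might come together in ggplot2. Everything in it is an assumption for illustration: the data are simulated rather than taken from the Our World in Data article, the column names (`birth_year`, `age`, `fertility_rate`) are placeholders, and the two highlighted cohorts (1925 and 1940) are arbitrary choices. The one step that carries over to real data is converting birth year to calendar year by adding age.

```r
library(ggplot2)

# Simulated stand-in data: one row per birth cohort and age.
# The column names and values are hypothetical, not the real dataset.
cohort_data <- expand.grid(birth_year = 1900:1960, age = 15:49)

# Key step: the calendar year is the birth year plus the woman's age
cohort_data$year <- cohort_data$birth_year + cohort_data$age

# Made-up fertility rate: peaks around age 27, with a bump centered on 1957
cohort_data$fertility_rate <- with(
  cohort_data,
  dnorm(age, mean = 27, sd = 6) * (1 + exp(-(year - 1957)^2 / (2 * 8^2)))
)

# Cohorts to highlight (arbitrary choices). A cohort born in year b follows
# the line age = year - b, so the slope is 1 and the intercept is -b.
cohort_lines <- data.frame(intercept = -c(1925, 1940), slope = 1)

fertility_heatmap <- ggplot(
  cohort_data,
  aes(x = year, y = age, fill = fertility_rate)
) +
  geom_tile() +
  geom_abline(
    data = cohort_lines,
    aes(intercept = intercept, slope = slope),
    color = "white", linetype = "dashed"
  ) +
  scale_fill_viridis_c(name = "Fertility rate") +
  labs(x = "Year", y = "Age of mother") +
  theme_minimal()
```

Running `print(fertility_heatmap)` draws the plot. Because every woman ages one year per calendar year, the lines for each birth cohort run diagonally and parallel to each other across the tiles.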