Hey folks,

If you’re interested in participating in a 1-day (6 hours) data visualization workshop, you’re running out of time to register. I’ll be teaching this workshop on May 9th. I will cover an introduction to the ggplot2 package and will assume no prior R knowledge. My goal is to help you understand the ggplot2 framework and begin applying it to make some interesting and compelling visualizations. After this workshop, you should be able to learn more advanced topics on your own. You can learn more and register by clicking the button below. Feel free to email me if you have any questions.
I recently got an interesting question in response to one of my videos:

Another great video, thank you! I’m curious: do you think it’s more effective to recreate existing plots or to design one’s own in order to learn?

For the past 9 months or so I’ve been taking the “recreating existing plots” approach. My pitch to you all was along the lines of “recreating masters”. Artists learn a lot by recreating original paintings to hone their technique and learn from past masters. There’s a lot of this type of content on YouTube, whether it’s recreating music, movie scenes, or writing styles.

What have I learned by doing these? First, there have been a number of commands I’ve learned that have been in

Looking forward, I really need to push myself to recreate plots using visualizations I’m not so familiar with. Maps are a big deal. If you look back through my recent videos, you’ll find one map. Part of the problem is that I’ve become more pressed for time and can’t invest as much time in learning how to remake plots. But I really want to do this.

One thing I thought I’d be able to do is show you all how to recreate scientific plots that we find in papers. The reality is that (in my opinion) most of these plots are pretty dreadful or boring. How many times can I bring myself to recreate a plot originally made in Prism about something that only a handful of people care about? It gets stale.

Recreating scientific plots also becomes utilitarian for my audience. People start asking for very specific types of visualizations that, again, only a small fraction of the audience cares about. Everyone is generally interested in scatter plots, line plots, bar plots, heatmaps, and maps depicting data from politics, the economy, sports, or entertainment. I hate the utilitarian mindset. If you can’t see how to take a heatmap depicting overdose deaths by age and year and adapt it to gene expression data, then I’m afraid I can’t help you.
Earlier in this journey, I would recreate a plot for the first video of the week based on the logic I’d lay out in these newsletter postings. Then the second video of the week would either show a different way of making the same plot or do a makeover on the plot. This second video gets to the second option posed in the question above. Again, I’ve needed to spend time working on other things and now even a video a week is challenging. Needless to say, it’s become hard to do makeovers.

I’m afraid that when I make plots “from scratch” they all kind of look alike. Perhaps I’m creative, but I’m creative within the boundaries that I’m used to working within. It’s hard to try new things when I don’t know what new things to try. I get stuck in a rut with the types of visuals I make. I make a lot of jitter plots and box plots; rarely do I make any line plots. I rarely manipulate the theming - perhaps I shouldn’t need to? It’s hard to practice and expand your skills when you’re always making the same type of plot.

One benefit of recreating visuals is that I can store ideas away in my brain for future visuals I make from scratch. For example, I now find axis tick marks to be extraneous. I have also warmed up to horizontal grid lines with the y-axis text on the grid line. I’ve fallen in love with “Libre Franklin”. Don’t worry, I still love “dodgerblue” as a color.

But the question he posed is an interesting one - how could I take what I’ve been learning and use it to represent data? You may recall that last week I talked about an interesting visualization that I found in an article about the baby boom on the “Our World in Data” website. Back in February, I recreated a heatmap depicting drug overdose deaths. I’ve started wondering whether I could take the ideas I learned making the heatmap and apply them to the baby boom data.
With the baby boom data, the y-axis showed the birth year of the women, the x-axis showed their age, and both the height of the histogram and its fill color showed the average fertility rate. I thought it was hard to see exactly where the baby boom occurred, the relative heights of the histograms, and the broader distribution of fertility rate across age in more modern times. I’m pretty confident all of this can be resolved with a heatmap.

How would we overcome these problems with a heatmap? I’d make the x-axis the year - not the birth year, but the actual calendar year. The y-axis would be the age of the women and the fill color would be the fertility rate. Then, like the drug overdose heatmap, I’d overlay parallel lines showing the cohort of women who were part of the baby boom.

I’d love to hear how you’ve been taking what you’ve learned from this series of videos and applying it to your own visualizations!
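If you want to experiment with the heatmap idea described above, here is a minimal sketch of how it might be set up in ggplot2. It runs on made-up data rather than the real “Our World in Data” values, and the names `cohorts`, `birth_year`, `fertility_rate`, and `boom_cohorts` are placeholders of mine, as are the two highlighted birth years. The one real step is converting birth year and age into a calendar year; after that, each cohort falls along a diagonal line where age equals year minus birth year.

```r
library(ggplot2)

# Synthetic stand-in data: one row per birth cohort and age.
# These rates are invented so the sketch runs; they are NOT real fertility data.
cohorts <- expand.grid(birth_year = 1900:1980, age = 15:49)
cohorts$fertility_rate <- dnorm(cohorts$age, mean = 27, sd = 6) *
  (1 + 0.5 * dnorm(cohorts$birth_year, mean = 1932, sd = 8) / dnorm(0, sd = 8))

# Calendar year in which a woman born in `birth_year` reaches `age`
cohorts$year <- cohorts$birth_year + cohorts$age

# Hypothetical cohorts to highlight; swap in the real baby boom cohorts
boom_cohorts <- data.frame(birth_year = c(1925, 1940))

heatmap <- ggplot(cohorts, aes(x = year, y = age, fill = fertility_rate)) +
  geom_tile() +
  # A cohort born in year b traces the line age = year - b
  geom_abline(
    data = boom_cohorts,
    aes(slope = 1, intercept = -birth_year),
    color = "white", linetype = "dashed"
  ) +
  scale_fill_viridis_c(name = "Fertility rate") +
  coord_cartesian(expand = FALSE) +
  labs(x = "Year", y = "Age of women") +
  theme_minimal()

heatmap
```

Because the data are organized by cohort, the filled region comes out as a parallelogram rather than a rectangle: the youngest ages of the latest cohorts and the oldest ages of the earliest cohorts fall in years the other cohorts don’t cover.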