What have I learned by recreating other people's visualizations?


Hey folks,

If you’re interested in participating in a 1-day (6 hours) data visualization workshop, you’re running out of time to register. I’ll be teaching this workshop on May 9th. I will cover an introduction to the ggplot2 package and will assume no prior R knowledge. My goal is to help you to understand the ggplot2 framework and begin to apply it to make some interesting and compelling visualizations. After this workshop, you should be able to learn more advanced topics on your own. You can learn more and register by clicking the button below. Feel free to email me if you have any questions.


I recently got an interesting question on one of my videos:

Another great video—thank you! I’m curious: do you think it’s more effective to recreate existing plots or to design one’s own in order to learn?

For the past 9 months or so I’ve been taking the “recreating existing plots” approach. My pitch to you all was along the lines of “recreating masters”. Artists learn a lot by recreating original paintings to hone their technique and learn from past masters. There’s a lot of this type of content on YouTube whether it’s recreating music, movie scenes, or writing styles.

What have I learned by doing these? First, I’ve learned a number of functions that have been in {ggplot2} for a long time but that I didn’t know existed. Take, for instance, annotate() or scale_color_identity(). Second, I’ve been forced to make plots that I wouldn’t normally make. Feel free to check out the entire series of W.E.B. Du Bois plots that I’m still recovering from. Third, I’ve gotten a lot more fluent with certain functions and packages. Most notably, my ability to use theme() has really grown. All of this growth has occurred because I didn’t worry about my creativity but leaned on the style of others, whether it’s W.E.B. Du Bois or the NY Times.
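To make those three functions concrete, here is a minimal sketch of how annotate(), scale_color_identity(), and theme() might fit together in one plot. The data frame `scores` and its columns are hypothetical toy values invented for illustration; they don’t come from any of the recreated plots.

```r
library(ggplot2)

# Hypothetical toy data: each row carries its own literal color name
scores <- data.frame(
  group = c("a", "b", "c"),
  value = c(3, 7, 5),
  point_color = c("gray60", "dodgerblue", "gray60")
)

p <- ggplot(scores, aes(x = group, y = value, color = point_color)) +
  geom_point(size = 4) +
  # treat the strings in point_color as the colors themselves (no legend, no palette)
  scale_color_identity() +
  # add a one-off label without building a separate data frame
  annotate("text", x = "b", y = 7.5, label = "highest value", color = "dodgerblue") +
  # strip non-data ink with theme()
  theme(
    panel.background = element_blank(),
    axis.ticks = element_blank()
  )

p
```

The point of scale_color_identity() here is that the highlight color lives in the data, so emphasizing a different group only means changing one cell.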

Looking forward, I really need to push myself to recreate plots using visualizations I’m not so familiar with. Maps are a big deal. If you look back through my recent videos, you’ll find one map. Part of the problem is that I’ve become more pressed for time and don’t have the ability to invest so much time in learning how to remake plots. But I really want to do this.

One thing I thought I’d be able to do is show you all how to recreate scientific plots that we find in papers. The reality is that (in my opinion) most of these plots are pretty dreadful or boring. How many times can I bring myself to recreate a plot originally made in Prism about something that only a handful of people care about? It gets stale.

Recreating scientific plots also becomes utilitarian for my audience. People start asking for very specific types of visualizations that, again, only a small fraction of the audience cares about. Everyone is generally interested in scatter plots, line plots, bar plots, heatmaps, and maps depicting data from politics, the economy, sports, or entertainment. I hate the utilitarian mindset. If you can’t see how to take a heatmap depicting overdose deaths by age and year and adapt it to gene expression data, then I’m afraid I can’t help you. 🔥

Earlier in this journey, I would recreate a plot for the first video of the week based on the logic I’d lay out in these newsletter postings. Then the second video of the week would either show a different way of making the same plot or do a makeover on the plot. This second video gets to the second option posed in the question above. Again, I’ve needed to spend time working on other things and now even a video a week is challenging. Needless to say, it’s become hard to do makeovers.

I’m afraid that when I make plots “from scratch” they all kind of look alike. Perhaps I’m creative, but I’m creative within the boundaries that I’m used to working within. It’s hard to try new things when I don’t know what new things to try. I get stuck in a rut with the types of visuals I make. I make a lot of jitter plots and box plots; rarely do I make any line plots. I rarely manipulate the theming. Perhaps I shouldn’t need to? It’s hard to practice and expand your skills when you’re always making the same type of plot.

One benefit of recreating visuals is that I can store ideas away in my brain for future visuals I make from scratch. For example, I now find axis tick marks to be extraneous. I also have warmed up to horizontal grid lines with the y-axis text on the grid line. I’ve fallen in love with “Libre Franklin”. Don’t worry, I still love “dodgerblue” as a color. But the question posed above is an interesting one: how could I take what I’ve been learning and use it to represent data?
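As a rough sketch of how those stored-away ideas could be bundled into a from-scratch plot, something like the following might work. The `trend` data are hypothetical toy values, and the negative margin used to pull the y-axis text onto the grid lines is only one approximate way to get that look; it will likely need tuning for a real figure. The font is left as "sans" on the assumption that “Libre Franklin” may not be registered on your system.

```r
library(ggplot2)

# Hypothetical toy data for illustration only
trend <- data.frame(year = 2000:2009, value = c(2, 3, 3, 5, 4, 6, 7, 6, 8, 9))

# Swap in "Libre Franklin" once that font is installed and registered with R
font_family <- "sans"

styled <- ggplot(trend, aes(x = year, y = value)) +
  geom_line(color = "dodgerblue", linewidth = 1) +
  theme_minimal(base_family = font_family) +
  theme(
    # no tick marks
    axis.ticks = element_blank(),
    # horizontal grid lines only
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    panel.grid.major.y = element_line(color = "gray80"),
    # nudge labels up and pull them into the panel so they sit on the grid line
    axis.text.y = element_text(vjust = -0.5, hjust = 0, margin = margin(r = -20))
  )

styled
```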

Last week you may recall that I talked about an interesting visualization that I found in an article about the baby boom on the “Our World in Data” website.

Back in February, I recreated a heatmap depicting drug overdose deaths.

I’ve started wondering whether I could take the ideas I learned making the heatmap and apply them to the baby boom data. With the baby boom data, the y-axis showed the birth year of the women, the x-axis showed their age, and the height of the histogram and its fill color both showed the average fertility rate. I thought it was hard to see exactly where the baby boom occurred, the relative height of the histograms, and the broader distribution of fertility rate across age in more modern times. I’m pretty confident all of this can be resolved with a heatmap.

How would we overcome these problems with a heatmap? I’d make the x-axis the year: not the birth year, but the actual calendar year. The y-axis would be the age of the women, and the fill color would be the fertility rate. Then, like the drug overdose heatmap, I’d overlay parallel lines showing the cohort of women who were part of the baby boom.
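Here is a minimal sketch of that layout, not a finished recreation. The `fertility` values are synthetic numbers generated by a made-up formula (they are not the Our World in Data values), and the two birth years in `cohort_birth_years` are placeholders rather than the real boundaries of the cohort. The one piece of real logic is the cohort lines: a woman born in year b is age a in year b + a, so each cohort traces a line with slope 1 and intercept -b in year-by-age space.

```r
library(ggplot2)

# Synthetic stand-in data: one row per calendar year and age (NOT real fertility rates)
fertility <- expand.grid(year = 1930:2020, age = 15:49)
fertility$rate <- with(
  fertility,
  exp(-(age - 27)^2 / 50) * (1 + 0.5 * exp(-(year - 1957)^2 / 200))
)

# Placeholder birth years bounding a cohort; replace with the years you care about
cohort_birth_years <- c(1920, 1940)

boom_heatmap <- ggplot(fertility, aes(x = year, y = age, fill = rate)) +
  geom_tile() +
  # age = year - birth_year, so each cohort is a diagonal with slope 1
  geom_abline(
    slope = 1, intercept = -cohort_birth_years,
    color = "white", linetype = "dashed"
  ) +
  scale_fill_viridis_c(name = "Fertility rate") +
  coord_cartesian(expand = FALSE) +
  labs(x = "Year", y = "Age of women") +
  theme_minimal() +
  theme(panel.grid = element_blank(), axis.ticks = element_blank())

boom_heatmap
```

With real data, the same code should work as long as the data frame has one row per year and age; only the `fertility` construction and the cohort years would change.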

I’d love to hear how you’ve been taking what you’ve learned from this series of videos and applying it to your own visualizations!

Workshops

I'm pleased to be able to offer you one of three recent workshops! With each you'll get access to 18 hours of video content, my code, and other materials. Click the buttons below to learn more.

In case you missed it…

Here is a livestream that I published this week that relates to previous content from these newsletters. Enjoy!

Finally, if you would like to support the Riffomonas project financially, please consider becoming a patron through Patreon! There are multiple tiers and fun gifts for each. By no means do I expect people to become patrons, but if you need to be asked, there you go :)

I’ll talk to you more next week!

Pat
