Hey folks,

As you know, I’ve been encouraging folks to strengthen their intuition for how to make different types of plots in R. Some of that is through this newsletter. Some of that is by engaging in mob programming with your research group and others. I received a great email this week from Devin Drown, a professor at the University of Alaska in Fairbanks, that just made my day:

“I want to thank you for sending out this newsletter and the past few with visualizations. This past week, I used this box plot challenge for the students in my lab group. Most hadn’t seen your newsletter post, so it was a fresh challenge for them. We even tried our hand at the mob programming (Driver/Navigator) approach that you mentioned before. What a really fun way to engage lots of learners in programming. We had a wide variety of experience levels, and I was really impressed at how this method can help.”

This unsolicited endorsement really speaks to what I am trying to do with Riffomonas. If you don’t have a local community that you can lean on to do mob programming, I’m offering sessions throughout October for anyone who is interested. The goal is to provide greater opportunities to hone your skills in a social setting. Of course, it is best to do it with people who are on a journey with you at your institution. Doing it via Zoom is the next best thing.

Each week when I search for figures to show you for this newsletter, I come across a number of things that aren’t what we typically think of as plots we’d make with R. One of the more common figures I come across is the Venn diagram. Here is one of many possible examples, taken from Figure 1 of “Exploring pangenomic diversity and CRISPR-Cas evasion potential in jumbo phages: a comparative genomics study” by Sharayu Magar and colleagues (Who doesn’t love jumbo phages?!). As with anything you might do with R, there are many ways to make this figure. Before reading on, how would you make a Venn diagram in R?
Three general approaches come to mind immediately. First, there’s probably a package out there to draw geometric objects or even make Venn diagrams. But where’s the fun in that and what would we learn? Second, we could represent each circle as a line plot. The points along the lines could be generated using the equation of a circle and stored in a data frame. For those of you who just reached for PowerPoint, don’t!

There’s a third approach. Instead of thinking of each circle in a Venn diagram as a line or ribbon plot, think about a scatter plot. Let’s think about a data frame with three columns and two rows. The first, Let’s think about the various aesthetics that are available to us with Once we generate our “scatter plot”, add

Next, we want to adjust the color so we don’t have overlapping black circles. Do you recall how we can map a variable to a color? We can put the variable, like

Another subtle difference between our version of the Venn diagram and the published version is that the published version has a solid black border around the circles. That gives a pretty cool look that I like. If you’ve ever wondered what plotting character values between 21 and 24 are for, here’s a use case. Those plotting symbols allow you to use one color for the interior of the symbol and another for the border. Let’s use If your Venn diagram is looking like mine, you’ll notice that the black line is actually gray. Why is that? It appears that

Let’s add those labels! Let’s start by adding the labels above the two circles using

Next, let’s insert the numbers in those circles. We can do that by creating a second data frame that we add to the plot with a second
This is a really cool function to add annotations to your figure. Once you get this to work, see if you can use it to replace the earlier

The last thing to modify is the background. It still looks like a plot. There’s a special

You might be asking, why would anyone go through all of this when you could just use Micro$oft PowerPoint? The most important reason is that you could include a script to generate a Venn diagram that takes the label and number values from your data. The figure will automatically get updated if your upstream analysis changes. Scripting a figure like this is also valuable if you need to generate a bunch of Venn diagrams. Of course, think of all of the awesome things you just learned!

Finally, I’d encourage you to make a three-circle Venn diagram using what you’ve learned from making a two-circle diagram. I think diagrams with more groups would require using the equation of an ellipse. While I’m handing out homework that I don’t have to grade, see if you can make two- and three-circle diagrams using the equation of a circle!

As I come to the end of the current YouTube channel series on building an R package, let me know whether you’d like me to take this verbal analysis of figures and translate it to real R code that I develop in video form. I’m always interested in the types of figures you’d like to see how to make in R - feel free to email me with ideas!
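For anyone who wants something concrete to compare against, here is a minimal sketch of the scatter plot approach described above. It is one possible way to do it, not necessarily the code I would end up with in a video. The set names, counts, colors, and coordinates are all made up for illustration, `theme_void()` is my assumption for a theme that strips the plot background, and putting the transparency into the fill colors with `alpha()` is just one way to keep the border from fading to gray. Point sizes are in physical units rather than data units, so expect to tune `size` and the label positions for your own figure dimensions.

```r
library(ggplot2)

# Hypothetical data: one row per circle, three columns (position plus a label)
circles <- data.frame(
  x = c(1.15, 1.85),
  y = c(1, 1),
  group = c("Set A", "Set B")
)

# Hypothetical counts for the left-only, shared, and right-only regions
counts <- data.frame(
  x = c(0.95, 1.5, 2.05),
  y = c(1, 1, 1),
  label = c("12", "5", "8")
)

venn <- ggplot(circles, aes(x = x, y = y)) +
  # Shape 21 is a circle with a separate border (color) and interior (fill)
  geom_point(aes(fill = group), shape = 21, size = 80,
             color = "black", show.legend = FALSE) +
  # Make only the fill transparent so the black border stays black
  scale_fill_manual(values = alpha(c("dodgerblue", "tomato"), 0.5)) +
  # Numbers inside the regions come from a second data frame
  geom_text(data = counts, aes(label = label), size = 6) +
  # Set names above the circles
  annotate("text", x = circles$x, y = 1.3, label = circles$group, size = 5) +
  # Leave room around the points so the large circles are not clipped
  coord_cartesian(xlim = c(0, 3), ylim = c(0.5, 1.5)) +
  # Assumed theme: removes axes, grid lines, and the panel background
  theme_void()

venn
```

Because the labels and numbers live in data frames, you could swap in values computed upstream in your analysis and regenerate the figure without touching the plotting code.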