Hey folks,

In last week’s newsletter, I introduced a new approach that I plan on taking in these emails to help you develop your intuition for visualizing data in R (or any language). I asked you to consider a random figure that I found in the most recent issue of the journal mSphere. It’s Figure 1A from the paper, “Exploring novel microbial metabolites and drugs for inhibiting Clostridioides difficile” by Ahmed Abouelkhair and Mohamed Seleem. The figure shows the level of inhibition of bacterial growth by 527 compounds; 63 of the compounds were deemed “strong hits” because they inhibited growth by at least 90%.

Without worrying about actual code, I encouraged you to think about the data and functions you’d need to generate this figure. Here were my random thoughts:

This is a scatter plot in which compounds giving more than 90% inhibition are colored burgundy and those with less are colored green. There’s also a dashed line indicating the 90% threshold. It took me a minute or two to notice that the x-axis is meaningless. It’s likely the order of the compounds in their database (there seems to be a non-random pattern to the data about three-quarters of the way across the axis). I also noticed that there’s no line on the x-axis, but there is a line at zero.

Those are the parts of the figure, described in a way that you could probably use to make a similar-looking figure with any tool. Now, how would we do this in R?

Let’s start with the data. I assume that the data will be a data frame with two columns, one for the compound name (...

I do everything in ggplot2 nowadays, so I start thinking about what geom I’ll use. Probably ...

Next, I’d think about the colors. I’d use ...

Let’s move on to the x-axis and the two lines. First, I’d use the ...

Now let’s think about the y-axis. By default we might get the values on the y-axis that the figure already has. But to be safe, we can use ...

I think that’s everything, right?
I’d encourage you to go back through that narrative and assess what you do and don’t understand. Then look at online R resources, including my Riffomonas materials (MinimalR and generalR) and the R Graphics Cookbook for examples of how to use the new concepts. Finally, see if you can generate the figure yourself using some simulated data. The code below should be close enough to what you need:
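Here is a minimal sketch of what such code could look like, built on simulated data. It is not the code from the paper, and the specifics are assumptions made for illustration: the column names (`compound`, `inhibition`, `index`, `strong_hit`), the hex code standing in for burgundy, the range of the simulated values, and the axis breaks. Only the 527 compounds, the 63 strong hits, and the 90% threshold come from the figure description.

```r
library(ggplot2)

set.seed(2024)

# Simulated stand-in for the screening data (made-up values):
# 527 compounds, 63 of which are "strong hits" with >= 90% inhibition
n_compounds <- 527
n_hits <- 63

screen <- data.frame(
  compound = paste0("compound_", seq_len(n_compounds)),
  # sample() shuffles the hits in among the other compounds
  inhibition = sample(c(
    runif(n_hits, min = 90, max = 100),
    runif(n_compounds - n_hits, min = -20, max = 89)
  ))
)

# The x-axis is only the order of the compounds in the data frame
screen$index <- seq_len(nrow(screen))

# Logical column that drives the burgundy/green coloring
screen$strong_hit <- screen$inhibition >= 90

p <- ggplot(screen, aes(x = index, y = inhibition, color = strong_hit)) +
  geom_hline(yintercept = 0) +                        # solid line at zero
  geom_hline(yintercept = 90, linetype = "dashed") +  # 90% threshold
  geom_point(show.legend = FALSE) +
  scale_color_manual(values = c("TRUE" = "#800020", "FALSE" = "darkgreen")) +
  scale_y_continuous(breaks = seq(-20, 100, by = 20), limits = c(-20, 100)) +
  labs(x = NULL, y = "Inhibition of growth (%)") +
  theme_classic() +
  theme(
    # drop the x-axis line, ticks, and labels since the axis is meaningless
    axis.line.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.text.x = element_blank()
  )

# print(p) draws the figure; ggsave() writes it to a file
```

Storing the threshold test in its own logical column keeps the color mapping to a single aesthetic, and naming the values in `scale_color_manual()` makes it explicit which color goes with which group.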
Please let me know how this works out for you! Also, if you have a favorite figure that you'd love to see me break down, reply to this email and I'll see about using it in a future newsletter.