Hey folks! This week I want to share with you a figure that resembles a type of figure I see in a lot of genomics papers. I’d consider it a data visualization meme - kind of like how you’re “required” to have a stacked bar plot if you’re doing microbiome research or a dynamite plot if you’re publishing in Nature :)

This figure was included in the paper, “Impact of intensive control on malaria population genomics under elimination settings in Southeast Asia” that was published earlier this week in Nature Microbiology. I’ve been wanting to look at this type of figure for a while, but haven’t because authors rarely make the underlying data available. This group of authors did, so we’ll be able to recreate it in a livestream next Wednesday. Yippee! I often feel bad that I only seem to put people’s figures under a microscope if they Do The Right Thing and make their papers open and data accessible.

The basic structure of these figures is a tree (i.e., a dendrogram) linked to one or more heat maps. In the figure above, you can see there’s a dendrogram on the left and the structure of the tree roughly matches the red blocks along the diagonal of the matrix to its right. Red indicates that the malaria parasite genomes are more similar and yellow that they are quite different. There are nearly 3 million data points in that heat map.

On top of the large heat map are three strips indicating whether the genome was sampled from a location before or after using “mass drug administration” (MDA) or not using it at all; the year; and the genotype of the kelch13 gene, which can confer resistance to artemisinin. Those three strips are actually heat maps - they only have one value on the y-axis and 1700 on the x-axis. Often I see these bands drawn as vertical columns rather than horizontal strips, but it’s a similar idea.

How would we build this? I see this as 5 figures - the dendrogram, the large red/yellow heat map, and the three horizontal bands.
Similar to last week, we compose this figure using the

Let’s start with the dendrogram. Using the relatedness matrix, we can generate the data for a dendrogram using the base R

Next, let’s consider the large heat map. Heat maps can be created using the

Similar to the large heat map, we can use

One thing I notice about the legends in this panel is that they’re organized in a somewhat haphazard manner. The gradient legend is not vertically aligned with the legends for the three bands. To me, this looks weird. Using

On Monday, I’ll present a critique of this figure. Something I already foresee doing is moving those horizontal bands to be vertical and on the right side. What else would you do to improve the appearance of this figure?
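To make the dendrogram-plus-heat-map idea a bit more concrete, here’s a minimal sketch of how the pieces could line up. Everything in it is an assumption on my part for illustration: the toy relatedness matrix for 6 made-up genomes, the choice of `hclust()` with average linkage on 1 - relatedness, and the use of `geom_tile()` from ggplot2 with patchwork to stack a band on top of the heat map. It isn’t the authors’ method and may not be what ends up in the livestream. The key step is using the leaf order from the tree to set the factor levels on both heat map axes, so the red blocks fall along the diagonal.

```r
# Toy stand-in for the paper's data: 6 hypothetical genomes in two groups
set.seed(19760620)
n <- 6
ids <- paste0("genome_", seq_len(n))
group <- rep(c("a", "b"), each = n / 2)

# Symmetric relatedness matrix (1 = identical), higher within a group
relatedness <- outer(group, group, function(x, y) ifelse(x == y, 0.8, 0.1))
noise <- matrix(runif(n * n, 0, 0.1), n, n)
relatedness <- relatedness + (noise + t(noise)) / 2
diag(relatedness) <- 1
dimnames(relatedness) <- list(ids, ids)

# Dendrogram: turn relatedness into a distance, then cluster (assumed method)
tree <- hclust(as.dist(1 - relatedness), method = "average")
leaf_order <- tree$labels[tree$order]

# Heat map data: one row per pair of genomes
heat_data <- expand.grid(x = ids, y = ids, stringsAsFactors = FALSE)
heat_data$relatedness <- relatedness[cbind(heat_data$x, heat_data$y)]

# Order both axes to match the leaves of the tree
heat_data$x <- factor(heat_data$x, levels = leaf_order)
heat_data$y <- factor(heat_data$y, levels = leaf_order)

# A band is a heat map with a single value on the y-axis (made-up years)
band_data <- data.frame(
  x = factor(ids, levels = leaf_order),
  y = "year",
  value = rep(c("2015", "2018"), times = n / 2)
)

# Optional plotting step, only run if both packages are installed
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("patchwork", quietly = TRUE)) {
  library(ggplot2)
  library(patchwork)

  heat_map <- ggplot(heat_data, aes(x = x, y = y, fill = relatedness)) +
    geom_tile() +
    scale_fill_gradient(low = "yellow", high = "red")

  band <- ggplot(band_data, aes(x = x, y = y, fill = value)) +
    geom_tile()

  # Stack the band over the heat map and gather the legends together
  figure <- band / heat_map +
    plot_layout(heights = c(1, 20), guides = "collect")
}
```

If ggplot2 and patchwork are available, printing `figure` draws the band above the heat map, and `plot(as.dendrogram(tree), horiz = TRUE)` gives a quick base R view of the tree. Collecting the guides is one way to keep the gradient legend lined up with the band legends.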
Hey folks! I hope you enjoyed last week’s series on the radial volcano plot (newsletter, critique video, livestream). I think it did a good job of illustrating the various reasons I think it’s valuable to recreate figures, even if we don’t like how they display the data. Something I didn’t really emphasize in last week’s newsletter was that by recreating a figure, we can make sure that the data are legit. I’m surprised by the number of signals I’ve been finding where authors using tools like...
Hey folks! Before launching into this week’s visualization, I’m looking for a bit of feedback. Since November, I’ve settled into a new routine with this newsletter and the YouTube channel. Each week this newsletter introduces a visualization at a 30,000 ft view or discusses a specific topic in some depth (example). The following Monday I post a video critiquing the visualization (example). Then on Wednesday (or Tuesday like this past week), I livestream a video where I recreate the...
Hey folks! I just got back from a seminar. I’m still trying to stretch out my eyes from straining to see the small text on each slide! If you don’t know why I’m bringing this up, then you must have missed the videos I posted earlier this week. I was discussing the factors we should consider when converting figures designed for papers to figures designed for a slide deck. You can see me critique a figure from my own lab here and the livestream where I refactor the figure can be found here. I’d...