Hey folks,

Did you know that you can do statistics in R? HA! Of course you can. As the first sentence of its Wikipedia entry says, “R is a programming language for statistical computing and data visualization”. I rarely discuss using R for statistical analysis and focus far more attention on the data visualization power of R.

This week, I’d like to share a set of panels from a figure in a paper recently published in Nature, “Lymph node environment drives FSP1 targetability in metastasizing melanoma”. This figure actually has 12 panels. One is a picture of the mouse model that was used (a) and another is an immunoblot (d). Panels i through l are the same style as e through h. I suspect that if you can figure out how to make the scatter plot in panel b, you can create the one in panel c. Similarly, if you can do the bar plots in panels e and f, you can do those in g through l. Really, if you can do e, you should be able to do f.

I’ll have things to say in a critique video that I’ll post on Monday, but let’s say you want to recreate these panels. How would you go about doing that in R? Before I forget, you can download the data as an MS Excel workbook from the Nature site.

Let’s think about the scatter plot first. If you look at that workbook, you’ll notice that the data are very much not tidy! How would we get the data tidy? Well, first we need to read it in, and then we can reshape it.

Now to plot the data! We can generate the scatter plot as a layer of points, but three elements of the panel are more sophisticated. First is the fit and the confidence interval, which come from fitting a line through the data.
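Here is a minimal sketch of the points plus a fitted line with its confidence band. It assumes ggplot2 and uses made-up data, since the layout of the workbook isn't reproduced here. The data frame `toy` and its columns `x` and `y` are hypothetical stand-ins, and `geom_smooth()` with `method = "lm"` is one common way to get the line and band, not necessarily what the original authors used.

```r
library(ggplot2)

# Made-up data standing in for the workbook; `toy`, `x`, and `y` are hypothetical
set.seed(19760620)
toy <- data.frame(x = runif(30, min = 0, max = 10))
toy$y <- 2 + 0.5 * toy$x + rnorm(30, sd = 1)

scatter <- ggplot(toy, aes(x = x, y = y)) +
  # one point per observation
  geom_point() +
  # method = "lm" fits a straight line; se = TRUE adds the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95)

scatter
```

Setting `se = FALSE` would drop the band and leave only the line.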
Next, we can set the color of the points, which is something I do a lot. The third more sophisticated element is the R^2^ value in the lower left corner of the plot. We can calculate the correlation coefficient, R, square it, and add the result to the plot as text. Of course, to finish replicating the original plot there will be a fair amount of styling to do to the axis titles, the legend, and the legend placement.

This newsletter is already getting long, but a lot of the things we’d do for the scatter plot we could do for the bar plots as well. If you look at the “Fig. 1e” sheet, you’ll see it’s formatted a bit better than “Fig. 1b”. We’ll still need to tidy the data, add a stage column, and take care of a couple of other bits before we can make the plot. To make the plots, there are a few geoms that we’ll need, starting with one for the bars.

Finally, how would we calculate and add the P-values to the plots? I’ll have more to say about this, and why I’m not a fan, in my critique video. Regardless, R can calculate the overall P-value (e.g. P < 1x10^-15^).

Now that you have the data and the roadmap, see if you can’t figure out how to create these panels on your own. Also, before watching my critique of the panels, go through the DAIJ process on your own. Let me know what you come up with!
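If you get stuck on the R^2^ label for the scatter plot, here is a minimal sketch in base R. The vectors `x` and `y` are made-up stand-ins for the two plotted variables, and the label format is my own assumption rather than a copy of the published figure.

```r
# Made-up data; `x` and `y` are hypothetical stand-ins for the plotted variables
set.seed(1)
x <- seq(1, 20)
y <- 3 + 1.5 * x + rnorm(20, sd = 2)

# Pearson correlation coefficient, R
r <- cor(x, y)

# For a straight-line fit with one predictor, R squared is the square of R
r_squared <- r^2

# Cross-check against the value reported by a linear model
fit <- lm(y ~ x)
stopifnot(isTRUE(all.equal(r_squared, summary(fit)$r.squared)))

# Text that could be placed on the plot
r2_label <- paste0("R^2 = ", format(round(r_squared, 2), nsmall = 2))
r2_label
```

One way to put that text in a corner of a ggplot2 figure is `annotate("text", ...)` with `x`, `y`, and `label` arguments. The coordinates would need to be picked to match the panel.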