October 2016 – R Club

by pablom - October 21, 2016

Dealing with “missing”/out of bounds values in heatmaps

I was tinkering around in R to see if I could plot better-looking heatmaps when I ran into an issue with how certain values are drawn in plots whose range has been restricted by the user. When I’m plotting heatmaps, I usually want to restrict the range of the data being plotted in a specific way. In this case, I was plotting classifier accuracy across frequencies and time of Fourier-transformed EEG data, and I wanted the lowest value of the heatmap to be chance level (here, 50%).

I used the following code to generate a simple heatmap:

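library(R.matlab)  # readMat()
library(reshape2)  # melt()
library(ggplot2)
library(viridis)   # scale_fill_viridis()

# Load the decoding accuracy matrix and reshape it to long format
decoding <- readMat("/Users/Pablo/Desktop/testDecoding.mat")$plotData
plotDecoding <- melt(decoding)

quartz(width = 5, height = 4)  # macOS-only graphics device
ggplot(plotDecoding, aes(x = Var2, y = Var1)) +
  geom_tile(aes(fill = value)) +
  scale_fill_viridis(name = "", limits = c(0.5, 0.75))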

And this is what I got:

See the gray areas? Any data points that fall outside my specified range (see the limits argument in scale_fill_viridis) are converted to NA, and NA values are plotted in gray by default. These values are obviously not missing in the actual data, so how can we make sure they're plotted? The solution comes from the scales package, which is installed along with ggplot2. Whenever you create a plot with specified limits, add the argument oob = squish (oob = out of bounds) to the same scale call where you set the limits, and make sure that the scales package is loaded (or write scales::squish). Ex:

scale_fill_viridis(name = "",limits = c(0.5,0.75),oob=squish)

With squish, values that fall below the specified range are drawn in the color of the minimum, and values that fall above the range are drawn in the color of the maximum (which seems to be what Matlab defaults to). The resulting heatmap looks like this:

A simple solution to an annoying little quirk in R!
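To see what squish does without needing the EEG data, here is a minimal sketch on a made-up vector of accuracies (the values in x are invented for illustration). It compares squish with censor, the scales function that turns out-of-range values into NA, which is the behaviour that produced the gray tiles above.

```r
library(scales)

# Made-up accuracies; 0.40 and 0.90 fall outside the limits c(0.5, 0.75)
x <- c(0.40, 0.50, 0.60, 0.75, 0.90)

# censor() replaces out-of-bounds values with NA (these end up gray)
censored <- censor(x, range = c(0.5, 0.75))
print(censored)   # NA 0.50 0.60 0.75 NA

# squish() clamps out-of-bounds values to the nearest limit instead
squished <- squish(x, range = c(0.5, 0.75))
print(squished)   # 0.50 0.50 0.60 0.75 0.75
```

Passing oob = squish to the fill scale applies the same clamping to the values before they are mapped to colors.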

by Aaron Charlton - October 5, 2016

Looking at interactions of continuous variables

So we had an unexpected interaction effect between two continuous predictors in our dataset. I thought it was some kind of ceiling effect so I did some plots to see what was causing it. I still think it’s a ceiling effect, but I also found it interesting that the effect of predictor 1 is only really present at average levels of predictor 2. Anyway, I’ve attached my code. If you have an interaction that you can’t quite understand, these plots may help: http://rpubs.com/abcharlton/interaction-plots.
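For readers who want something to start from before opening the link, here is a rough sketch of one common way to look at an interaction between two continuous predictors: plot the model's predictions across predictor 1 at a few fixed levels of predictor 2. This is not the code from the RPubs page; the simulated data, the variable names (x1, x2, y) and the choice of mean ± 1 SD levels are all assumptions made for illustration.

```r
library(ggplot2)

# Simulated data with a built-in x1-by-x2 interaction (illustrative only)
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 0.5 * d$x1 + 0.3 * d$x2 + 0.4 * d$x1 * d$x2 + rnorm(n)

fit <- lm(y ~ x1 * x2, data = d)

# Predict y across the range of x1 at three levels of x2: mean - 1 SD, mean, mean + 1 SD
x2_levels <- mean(d$x2) + c(-1, 0, 1) * sd(d$x2)
grid <- expand.grid(
  x1 = seq(min(d$x1), max(d$x1), length.out = 50),
  x2 = x2_levels
)
grid$pred <- predict(fit, newdata = grid)
grid$x2_level <- factor(grid$x2, labels = c("-1 SD", "mean", "+1 SD"))

# One predicted line per level of x2; non-parallel lines indicate an interaction
p <- ggplot(grid, aes(x = x1, y = pred, colour = x2_level)) +
  geom_line()
print(p)
```

Because the model is linear, this only shows how the slope of x1 changes with x2. Spotting something like a ceiling effect usually also requires looking at the raw data, for example with scatterplots of y against x1 within bins of x2.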

by John Flournoy - October 5, 2016

Hitchhiker’s Guide to Data Science

So, you want to be a ~~quantitatively sophisticated social science researcher~~ Data Scientist? You might want to check out this amazing roundup of resources called the Hitchhiker’s Guide to Data Science, Machine Learning, R, Python. Enjoy!

by rosemhartman - October 4, 2016

Conditional Density Plots in ggplot2

My distance contribution to ggplot2 day 🙂 http://rpubs.com/rosemm/cond_density_plots

The example is with some real data of mine, which I can’t share with you all just yet, sadly. You can apply it pretty quickly to other data, though, for example:

ggplot(diamonds, aes(carat, ..count.., fill = cut)) + geom_density(position = "fill")

This post (http://stackoverflow.com/questions/14570293/special-variables-in-ggplot-count-density-etc) has some information about the weird ..count.. thing going on in the aes() mapping, for those who are curious. I also encourage you to play around with the adjust argument in geom_density() to see how that changes your plot. Have fun!
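As a starting point for playing with adjust, here is a small sketch that builds the diamonds plot at two bandwidths. It assumes ggplot2 3.3.0 or later, where after_stat(count) is the current spelling of ..count..; the adjust values 0.5 and 2 are arbitrary picks for illustration.

```r
library(ggplot2)

# adjust multiplies the default kernel bandwidth:
# values below 1 give a wigglier plot, values above 1 a smoother one
p_wiggly <- ggplot(diamonds, aes(carat, after_stat(count), fill = cut)) +
  geom_density(position = "fill", adjust = 0.5)

p_smooth <- ggplot(diamonds, aes(carat, after_stat(count), fill = cut)) +
  geom_density(position = "fill", adjust = 2)

print(p_wiggly)
print(p_smooth)
```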

by Aaron Charlton - October 3, 2016 (updated October 4, 2016)

Fall 2016 Tentative Schedule

R Club meets every Thursday from 2-3:30 PM in 006 Straub.

Week	Topic	R Packages
Week 2	Data visualization	ggplot2
Week 3	Data wrangling	dplyr, tidyr
Week 4	Reproducible, dynamic reports	rmarkdown
Week 5	Programming, simulation and elegant coding
Week 6	Simulation
Week 7	Regression, principal components analysis (PCA)
Week 8	Structural equation modeling (SEM)	lavaan
Week 9	Thanksgiving
Week 10	Neuroimaging in R or meta-analysis	metafor?
