Dealing with “missing”/out-of-bounds values in heatmaps

I was tinkering around in R to see if I could plot better-looking heatmaps when I ran into an issue with how values outside a user-specified range are represented in plots. When I’m plotting heatmaps, I usually want to restrict the range of the data being plotted. In this case, I was plotting classifier accuracy across frequency and time for Fourier-transformed EEG data, and I wanted the lowest value of the heatmap to be chance level (in this case, 50%).

I used the following code to generate a simple heatmap:

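library(R.matlab)  # readMat()
library(reshape2)  # melt()
library(ggplot2)
library(viridis)   # scale_fill_viridis()

decoding <- readMat("/Users/Pablo/Desktop/testDecoding.mat")$plotData
plotDecoding <- melt(decoding)

quartz(width = 5, height = 4)  # macOS-only graphics device
ggplot(plotDecoding, aes(x = Var2, y = Var1)) +
  geom_tile(aes(fill = value)) +
  scale_fill_viridis(name = "", limits = c(0.5, 0.75))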

And this is what I got:

heatmap1

See the gray areas? Any data points that fall outside my specified range (see the limits argument in scale_fill_viridis) are being converted to NA, and ggplot2 draws NA values in gray by default. Obviously these values are not missing in the actual data, so how can we make sure they are plotted? The solution comes from the scales package, which is installed alongside ggplot2. Whenever you create a plot with specified limits, include the argument oob = squish (oob = out of bounds) in the same scale call where you set the limits. Make sure the scales package is loaded, or write scales::squish instead. Ex:

scale_fill_viridis(name = "", limits = c(0.5, 0.75), oob = squish)

With squish, values below the specified range are drawn in the color of the minimum, and values above the range are drawn in the color of the maximum (which seems to be what Matlab defaults to). The resulting heatmap looks like this:

heatmap2
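
If you want to see what is happening to the numbers themselves, here is a minimal sketch. The accuracy values and the simulated time-by-frequency grid are made up for illustration (they are not my EEG data), and I use ggplot2's built-in scale_fill_viridis_c() in place of the viridis package's scale_fill_viridis() so the example only needs ggplot2 and scales.

```r
library(ggplot2)
library(scales)

# Made-up accuracies, two of them outside the 0.5-0.75 limits
acc <- c(0.42, 0.50, 0.61, 0.75, 0.81)

# censor() is the default out-of-bounds handler: outside values become NA
censored <- censor(acc, range = c(0.5, 0.75))
# squish() clamps outside values to the nearest limit instead
squished <- squish(acc, range = c(0.5, 0.75))

# Simulated heatmap data (hypothetical stand-in for the decoding matrix)
set.seed(42)
sim <- expand.grid(time = 1:30, freq = 1:20)
sim$value <- runif(nrow(sim), min = 0.4, max = 0.8)

# Same idea as above: limits plus oob = squish, so no gray tiles
p <- ggplot(sim, aes(x = time, y = freq)) +
  geom_tile(aes(fill = value)) +
  scale_fill_viridis_c(name = "", limits = c(0.5, 0.75), oob = squish)
```

Printing censored shows NA in the first and last positions, while squished shows 0.50 and 0.75 there. Call print(p) to draw the heatmap, and drop the oob argument to bring the gray tiles back.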

A simple solution to an annoying little quirk in R!

Looking at interactions of continuous variables

So we had an unexpected interaction effect between two continuous predictors in our dataset. I thought it was some kind of ceiling effect, so I made some plots to see what was causing it. I still think it’s a ceiling effect, but I also found it interesting that the effect of predictor 1 is only really present at average levels of predictor 2. Anyway, I’ve attached my code. If you have an interaction that you can’t quite understand, these plots may help: http://rpubs.com/abcharlton/interaction-plots.

download-1

download
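
For anyone who wants a starting point without opening the link, here is a rough sketch of one common way to look at a continuous-by-continuous interaction: plot the model's predicted outcome across predictor 1 at low, average, and high levels of predictor 2. This is not the code from the RPubs page. The data are simulated, and the names x1, x2, y and the coefficients are all made up for illustration.

```r
library(ggplot2)

# Simulated data with a built-in x1-by-x2 interaction (hypothetical values)
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 0.3 * dat$x1 + 0.2 * dat$x2 + 0.4 * dat$x1 * dat$x2 + rnorm(n)

# Linear model with both main effects and the interaction
fit <- lm(y ~ x1 * x2, data = dat)

# Evaluate the model over the range of x1 at three levels of x2
x2_levels <- mean(dat$x2) + c(-1, 0, 1) * sd(dat$x2)
grid <- expand.grid(
  x1 = seq(min(dat$x1), max(dat$x1), length.out = 50),
  x2 = x2_levels
)
grid$pred <- predict(fit, newdata = grid)
grid$x2_level <- factor(grid$x2, levels = x2_levels,
                        labels = c("-1 SD", "mean", "+1 SD"))

# One predicted line per level of x2
p <- ggplot(grid, aes(x = x1, y = pred, colour = x2_level)) +
  geom_line() +
  labs(y = "predicted y", colour = "x2")
```

Call print(p) to draw it. If the lines fan out or cross, the slope of x1 depends on x2, which is the interaction you are trying to understand.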

Conditional Density Plots in ggplot2

My long-distance contribution to ggplot2 day 🙂 http://rpubs.com/rosemm/cond_density_plots

The example is with some real data of mine, which I can’t share with you all just yet, sadly. You can apply it pretty quickly to other data, though, for example:

library(ggplot2)  # also provides the diamonds dataset

ggplot(diamonds, aes(carat, ..count.., fill = cut)) +
  geom_density(position = "fill")

This post (http://stackoverflow.com/questions/14570293/special-variables-in-ggplot-count-density-etc) has some information about the weird ..count.. thing going on in the aes() mapping, for those who are curious. (In ggplot2 3.4.0 and later, the ..count.. notation is deprecated in favor of after_stat(count).) I also encourage you to play around with the adjust argument in geom_density() to see how that changes your plot. Have fun!
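
Here is a small sketch of that last suggestion, using the same diamonds example. I’ve written it with after_stat(count), which needs ggplot2 3.3.0 or later. The adjust value of 3 is an arbitrary pick for illustration.

```r
library(ggplot2)

# Default bandwidth (adjust = 1)
p_default <- ggplot(diamonds, aes(carat, after_stat(count), fill = cut)) +
  geom_density(position = "fill")

# adjust multiplies the default bandwidth, so 3 gives a much smoother plot
p_smooth <- ggplot(diamonds, aes(carat, after_stat(count), fill = cut)) +
  geom_density(position = "fill", adjust = 3)
```

Print p_default and p_smooth one after the other to compare. Values of adjust below 1 go the other way and make the plot bumpier.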

Fall 2016 Tentative Schedule

R Club meets every Thursday from 2-3:30 PM in 006 Straub.

Week | Topic | R Packages
Week 2 | Data visualization | ggplot2
Week 3 | Data wrangling | dplyr, tidyr
Week 4 | Reproducible, dynamic reports | rmarkdown
Week 5 | Programming, simulation and elegant coding |
Week 6 | Simulation |
Week 7 | Regression, principal components analysis (PCA) |
Week 8 | Structural equation modeling (SEM) | lavaan
Week 9 | Thanksgiving |
Week 10 | Neuroimaging in R or meta-analysis | metafor?