Dealing with missing data via multiple imputation

Here’s a little teaser for one of tomorrow’s nuggets.

I’ll be talking about using multiple imputation as a remedy for missing data, using the Amelia package. To whet your appetite, check out this pithy post about how R handles missing values in general:

Multiple Imputation!

Here’s a link to a webinar on missing data (you need to register with your email address to get access to the videos):

Here’s a link to an Rpubs handout:

And here’s all the relevant code:


library(Amelia) # load the package first
data() # list the available datasets; Amelia comes with some
data(africa) # let's pull in the africa dataset
summary(lm(africa$civlib ~ africa$trade)) # listwise deletion

m <- 5 # the number of datasets to create (5 is typical)
# note that we're using all the variables, even though we won't use them all in the regression
a.out <- amelia(x = africa, m = m, cs = "country", ts = "year", logs = "gdp_pc")
summary(a.out)
plot(a.out)
par(mfrow = c(1, 1))
missmap(a.out)

# run our regression on each dataset
b.out <- NULL
se.out <- NULL
for (i in 1:m) {
  ols.out <- lm(civlib ~ trade, data = a.out$imputations[[i]])
  b.out <- rbind(b.out, ols.out$coef)
  se.out <- rbind(se.out, coef(summary(ols.out))[, 2])
}

# combine the results from all of the different regressions
combined.results <- mi.meld(q = b.out, se = se.out)

?AmeliaView # Sounds fun, but it didn't work for me. Meh.
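If you're curious what mi.meld is doing with those stacked coefficients and standard errors, it pools them with what are usually called Rubin's rules: average the estimates, then add the average squared standard error to an inflated between-dataset variance. Here's a rough base-R sketch of that calculation. The numbers are made up purely for illustration (they are not output from the africa data), and the variable names are my own stand-ins for b.out and se.out.

```r
# Made-up estimates from m = 3 imputed datasets (rows) for two coefficients (columns).
# These stand in for b.out and se.out; they are NOT real results.
b <- rbind(c(0.50, 0.10),
           c(0.54, 0.12),
           c(0.52, 0.11))
se <- rbind(c(0.20, 0.05),
            c(0.22, 0.04),
            c(0.21, 0.06))
colnames(b) <- colnames(se) <- c("(Intercept)", "trade")

m <- nrow(b)

q.bar   <- colMeans(b)         # pooled point estimate: the average across datasets
within  <- colMeans(se^2)      # average within-dataset variance
between <- apply(b, 2, var)    # variance of the estimates across datasets (divides by m - 1)
total   <- within + (1 + 1 / m) * between  # total variance, with the 1/m correction

pooled.se <- sqrt(total)       # pooled standard error

print(rbind(estimate = q.bar, se = pooled.se))
```

The pooled standard error can never be smaller than the one you'd get from simply averaging the within-dataset variances; the extra term is the price you pay for not knowing the missing values.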

