Some Data Manipulation in R with SPSS Variable Names and Labels

Following our conversation today, here are a few steps for using SPSS variable “labels” in place of variable names. We were working with a dataset whose variable names were less informative than its variable labels, but the original names still needed to be referenced from time to time.


setwd("C:/Users/jleverni/Desktop")

# Import the SPSS file:
library('foreign')
data.wave2 <- read.spss("PDR Wave 2.sav", to.data.frame = TRUE, use.value.labels = TRUE)

# Note that the variable labels from SPSS have been imported as an "attribute" called "variable.labels":
str(data.wave2)

# We're about to overwrite the existing variable names with those variable labels.
# Thus, we'll save a backup copy of the existing variable names as an attribute first:
attr(data.wave2, "variable.originalName") <- colnames(data.wave2)

# Check our work:
str(data.wave2)

# Take all columns that have an attribute label (i.e., where the attribute label is not ""),
# and replace that column's name with that attribute label:
colnames(data.wave2)[attr(data.wave2, "variable.labels") != ""] <- attr(data.wave2, "variable.labels")[attr(data.wave2, "variable.labels") != ""]

# Check our work:
str(data.wave2)

# Make it easy to look up the original variable name, if we want to at any point:
lookupName <- function(columnName, dataFrame = data.wave2, attribute = "variable.originalName") {
  print(attr(dataFrame, attribute)[which(colnames(dataFrame) == columnName)])
}

# Look up the original variable name of the column that is now titled "Lying", for example:
lookupName("Lying")
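
If you want to try the steps above without the SPSS file at hand, here is a minimal sketch on a toy data frame. The data frame `toy`, its labels, and the helper `originalName` are made up for illustration; the only assumption is that the imported data carries a `variable.labels` attribute as described above. The `make.unique()` call is an extra precaution that is not part of the original steps: it keeps column names distinct if two variables happen to share a label.

```r
# Toy stand-in for an imported SPSS data frame (hypothetical data):
# a data frame carrying a "variable.labels" attribute.
toy <- data.frame(Q1 = c(1, 2), Q2 = c(0, 1), ID = c(101, 102))
attr(toy, "variable.labels") <- c(Q1 = "Lying", Q2 = "Stealing", ID = "")

# Back up the original names before overwriting them.
attr(toy, "variable.originalName") <- colnames(toy)

# Replace names only where a non-empty label exists.
labels <- attr(toy, "variable.labels")
hasLabel <- labels != ""
colnames(toy)[hasLabel] <- labels[hasLabel]

# Extra precaution (an addition, not in the steps above):
# keep names distinct if two variables share the same label.
colnames(toy) <- make.unique(colnames(toy))

# Hypothetical helper: return the original name instead of printing it,
# so the result can be reused in other code.
originalName <- function(columnName, dataFrame) {
  attr(dataFrame, "variable.originalName")[colnames(dataFrame) == columnName]
}

originalName("Lying", toy)  # returns "Q1"
```

Returning the value rather than printing it means the result can be stored or passed along, for example when matching columns back to a codebook.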
