Some Data Manipulation in R with SPSS Variable Names and Labels
Following our conversation today, here are a few steps for using SPSS variable “labels” in place of variable names in R. We were working with a dataset whose variable names were less informative than its variable labels, but the original names still needed to be referenced from time to time.
setwd("C:/Users/jleverni/Desktop")
# Import the SPSS file:
library('foreign')
data.wave2 <- read.spss("PDR Wave 2.sav", to.data.frame = TRUE, use.value.labels = TRUE)
# Note that the variable labels from SPSS have been imported as an "attribute" called "variable.labels":
str(data.wave2)
# We're about to overwrite the existing variable names with those variable labels. Thus, we'll save a backup copy of the existing variable names as an attribute first:
attr(data.wave2, "variable.originalName") <- colnames(data.wave2)
# Check our work:
str(data.wave2)
# Take all columns that have an attribute label (i.e., where the attribute label is not ""), and replace each such column's name with its attribute label:
colnames(data.wave2)[attr(data.wave2, "variable.labels") != ""] <- attr(data.wave2, "variable.labels")[attr(data.wave2, "variable.labels") != ""]
# Check our work:
str(data.wave2)
# Make it easy to look up the original variable name, if we want to at any point:
lookupName <- function(columnName, dataFrame = data.wave2, attribute = "variable.originalName") {
  print(attr(dataFrame, attribute)[which(colnames(dataFrame) == columnName)])
}
# Look up the original variable name of the column that is now titled "Lying", for example:
lookupName("Lying")
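If you want to try the same steps without the SPSS file at hand, here is a minimal sketch on a small made-up data frame. The data frame `toy`, its columns `q1` to `q3`, and its labels are hypothetical stand-ins; the "variable.labels" attribute is set by hand to mimic what read.spss() attaches. The sketch also assumes that two variables might share the same label, which the steps above do not guard against, so it passes the new names through make.unique() to keep the column names distinct.

```r
# Toy data frame standing in for the imported SPSS data (hypothetical values).
toy <- data.frame(q1 = c(1, 2), q2 = c(3, 4), q3 = c(5, 6))

# Mimic the "variable.labels" attribute that read.spss() attaches.
# q2 has no label, and q1 and q3 share one, to exercise both edge cases.
attr(toy, "variable.labels") <- c(q1 = "Lying", q2 = "", q3 = "Lying")

# Back up the original names before overwriting them.
attr(toy, "variable.originalName") <- colnames(toy)

# Work out which columns actually have a usable label.
varLabels <- attr(toy, "variable.labels")
hasLabel <- !is.na(varLabels) & varLabels != ""

# Replace names only where a label exists, then make duplicates distinct.
newNames <- colnames(toy)
newNames[hasLabel] <- unname(varLabels[hasLabel])
colnames(toy) <- make.unique(newNames)

# Look up the original name of the second "Lying" column.
originalOfDuplicate <- attr(toy, "variable.originalName")[colnames(toy) == "Lying.1"]

# Restore the original names in a copy, if the labels are no longer wanted.
restored <- toy
colnames(restored) <- attr(restored, "variable.originalName")
```

With the column names made distinct in this way, a lookup such as lookupName("Lying.1", dataFrame = toy) returns a single original name rather than several.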