Some Data Manipulation in R with SPSS Variable Names and Labels

Following our conversation today, here are a few steps for using SPSS variable “labels” in place of variable names. We were working with a dataset whose variable names were less informative than its variable labels, but the original names still needed to be referenced from time to time.


# read.spss() comes from the "foreign" package:
library(foreign)

# Import the SPSS file as a data frame:
data.wave2 <- read.spss("PDR Wave 2.sav", to.data.frame = TRUE, use.value.labels = TRUE)

# Note that the variable labels from SPSS have been imported as an "attribute" called "variable.labels":
str(data.wave2)

# We're about to overwrite the existing variable names with those variable labels.
# Thus, we'll save a backup copy of the existing variable names as an attribute first:
attr(data.wave2, "variable.originalName") <- colnames(data.wave2)

# Check our work:
str(data.wave2)

# Take all columns that have an attribute label (i.e., where the attribute label is not ""),
# and replace that column's name with that attribute label:
colnames(data.wave2)[attr(data.wave2, "variable.labels") != ""] <-
  attr(data.wave2, "variable.labels")[attr(data.wave2, "variable.labels") != ""]

# Check our work:
str(data.wave2)

# Make it easy to look up the original variable name, if we want to at any point:
lookupName <- function(columnName, dataFrame = data.wave2, attribute = "variable.originalName") {
  print(attr(dataFrame, attribute)[which(colnames(dataFrame) == columnName)])
}

# Look up the original variable name of the column that is now titled "Lying", for example:
lookupName("Lying")
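If you don't have the SPSS file handy, the same steps can be tried on a small made-up data frame. This is only a sketch: the data frame `toy`, its column names and labels, and the helper `originalName()` are all hypothetical, and the "variable.labels" attribute is set by hand to imitate what the import produces. It also shows one thing to watch for: a label containing spaces is not a syntactic R name, so it needs `[[ ]]` or backticks.

```r
# Toy data frame standing in for an imported SPSS file (hypothetical names and labels).
toy <- data.frame(q1 = c(1, 0, 1), q2 = c(3, 5, 4), id = 1:3)

# Imitate the "variable.labels" attribute by hand: a named character vector,
# with "" for any column that has no label.
attr(toy, "variable.labels") <- c(q1 = "Lying", q2 = "Hours of sleep", id = "")

# Back up the original names before overwriting them.
attr(toy, "variable.originalName") <- colnames(toy)

# Replace names only where a non-empty label exists.
varLabels <- attr(toy, "variable.labels")
hasLabel <- varLabels != ""
colnames(toy)[hasLabel] <- varLabels[hasLabel]

# Hypothetical helper: return (rather than print) the original name
# for a given current column name.
originalName <- function(columnName, dataFrame) {
  attr(dataFrame, "variable.originalName")[colnames(dataFrame) == columnName]
}

originalName("Lying", toy)

# A label with spaces is not a syntactic name, so use [[ ]] (or backticks with $).
toy[["Hours of sleep"]]
```

Returning the value instead of printing it makes the result easier to reuse, for example when building a lookup table of old and new names.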
