Wide to long!

Melissa has a lovely, complex longitudinal dataset, and she needs it reformatted from wide to long. She made a dummy version of the dataset for us to play with, with all of the correct column names but randomly generated data. Side note: if you ever need to, you can do this with your own data, too, with a couple of quick commands:

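# Fill a matrix the same shape as my.data with random normal values
# (every column is replaced, including any ID columns)
dummy.data <- matrix(rnorm(nrow(my.data) * ncol(my.data)), ncol = ncol(my.data))
dummy.data <- as.data.frame(dummy.data)
# Carry over the real column names
colnames(dummy.data) <- colnames(my.data)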

Here is her fake dataset. The data are currently in wide format, with everything for each participant in one row. Melissa would like four rows for each participant, one for each of the four time points, with all of the measures as columns. Be careful, though: several of the measures were not assessed at every time point! So, for example, in wide format there might be four columns for Measure A (MeasureA_T1, MeasureA_T2, MeasureA_T3, MeasureA_T4) but only two columns for Measure B (MeasureB_T1, MeasureB_T4). Like this:

Before…
Subj A_T1 A_T2 A_T3 A_T4 B_T1 B_T4
1 -0.9718834 0.6276666 -0.5228272 -0.6400311 -0.5308298 1.6060458
2 -1.9756228 0.5156263 -0.4056288 0.4194956 0.3696320 -1.8104448
3 -0.1338431 -1.5766120 -0.1066071 -0.3906287 0.6693724 -0.2181601
4 -0.8970927 0.1203673 -0.3418302 0.0911664 1.6115600 -0.8444743
5 1.6684360 1.1707871 -0.4292292 -0.5215027 0.5450576 -0.4567965
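Before reshaping, it can help to check which measures are missing at which time points. Here is a minimal sketch of that check on a toy data frame laid out like the table above. The names `wide`, `value.cols`, and `assessed` are made up for illustration, and the code assumes every measure column is named `<measure>_T<number>` with the participant ID in `Subj`; the real dataset may need a different pattern.

```r
set.seed(1)  # arbitrary seed so the toy data are reproducible

# Toy wide data frame mirroring the layout above (values are random)
wide <- data.frame(
  Subj = 1:5,
  A_T1 = rnorm(5), A_T2 = rnorm(5), A_T3 = rnorm(5), A_T4 = rnorm(5),
  B_T1 = rnorm(5), B_T4 = rnorm(5)
)

# Split each measure column name into its measure and its time point
# (assumes names of the form <measure>_T<number>)
value.cols <- setdiff(colnames(wide), "Subj")
measure <- sub("_T[0-9]+$", "", value.cols)
time <- as.integer(sub("^.*_T", "", value.cols))

# Cross-tabulate: 1 where a measure has a column for that time point, 0 where not
assessed <- table(measure, time)
print(assessed)
```

Any zero in the resulting table marks a measure/time combination that will need to be filled with NA in the long format.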

 

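In the end, there should be just one column for each measure (e.g. MeasureA, MeasureB), and four rows for each participant, with NAs entered where the measure was not assessed. (The numbers in the example below are illustrative and do not correspond to the values in the table above.) Like this: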

After!
Subj time A B
1 1 1.0107164 1.0408222
1 2 -0.5727259 NA
1 3 -0.9831144 NA
1 4 -1.1895119 -1.9333893
2 1 -0.8442759 0.9567628
2 2 -0.5484172 NA
2 3 0.0375418 NA
2 4 -1.4851271 -1.1012295
3 1 -0.0313766 -1.5844977
3 2 -0.8810030 NA
3 3 -0.4638228 NA
3 4 0.3639906 -1.8326602
4 1 0.6995141 0.4447129
4 2 0.1064798 NA
4 3 0.1506644 NA
4 4 -1.1496762 -0.5327433
5 1 -0.8448360 0.3588507
5 2 2.1525641 NA
5 3 -1.9898227 NA
5 4 -0.7260369 1.7552476

 

Except there are like a hundred variables (mwah-ha-ha-ha!). Have fun!
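If you want a starting point, here is one possible base R sketch, not the only way to do it and not necessarily the way we end up solving it. It assumes the participant ID is in a column called `Subj` and that every other column is named `<measure>_T<number>`; the names `wide`, `padded`, and `long` are made up for illustration. The idea is to add an all-NA column for each measure/time combination that was never assessed, then stack one block of rows per time point.

```r
set.seed(1)  # arbitrary seed so the toy data are reproducible

# Toy wide data: A measured at all four time points, B only at T1 and T4
wide <- data.frame(
  Subj = 1:5,
  A_T1 = rnorm(5), A_T2 = rnorm(5), A_T3 = rnorm(5), A_T4 = rnorm(5),
  B_T1 = rnorm(5), B_T4 = rnorm(5)
)

# Work out the measures and time points from the column names
# (assumes names of the form <measure>_T<number>)
value.cols <- setdiff(colnames(wide), "Subj")
measures <- unique(sub("_T[0-9]+$", "", value.cols))
times <- sort(unique(as.integer(sub("^.*_T", "", value.cols))))

# Pad the data with an all-NA column for every missing measure/time pair
padded <- wide
for (m in measures) {
  for (t in times) {
    col <- paste0(m, "_T", t)
    if (!col %in% colnames(padded)) padded[[col]] <- NA_real_
  }
}

# Build one block of rows per time point, then stack the blocks
blocks <- lapply(times, function(t) {
  block <- padded[, paste0(measures, "_T", t), drop = FALSE]
  colnames(block) <- measures
  data.frame(Subj = padded$Subj, time = t, block)
})
long <- do.call(rbind, blocks)

# Sort so each participant's four rows sit together
long <- long[order(long$Subj, long$time), ]
rownames(long) <- NULL

print(head(long, 8))
```

Because the measures and time points are read from the column names rather than typed out, the same code should scale to a hundred variables as long as the naming pattern holds. Packages such as reshape2 or tidyr offer more compact ways to do this, if you prefer them.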
