Fun with RStudio and the Lubridate Library
The statistical package R and its companion interface, RStudio, combine to form an extremely powerful team for statistical analysis. And although the program is command-line driven, it is much more intuitive to use than Microsoft’s Excel. Recently, a colleague of mine was going nuts trying to get Excel to graph a data set. I was approached for help, suggested using R, and felt it would do the job quite quickly. Well, that was a bit of an overstatement, as we first had to reorganize 34,000 data points into a new date and time format. Excel kept changing the dates to a different format than the one we needed (again, thanks Excel!), and once we overcame that problem, we could save the data as a comma-separated values (.csv) file for use in R.

The ggplot2 package is one of the main graphing tools for R, and in many cases it might be all that is needed for a graph. However, in this case we had data organized by year, month, day, and time of day. Plotting data with this long date/time string without losing the sequence of the data required a more recent R package called lubridate, which parses such strings into proper date-time values. Once that was added to our analysis sequence, we could just enter the command in R to read the .csv file and let the program plot what it could. R then plots everything it is able to, based on the columns of data in the file. In this case there were six or more options, and the one my colleague needed was among them. We then focused on writing a short script to format the graph, adding the information the program needs to create a title, named axes, and a key.

So, although it took a bit more than a few minutes, we did have the needed data graphed in time for a presentation later that day. The bottom line of this conversation is that if you are working with data and you're not using R, at least for overarching analysis, you are really missing out on a program that will save you time and keep you sane.
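
For anyone who wants to try the same workflow, here is a minimal sketch in R. It is not the script we actually wrote: the tiny inline data set, the column names (`year`, `month`, `day`, `time`, `site`, `value`), and the plot labels are all made up for illustration, so swap in your own file and names. It only assumes the lubridate and ggplot2 packages are installed.

```r
library(lubridate)
library(ggplot2)

# Hypothetical stand-in for the real .csv file. The column names and values
# are assumptions for illustration; with a real file you would call
# read.csv("your_file.csv") instead.
csv_text <- "year,month,day,time,site,value
2016,2,3,17:15,B,12.9
2016,1,15,08:30,A,12.1
2016,2,3,09:00,A,11.8
2016,1,15,14:45,B,13.4"

readings <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Build one zero-padded date/time string per row, e.g. "2016-01-15 08:30",
# then let lubridate parse it into a real date-time (POSIXct) value.
stamp_text <- sprintf("%04d-%02d-%02d %s",
                      readings$year, readings$month, readings$day,
                      readings$time)
readings$timestamp <- ymd_hm(stamp_text, tz = "UTC")

# Sort by the parsed timestamp so the sequence of the data is preserved.
readings <- readings[order(readings$timestamp), ]

# Format the graph: title, named axes, and a key (legend) for the colour.
p <- ggplot(readings, aes(x = timestamp, y = value, colour = site)) +
  geom_line() +
  geom_point() +
  labs(title = "Readings over time (illustrative data)",
       x = "Date and time",
       y = "Value",
       colour = "Site")

print(p)
```

Because `timestamp` is a true date-time rather than a text string, ggplot2 places the points on a continuous time axis in chronological order. If you just want the quick "plot what it can" overview first, calling base R's `plot()` on the whole data frame should give a grid of scatterplots, one for each pair of columns.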