PyClub?
I’m reading more and more about analysts switching from R to Python. From what I can tell, Python is faster, especially with big datasets, but R has more stats-specific functionality available (although that’s changing rapidly as more packages are added to Python). Here’s some discussion of Python as a stats tool, including a list of stats-friendly packages in Python. There have been several more packages added since that post, however, including a Python tutorial on statistical analysis, and a version of ggplot2 for Python (blog post, code). Probably more; I have no idea, but I’m intrigued.
Does anyone have experience with Python? If you start PyClub, I’ll totally attend (and I’ll bring pie, at least once).
Codecademy has a set of tutorials for learning different programming languages. They say you can become proficient in a language within a year. The link to the Python tutorials is below:
http://www.codecademy.com/tracks/python
I also like using the GUI PsychoPy. It is very easy to use! There is a component of it where you can write code directly, which is helpful because the GUI is not that flexible on its own. Below are the links for the program and a great introduction by the creator:
http://www.psychopy.org/
http://www.youtube.com/watch?v=VV6qhuQgsiI
Enjoy!
A few more Python resources:
http://vimeo.com/pydata/videos (these are videos from the PyData conference, which is a meeting about using Python for data analysis)
https://developers.google.com/edu/python/
https://www.coursera.org/course/interactivepython
Ooo also this: http://statsmodels.sourceforge.net/devel/
Using R-style formulas on pandas data frames to run all the stats!
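For anyone curious what that looks like, here is a minimal sketch of fitting a linear model with an R-style formula on a pandas data frame. The data are simulated, and the column names `x` and `y` are made up for illustration; this is just one way to do it with the statsmodels formula interface.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (an assumption for illustration): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=200)
df = pd.DataFrame({"x": x, "y": y})

# Same formula syntax you would pass to lm() in R.
model = smf.ols("y ~ x", data=df).fit()

# Estimated coefficients, indexed by "Intercept" and "x".
print(model.params)
```

Calling `model.summary()` prints a regression table much like `summary(lm(...))` in R.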
I just saw this post and thought I’d throw in. I’ve switched over to Python for the current project I’m working on (very large time-series data sets), and I’ve found it to be substantially easier to learn than R, thanks in large part to the amazing tutorials online, and the fact that every question I have about language functionality (even alarmingly specific ones) has been asked several times on Stack Overflow and other forums.
A nice thing about the pandas module (which essentially adds R-like data frame objects) is that it makes it simple to switch between Python and R, just by writing and reading your data frame as a CSV. So you can do data manipulation and basic analysis in Python, then do more sophisticated stats in R. Actually, the available statistics in Python are growing by the week. Check out scikit-learn for things like PCA and model selection, or PyMC for Bayesian analysis. I’m all in favor of someone (else) starting PyClub, and would be happy to give a pandas walkthrough.
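As a rough sketch of the CSV hand-off described above: the snippet below writes a small data frame to a CSV file and reads it back. The column names and the file name `trials.csv` are hypothetical, and a temporary directory is used only so the example cleans up after itself.

```python
import os
import tempfile

import pandas as pd

# Made-up example data (column names are assumptions for illustration).
df = pd.DataFrame({"subject": [1, 2, 3], "rt_ms": [512.5, 430.0, 601.25]})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "trials.csv")  # hypothetical file name

    # index=False leaves out the pandas row index, so R does not see an extra column.
    df.to_csv(path, index=False)

    # Reading the file back in Python; in R the same file can be loaded with read.csv().
    restored = pd.read_csv(path)

print(restored)
```

On the R side, `read.csv("trials.csv")` should give the same two columns, and `write.csv(..., row.names = FALSE)` sends results back the other way.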