Playing with priors code!

I wrote up some code (not all of it has been tested yet, though!) for possible implementations of some prior manipulations on the simple regression model we looked at this afternoon. Here is a link to the file: https://www.dropbox.com/s/3r5hlx78zrptqy3/playing%20with%20priors.r

It’s just the JAGS model portion for each manipulation. If you have any code of your own, please feel free to add it to this post as a comment!

Next Bayes Club: Playing with Priors

During tomorrow’s Bayes Club, we want to 1) get some more practice reading and interpreting JAGS models, and 2) get a feel for how your priors influence your results.

So, having already gone through the model for estimating the difference between two groups (i.e. the “BEST” model), we’ll check out the code for exploring the relationship between two continuous variables (conceptually similar to a simple regression). For the sake of consistency in style, etc., we’re using another of Kruschke’s models for this. Download it here: http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/Programs/ProgramsDoingBayesianDataAnalysis.zip (The model we’ll be looking at is called “SimpleLinearRegressionJags-Original.r”)

The mechanics for running this model are almost exactly the same as what we saw in the BEST example, it’s mostly just the JAGS model itself that’s different. We’ll walk through it, and link it up to a graphical representation of the model, like we did for BEST.
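For reference, the heart of that file is a JAGS model block along these lines (paraphrased from memory, so expect small differences from Kruschke’s actual file; the names Ndata, beta0, beta1, and tau follow his conventions):

```
model {
  for (i in 1:Ndata) {
    y[i] ~ dnorm(mu[i], tau)          # likelihood: normally distributed residuals
    mu[i] <- beta0 + beta1 * x[i]     # linear predictor
  }
  beta0 ~ dnorm(0, 1.0E-12)           # very vague prior on the intercept
  beta1 ~ dnorm(0, 1.0E-12)           # very vague prior on the slope
  tau ~ dgamma(0.001, 0.001)          # vague prior on precision (1/variance)
}
```

Note that JAGS’s dnorm is parameterized by a precision (1/SD^2), not a standard deviation, which is why those second arguments look so tiny.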

Then, we’ll ask you to play around changing the priors on this model (and/or the BEST model, if you like). We’re hoping that you’ll come away with 1) a better understanding of what distributions are available in JAGS and what they look like, and 2) a better sense of how changes to your priors affect your posterior. To encourage maximum self-directed hands-on-ness, we’d like you all to play around with this yourselves on your own machine for the second half of the meeting. Here are some ideas to get you started:

  • To begin with, the residuals are assumed to follow a normal distribution. Change it to a t distribution, and, for extra points, infer the degree of normality (i.e. the nu parameter, which acts as the df in the t distribution and determines how heavy the tails are).
  • Strongly assume that the null hypothesis is true (i.e. that these two variables do NOT have a relationship, so the slope should be 0). Try this one several ways.
  • Standardize your variables, and then set a uniform prior on the slope, such that it ranges from -1 to 1, with all possible values equally likely. 
  • Assume that the slope can only be positive (like a one-directional test).
  • Tweak the values for the parameters on the priors Kruschke provides (for example, play around with the parameters for the gamma distribution that defines the prior for precision).
  • Choose your own adventure!
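As one (untested!) sketch of the first and fourth bullets, here’s how the model block might change. The dexp(1/29) prior on nu is borrowed from Kruschke’s BEST model, and truncating the slope’s prior at zero is just one way to force a positive slope; treat all of it as a starting point, not the answer:

```
model {
  for (i in 1:Ndata) {
    y[i] ~ dt(mu[i], tau, nu)         # t-distributed residuals instead of normal
    mu[i] <- beta0 + beta1 * x[i]
  }
  beta0 ~ dnorm(0, 1.0E-12)
  beta1 ~ dnorm(0, 1.0E-6) T(0,)      # truncated at zero: slope can only be positive
  tau ~ dgamma(0.001, 0.001)
  nu <- nuMinusOne + 1                # keep nu >= 1
  nuMinusOne ~ dexp(1/29)             # prior mean of 30 for nu, as in BEST
}
```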

Each time you change a prior, try plotting it first so you can see what it looks like. Typically, the easiest way to do this is plot(function(x)dfunc(x, parameters)) where “dfunc” is the name of the distribution and “parameters” is whatever parameters you need to set for that distribution. For example, you can plot a normal curve with mean=0 and SD=1 with plot(function(x)dnorm(x, mean=0, sd=1)).
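One wrinkle worth knowing: by default, plot(function(x) ...) only draws the curve from x = 0 to x = 1, so it usually helps to pass from and to explicitly. (Also remember that R’s dnorm takes a standard deviation, while JAGS’s dnorm takes a precision.) A few examples:

```
plot(function(x) dnorm(x, mean = 0, sd = 1), from = -4, to = 4)   # standard normal
plot(function(x) dt(x, df = 3), from = -4, to = 4)                # t, heavier tails
plot(function(x) dgamma(x, shape = 0.001, rate = 0.001), from = 0, to = 5)
```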

Spoiler alert: Most changes you make to your priors won’t have that much of an effect on the posterior if you have enough data. When you have a smaller N and/or messier data (weaker relationships), the priors have more sway. Try changing your sample size as well as changing the priors, and see where the balance is. 
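You can see that balance in miniature with a toy example (this is a conjugate-normal calculation, not the regression model itself): with a prior mean of 5 and data whose sample mean is 0, the posterior mean is a precision-weighted average of the two, so the prior’s pull fades as n grows.

```
# Toy sketch: posterior mean for a normal mean with known sd,
# as a precision-weighted average of prior mean and sample mean
update_mean <- function(n, xbar, prior_mean = 5, prior_sd = 1, data_sd = 1) {
  prior_prec <- 1 / prior_sd^2
  data_prec  <- n / data_sd^2
  (prior_prec * prior_mean + data_prec * xbar) / (prior_prec + data_prec)
}
sapply(c(1, 10, 100, 1000), update_mean, xbar = 0)
# the posterior mean starts at 2.5 and shrinks toward the sample mean of 0
```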

Next Bayes Club: Getting our hands dirty with some R and JAGS

Our next Bayes Club meeting (this Tuesday) will be a close analysis of Kruschke’s BEST model, which is a model for comparing two groups to see if they’re drawn from populations with different means (i.e. something you would do instead of a t-test). You can download all of his code for this model (and many others) from his website:

http://www.indiana.edu/~kruschke/BEST/BEST.zip

He has written an entire article specifically focusing on this model, including details about the code, analysis, etc. Definitely skim it before Tuesday if you get a chance:

http://www.indiana.edu/~kruschke/articles/Kruschke2013JEPG.pdf

Or, if you’re sick of reading things all day long, watch the video instead: http://youtu.be/fhw1j1Ru2i0

 And if you want to play around with a simulation, check this out: http://sumsar.net/best_online/

We’ll spend almost the entire meeting on Tuesday going through the graphical model in detail and going through all the R and JAGS code line by line. We hope that by the end of the day, you’ll be able to manipulate the model yourself (e.g. change the number of groups to do a one-way ANOVA instead of a t-test, etc.). If you don’t already have R and JAGS ready to go on your computer, please try to get them installed before we meet. And, of course, email me or Jacob with any questions.

Extra tidbit: We won’t go through this on Tuesday, but as you move forward, you may want to have a copy of the JAGS manual, which includes a list of all the distributions JAGS understands, along with the code for calling them (see Chapter 6):

http://people.math.aau.dk/~kkb/Undervisning/Bayes13/sorenh/docs/jags_user_manual.pdf

P.S. Sorry none of these links are clickable. WordPress is refusing to play nicely with others. When it’s sorted out, I’ll go back and update this post with real links.

Stopping rules in NHST and Bayes

(Sorry  – I wrote this post after our last BayesClub meeting, but I forgot to actually post it! Better late than never, I hope.)

Let’s say you’re collecting data and you check on your results every once in a while as the data are coming in. You decide to stop collecting data once you see that the results look “good”, and you report your findings using a standard NHST test.  Then the Stats Police break down your door, handcuff you, and throw you in the back of an armored van that takes you to Stats Prison. A common tale, but true.

But what happens if you’re analyzing your data using Bayesian inference instead of NHST? We know that multiple comparisons aren’t a problem when you’re dealing with a posterior distribution instead of running NHST tests, so maybe it’s also okay to run your analyses multiple times and then stop collecting data whenever you’re satisfied. 

Turns out, no. If your stopping rule is based on your results, your results will be biased, no matter what flavor of stats you’re running.

http://datacolada.org/2014/01/13/13-posterior-hacking/

One thing to note is that Simonsohn is using the Bayes Factor to get a result analogous to a significance test in his Bayesian analyses, which operates differently from the HDI and ROPE method we’ve been reading about in Kruschke’s work. Kruschke actually has a blog post on this as well, in which he discusses stopping rules with NHST, Bayes Factors, and HDIs in more detail:

http://doingbayesiandataanalysis.blogspot.com/2013/11/optional-stopping-in-data-collection-p.html 

Additional installation instructions for Mac users

I have a Mac, so I figured I would post some notes on software installation specific to Mac users, from my own experience, to supplement Jacob’s instructions here. Regardless of which operating system you’re using, I encourage you to run the test models in step 4 to check that everything has installed correctly. If possible, please do this before Tuesday, so we have a chance to troubleshoot with you before class if you run into trouble.

1. Install R

Mac users click here (for PC and Linux, see the links here).

This downloads the installation package file. Click on it in your downloads folder to open the file. This opens the install wizard thingy.

Read the stuff and click “Continue” a bunch.

Check that it worked. Either click on the R logo in your Applications folder to launch it, or open a terminal and type “R”, then hit enter. You should see something like this:

R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin10.8.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

2. Install JAGS

Mac users click here

This downloads the installation package. Click on it in your downloads folder to open, and then click on the package installer that appears in Finder (“JAGS 3.3.0.pkg”). This opens the install wizard thingy.

Read the stuff and click “Continue” a bunch.

Check that it worked. You will NOT see an application called JAGS in your applications folder. JAGS just runs in the background and talks to R. To check that it installed correctly, open the terminal and type “jags”, then hit enter. You should see something like this:

Welcome to JAGS 3.3.0 on Sat Jan 11 01:29:42 2014
JAGS is free software and comes with ABSOLUTELY NO WARRANTY
Loading module: basemod: ok
Loading module: bugs: ok

If you’re not comfortable using the terminal, you can skip this and just check that all three programs are working together at the end (step 4).

3. Install rjags (and R2jags, if you like)

You install rjags (and R2jags) from within R. Open R, and then type

install.packages("rjags", dependencies=TRUE) 
install.packages("R2jags", dependencies=TRUE) 

and hit enter after each. If it asks you to select a CRAN mirror, pick “USA (OR)”. This will run the installation scripts.

Check that it worked. In R, type

library(rjags) 

You should see something like this:

Loading required package: coda
Loading required package: lattice
Linked to JAGS 3.3.0
Loaded modules: basemod,bugs

4. Check that everything is working.

Run a test script for one or two simple models.

Test 1: Download the program files from Kruschke’s book from here. You will need to set the working directory for R to that folder using setwd().

Open the file “BESTexample.R”. Select all of the text, and run it. It may take a little while. At the end, you should see several pretty graphs pop up.
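In other words, something like this (the path below is hypothetical — point setwd() at wherever you actually unzipped Kruschke’s programs):

```
setwd("~/Downloads/ProgramsDoingBayesianDataAnalysis")  # hypothetical location
source("BESTexample.R")  # runs the whole script; graphs appear at the end
```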

Test 2: Another test, using R2jags instead of rjags (note that for this script to work, you have to run install.packages("R2jags", dependencies=TRUE) first, if you haven’t already):

Download the R script (“example jags.r”) and JAGS model file (“Rate 1.txt”) in this folder, taken from the Lee & Wagenmakers text. You will need to edit the R script to set your working directory to the folder where the JAGS script is saved.
Open the R script, select all, and run. If it runs correctly, you should see a histogram of the parameter “theta” pop up.

If you’re getting an error about x11, you may need to install xquartz (http://xquartz.macosforge.org/landing/). Do that, and then try running it again.

If you can’t get at least one of these models to run, please email Rose or Jacob right away. Sometimes it’s tricky to get these programs to work. For this reason, it is a really good idea NOT to wait until 10 minutes before we meet to start installing software.

Welcome to Winter

Hello All,

Bayes Club will meet for the first time on Tuesday, 2014-01-14, and on even weeks after that (alternating with R Club). How exciting!

In our first meeting, before diving into “How Bayes?”, we will be discussing “Why Bayes?” If possible, by Tuesday, please take a look at the following reading(s), which can be found on the Syllabus page:

  • Kruschke, J. K. (2010). Bayesian data analysis. Wiley Interdisciplinary Reviews: Cognitive Science, 1(5), 658-676. doi:10.1002/wcs.72 [pdf]
  • If you have time: A practical and pithy post by Andrew Gelman on the practice of Bayesian stats in psychology, with links to several relevant articles and discussion.

Rose wrote up an excellent introduction to the logistics of the term in an R Club post here. Everything said in that post applies to Bayes Club, so take a look, especially if you’ve signed up for credit for the course.

As mentioned in Rose’s post above, we will unfortunately not be meeting in a computer lab this term. Thus, if possible, please bring a laptop with you to our meetings.

The code examples that we’ll be discussing this term require the use of three pieces of software: R, JAGS (“Just Another Gibbs Sampler”), and rjags, an R package that allows R and JAGS to speak to one another. Please try to install these by Tuesday the 14th, or at least by our following meeting. I’ve written up a post on the installation process here for you; please feel free to comment on it to bring up any questions or difficulties that you experience, so that we can all try to figure out answers together.

Looking forward to seeing you! I have good reason to believe that we’ll have a credibly good time*!

* If you don’t understand that weak joke yet, then all learning objectives for the course will be met if you do by Finals Week.

Getting Set Up with Software

Throughout the term, we will be discussing code samples from two complementary software packages / languages: R and JAGS (“Just Another Gibbs Sampler”). It would be helpful to have both of these up and running on your personal computers before we dive in (i.e., by Week 2, if possible). For the most part, setup of both is straightforward; most of you likely have R set up already. Having said that, I’ve run into enough trouble installing JAGS in the past (albeit in Linux) that I’m writing to point out some resources.

John Kruschke, from whom we will be reading for our week 2 meeting, has the best guide that I’ve found for installing the necessary packages. That guide is here. I recommend following it. In short, you’ll need to install three components:

  1. R (and, if you like, an IDE such as RStudio)
  2. JAGS, from the project’s SourceForge page
    1. For Linux, JAGS is also available from major distros’ repositories.
    2. Three major versions of JAGS have been released. If given the choice, I recommend installing JAGS 3, since it’s the most up-to-date.
  3. rjags, an R package that lets JAGS and R speak to each other. To install this, launch R, then run install.packages('rjags').

Notes on the installation process:

  • Make sure that R and JAGS are both working before attempting to install rjags. To do this, open a terminal, run R (i.e., type R, and press enter), and make sure that R loads (to exit, type quit()). Then try running jags and make sure that it launches, as well (to exit, you can use Ctrl+C).
  • Potential problems that may be unique to Linux (and, within Linux, may even be unique to openSUSE): If you try to install rjags and are getting error messages about files not being found, one of a few things is most likely wrong:
    • You don’t have a C++ compiler installed.
    • If you get an error message about a file called “console.h” not being available, you may need to install the jags-devel package in addition to the jags package.
    • You are not running R with superuser privileges while installing (i.e., launch it with sudo R).
    • You may want to look at the post here, if all else has failed.

Feel free to comment on this post if you have questions or other tips to add. Good luck!

UPDATE: Rose has written up helpful additional Mac-specific instructions (as well as some additional instructions that should work cross-platform) here. Take a look at those, as well.