Psych classes that already use R

What would happen if the psych department only taught stats in R? What would change? Well, here’s a list of methods classes that are already taught in R:

  • SEM – R with the lavaan package
  • MLM – R with lme4 and nlme
  • Grad Stats – both R and SPSS
  • Meta-analysis – R with metafor
  • Research methods – R demos
  • Undergrad research methods (303) – TAs can choose to use R

Final Winter R Club Meeting

Tomorrow will be the last meeting of the term.

Remember that all those enrolled for credit will need to write up a final “What I Learned in R Club” statement.

This will also be a good opportunity for deciding on the direction we take next term. I have some ideas I’d like to propose and discuss. They all revolve around the basic idea that there is a high demand for instruction in some basic tasks that are easy in R and hard (or worse) otherwise.

How can we make R Club the best resource it can be? I think we can start by identifying core skills that people aren’t learning elsewhere and that can be done easily and efficiently in R, and by identifying barriers to using R.

Here’s a list I’ve been compiling to get us started:

  • R Syntax, or “Wtf is the dollar sign for?”
  • Programming basics
  • Cleaning, Joining, Reshaping
    • e.g., dwell time ratings with epochs
  • Overcoming the spreadsheet
    • Qualtrics data checking – how to solve mis-coding
    • Finding mistakes in factors, etc. (using dplyr)
  • Writing reproducible analyses
  • GGlorious Plotting
    • EDA and visual data checking
  • Document Rendering (pdf, html, docx on Windows, OS X, Linux)
  • Tables (correlation matrices with stat tests)
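
To make the first couple of items concrete, here is a minimal base-R sketch of what the dollar sign does and how a mis-coded factor level can be spotted and repaired. The data frame `ratings`, its columns, and its values are all invented for illustration; real Qualtrics exports will look different.

```r
# Hypothetical survey data; the column names and values are invented.
ratings <- data.frame(
  id        = 1:6,
  condition = c("control", "treatment", "Control", "treatment", "control", "treatmnt"),
  score     = c(3, 5, 4, 2, 5, 1),
  stringsAsFactors = TRUE
)

# `$` pulls one column out of a data frame by name.
ratings$score          # same as ratings[["score"]]
mean(ratings$score)

# table() on a factor exposes mis-coded levels such as "Control" or "treatmnt".
level_counts <- table(ratings$condition)
print(level_counts)

# Recode the stray values, then rebuild the factor so only valid levels remain.
cleaned <- as.character(ratings$condition)
cleaned[cleaned == "Control"]  <- "control"
cleaned[cleaned == "treatmnt"] <- "treatment"
ratings$condition <- factor(cleaned)
levels(ratings$condition)
```

The same check scales to real data: run table() on every factor column before analysis and look for levels with suspiciously small counts.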

Fun methods:

  • MLM
  • Topic modeling
  • Social network analysis
  • Big data techniques (e.g., scraping with rvest)
  • CFA

Some good coursework resources:

  • http://stat545-ubc.github.io/
  • http://simplystatistics.org/2016/02/12/not-so-standard-deviations-episode-9-spreadsheet-drama/

No R Club today, but feel free to meet anyway!

Sorry for the last-minute notice, but there will be no official R Club today (3/1/16). However, if folks would still like to meet, the space will be open!

See you next week!


Big Data Repository

For those interested, some researchers at Purdue are creating a repository for publicly available data. It sounds like they’re pretty early in the process, but there might be opportunities for collaboration (or just to access the data on your own).

The full announcement (which I got via the SPSP listserv) is below:

****************

We are a research team consisting of two psychologists and two computer scientists at Purdue University and University of Pennsylvania, working on a “big data research infrastructure” project called HUBALIVE.

Our vision is to help researchers interested in human behavior observations to collect and analyze their data in a way that is 1) bigger, 2) easier, 3) faster, and 4) less expensive. The HUBALIVE project aims to allow researchers in social and organizational sciences to access a large volume of video, image, and text data that are publicly available around the world (e.g., 70,000 live video streams with associated metadata; for more info, visit: www.hubalive.org). To this end, our online data repository and accompanying analytic tools called “HUBALIVE” will be made available free-of-charge for research purposes.

At this juncture, we are interested in gauging the initial level of interest in (free) access to such data in the field of social and organizational sciences. We are also interested in identifying scholars who might be interested in collaborating with us for various research opportunities. If you are one of them, please visit the following link to take a 1-minute survey and register for more information and updates: purdue.qualtrics.com/jfe/form/SV_43lnH4OGa5BQTRz

Please let us know if you have any questions or thoughts to share: hubalive@purdue.edu.

We would very much appreciate you forwarding this to those who may be interested.

Thank you,

Sang Eun Woo
Assistant Professor
Department of Psychological Sciences
Purdue University
www.purdue.edu/hhs/psy/about/directory/faculty/…

Yung-Hsiang Lu
Associate Professor
Electrical and Computer Engineering
Purdue University
engineering.purdue.edu/HELPS

Lyle H. Ungar
Professor
Department of Computer and Information Science
University of Pennsylvania
www.cis.upenn.edu/~ungar

Louis Tay
Assistant Professor
Department of Psychological Sciences
Purdue University
www.purdue.edu/hhs/psy/about/directory/faculty/…

Notebook: Playing with the Yelp Challenge data

This is a working lab notebook from this ongoing project, so it may be messy at times.

What’s our goal here? Some of us might one day want to work in industry, where words like JSON, MySQL (or NoSQL), and Machine Learning are commonplace. In order to add to our overflowing skillsets (nunchuck skills, bow hunting skills, computer hacking skills…) we have set ourselves a quest: analyze the Yelp Challenge data.

This is how we’ll progress:

  1. Determine how best to access the data from R
    • most likely a database backend – these files are huge
  2. Determine what questions we can ask
    • and how to ask them – traditional stats? machine learning? a mix?
  3. Implement the analysis
    • do we need to sample and average model results across samples? can we do any all-at-once analysis on such a large dataset?

By the time we’re finished, we hope to have a better understanding of what research and analysis (that is, data science) looks like in the world of (not-)for-profit corporations.

Using MongoDB with R for large data

Note: I’m using MongoDB because it’s very straightforward to load the Yelp data into it, and I don’t want to spend a lot of time on data wrangling. However, MongoDB has a lot of drawbacks (this is a great post), and if I were developing something long term, I’d probably go with a SQL backend and dplyr.

First, make sure you install MongoDB according to your OS instructions – you can find them all in the sidebar of the MongoDB Community Edition page. Skip down a bit to see what issues we encountered during this whole process.

You’ll also want to install the mongolite package using

install.packages('mongolite')

Once you’ve done that, download the Yelp Challenge dataset we’ll be using (contact John for a link or go to the Yelp Challenge homepage). This is a great time to read up on some of the challenge goals on that website.

Make sure the database is running by launching the mongo shell from your terminal (type ‘exit’ to leave it once it connects; if it can’t connect, start the server by running mongod). By default, MongoDB requires a writable data directory at /data/db/.

It is supposedly incredibly easy to import the JSON-formatted Yelp data into MongoDB with the following commands. Run them in your terminal window:

mongoimport --db yelp --collection businesses yelp_phoenix_academic_dataset/yelp_academic_dataset_business.json
mongoimport --db yelp --collection users yelp_phoenix_academic_dataset/yelp_academic_dataset_user.json
mongoimport --db yelp --collection reviews yelp_phoenix_academic_dataset/yelp_academic_dataset_review.json
mongoimport --db yelp --collection checkins yelp_phoenix_academic_dataset/yelp_academic_dataset_checkin.json

Here’s a breakdown of the first command: mongoimport is the program that imports the data; --db yelp specifies that the target database is called “yelp”; --collection businesses specifies that the collection within the yelp database that should hold the data is called “businesses”; and finally, yelp_phoenix_academic_dataset/yelp_academic_dataset_business.json points to the file to be imported, “yelp_academic_dataset_business.json”, which resides in the folder “yelp_phoenix_academic_dataset”. The other three commands follow the same pattern.
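
Since the four commands differ only in the collection name and the file name, they can also be assembled from R. This is a minimal sketch, not something we ran as part of the notebook: `build_import_command` is a hypothetical helper, and the folder and file names are simply copied from the commands above.

```r
# Map each collection name to its JSON file (names taken from the commands above).
data_dir <- "yelp_phoenix_academic_dataset"
files <- c(
  businesses = "yelp_academic_dataset_business.json",
  users      = "yelp_academic_dataset_user.json",
  reviews    = "yelp_academic_dataset_review.json",
  checkins   = "yelp_academic_dataset_checkin.json"
)

# Hypothetical helper: assemble one mongoimport command as a string.
build_import_command <- function(collection, file, db = "yelp", dir = data_dir) {
  sprintf("mongoimport --db %s --collection %s %s",
          db, collection, file.path(dir, file))
}

commands <- mapply(build_import_command, names(files), files, USE.NAMES = FALSE)
cat(commands, sep = "\n")

# To actually run them from R (requires mongoimport on your PATH):
# for (cmd in commands) system(cmd)
```

Printing the commands first and running them only once they look right is a cheap way to catch a wrong path before anything is imported.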

In R, you can now access the data you’ve just imported:

library(mongolite)
# the collection name must match the one given to mongoimport above
m <- mongo(collection='businesses', db='yelp')

m$count()
## [1] 61184

Some notes that might help you troubleshoot as you install and access MongoDB:

  • make sure you have /data/db set with proper permissions
  • set a path to the binaries so that you can run mongoimport from the directory you’ve downloaded the Yelp data into (this is primarily a Windows issue)
  • know how to run the mongoimport commands from the terminal (if you’re not familiar with your OS’s terminal program, that’s an additional thing to learn before installing the DB)
  • ensure that mongod is running (check by launching mongo from the command line in your terminal program)

If you want to abandon MongoDB and just read in JSON directly

## importing Yelp data using jsonlite
setwd("~/Desktop") # set working directory to wherever the dataset folder lives

library(jsonlite) # load the jsonlite library

# stream_in() reads the file one JSON record per line;
# flatten = TRUE turns nested fields into ordinary columns
yelp_reviews <- stream_in(file("yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_review.json"), flatten = TRUE) # import review data

biz <- stream_in(file("yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_business.json"), flatten = TRUE) # import business data

id <- "e_U_FnpdKVgNb4mUN2cU_Q" # one value of biz$business_id (the unique identifier of a business in the Yelp data)

biz[biz$business_id == id, c("name", "city")] # select the row matching that business id and return the name of the business and its city
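
If the square-bracket lookup on the last line looks cryptic, here is the same pattern on a toy data frame small enough to check by eye. `biz_toy` and everything in it is invented; only the indexing pattern is the same as above.

```r
# Toy stand-in for the business data; the ids and names are invented.
biz_toy <- data.frame(
  business_id = c("a1", "b2", "c3"),
  name        = c("Taco Stand", "Book Nook", "Coffee Cart"),
  city        = c("Phoenix", "Tempe", "Phoenix"),
  stars       = c(4.5, 3.0, 4.0),
  stringsAsFactors = FALSE
)

id <- "b2"

# The part before the comma is a logical vector choosing rows;
# the part after the comma names the columns to keep.
keep_row <- biz_toy$business_id == id
keep_row
result <- biz_toy[keep_row, c("name", "city")]
result

# The same lookup with subset(), which some find easier to read.
subset(biz_toy, business_id == id, select = c(name, city))
```
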

## Back in MongoDB: draw a random sample of about 4.5% of the reviews
review.collection <- mongo(collection='reviews', db='yelp')
sampleSize <- round(review.collection$count()*.045)
review.collection$index()
##   v _id name           ns
## 1 1   1 _id_ yelp.reviews
# Reference example of the aggregation syntax (from a different dataset):
#    m$aggregate('[{"$group":{"_id":"$carrier", "count": {"$sum":1}, "average":{"$avg":"$distance"}}}]')
# Keep only the fields we need, then let MongoDB pick sampleSize random documents
aSample <- review.collection$aggregate(
paste0('[ 
  { "$project" : { "_id" : 1, "text" : 1 , "business_id" : 1, "stars" : 1, "date" : 1} },
  { "$sample" : { "size" : ', sampleSize, ' } } 
]'))
## Found 70617 records...
## Imported 70617 records. Simplifying into dataframe...
save(aSample, file='currentReviewSample.RData')
load(file='currentReviewSample.RData')

format(object.size(aSample), units='MB')
## [1] "56.7 Mb"
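
The sampling step above pastes the sample size into a JSON aggregation pipeline by hand. Here is a minimal sketch of the same idea wrapped in a function, so that the fraction and the fields are easy to change. `build_sample_pipeline` is a hypothetical helper, not part of mongolite, and the collection size used below is a made-up number so the example runs without a database.

```r
# Hypothetical helper: build a "$project then $sample" pipeline as a JSON string.
build_sample_pipeline <- function(n_total, fraction, fields) {
  sample_size <- round(n_total * fraction)
  # Turn c("text", "stars") into '"text" : 1, "stars" : 1'
  projection <- paste0('"', fields, '" : 1', collapse = ", ")
  pipeline <- sprintf(
    '[ { "$project" : { %s } }, { "$sample" : { "size" : %d } } ]',
    projection, as.integer(sample_size)
  )
  list(size = sample_size, pipeline = pipeline)
}

spec <- build_sample_pipeline(
  n_total  = 1000,   # made-up size; use review.collection$count() in practice
  fraction = 0.045,
  fields   = c("_id", "text", "business_id", "stars", "date")
)
spec$size
cat(spec$pipeline, "\n")

# With a live connection this string would be passed on as:
# aSample <- review.collection$aggregate(spec$pipeline)
```

Keeping the string-building separate from the database call makes it possible to print and inspect the pipeline before sending it to MongoDB.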
library(stm)

#     textProcessor(documents, metadata=NULL, 
#                   lowercase=TRUE, removestopwords=TRUE, removenumbers=TRUE, 
#                   removepunctuation=TRUE, stem=TRUE, wordLengths=c(3,Inf), 
#                   sparselevel=1, language="en", 
#                   verbose=TRUE, onlycharacter= FALSE, striphtml=FALSE,
#                   customstopwords=NULL, onlytxtfiles=TRUE) 
processedThing <- textProcessor(aSample$text, sparselevel=1, verbose=TRUE)
## Building corpus... 
## Converting to Lower Case... 
## Removing stopwords... 
## Removing numbers... 
## Removing punctuation... 
## Stemming... 
## Creating Output...
prepped <- prepDocuments(processedThing$documents, processedThing$vocab, processedThing$meta)
## Removing 42073 of 70421 terms (42073 of 3509486 tokens) due to frequency 
## Removing 9 Documents with No Words 
## Your corpus now has 70595 documents, 28348 terms and 3467413 tokens.
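
The first line of that output reports terms dropped for being too rare; note that the number of tokens removed equals the number of terms removed, so each dropped term occurred exactly once. Below is a toy base-R sketch of that kind of document-frequency filter. It assumes the rule is "drop any term that appears in only one document", which is my understanding of the prepDocuments default (lower.thresh = 1). The three tiny documents are invented, and this is not stm's actual implementation.

```r
# Three invented, already-stemmed "reviews".
docs <- list(
  c("pizza", "good", "wait"),
  c("pizza", "hotel", "good"),
  c("good", "vega", "hotel")
)

# Document frequency: in how many documents does each term appear?
doc_freq <- table(unlist(lapply(docs, unique)))
doc_freq

lower_thresh <- 1
keep_terms <- names(doc_freq)[as.vector(doc_freq) > lower_thresh]
dropped    <- setdiff(names(doc_freq), keep_terms)

# Remove the rare terms from every document.
pruned <- lapply(docs, function(d) d[d %in% keep_terms])

keep_terms   # terms that survive
dropped      # terms removed for appearing in a single document
pruned
```

Raising the threshold shrinks the vocabulary further and speeds up model fitting, at the cost of discarding more of the text.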
# fit on the pruned corpus returned by prepDocuments() rather than the raw textProcessor() output
aModel <- stm(prepped$documents, prepped$vocab, K=10)
## Beginning Initialization.
## ....................................................................................................
## Completed E-Step (47 seconds). 
## Completed M-Step. 
## Completing Iteration 1 (approx. per word bound = -7.438) 
## ....................................................................................................
## Completed E-Step (40 seconds). 
## Completed M-Step. 
## Completing Iteration 2 (approx. per word bound = -7.434, relative change = 5.904e-04) 
## ....................................................................................................
## Completed E-Step (39 seconds). 
## Completed M-Step. 
## Completing Iteration 3 (approx. per word bound = -7.429, relative change = 6.423e-04) 
## ....................................................................................................
## Completed E-Step (39 seconds). 
## Completed M-Step. 
## Completing Iteration 4 (approx. per word bound = -7.424, relative change = 6.067e-04) 
## ....................................................................................................
## Completed E-Step (38 seconds). 
## Completed M-Step. 
## Completing Iteration 5 (approx. per word bound = -7.420, relative change = 5.283e-04) 
## Topic 1: place, like, just, one, good 
##  Topic 2: great, place, drink, friend, bar 
##  Topic 3: room, stay, show, vega, hotel 
##  Topic 4: get, look, store, need, shop 
##  Topic 5: pizza, salad, delici, perfect, restaur 
##  Topic 6: wait, ask, back, minut, said 
##  Topic 7: good, also, best, price, like 
##  Topic 8: year, staff, time, will, work 
##  Topic 9: order, food, good, chicken, fri 
##  Topic 10: food, place, time, tri, good 
## ....................................................................................................
## Completed E-Step (38 seconds). 
## Completed M-Step. 
## Completing Iteration 6 (approx. per word bound = -7.417, relative change = 4.873e-04) 
## ....................................................................................................
## Completed E-Step (38 seconds). 
## Completed M-Step. 
## Completing Iteration 7 (approx. per word bound = -7.413, relative change = 4.690e-04) 
## ....................................................................................................
## Completed E-Step (38 seconds). 
## Completed M-Step. 
## Completing Iteration 8 (approx. per word bound = -7.410, relative change = 4.575e-04) 
## ....................................................................................................
## Completed E-Step (37 seconds). 
## Completed M-Step. 
## Completing Iteration 9 (approx. per word bound = -7.407, relative change = 4.434e-04) 
## ....................................................................................................
## Completed E-Step (37 seconds). 
## Completed M-Step. 
## Completing Iteration 10 (approx. per word bound = -7.404, relative change = 4.305e-04) 
## Topic 1: place, like, just, one, good 
##  Topic 2: great, place, drink, friend, good 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: get, look, store, need, shop 
##  Topic 5: pizza, salad, delici, steak, dessert 
##  Topic 6: wait, ask, back, said, minut 
##  Topic 7: also, good, best, price, vega 
##  Topic 8: year, staff, time, day, will 
##  Topic 9: order, good, food, chicken, fri 
##  Topic 10: food, place, time, tri, good 
## ....................................................................................................
## Completed E-Step (37 seconds). 
## Completed M-Step. 
## Completing Iteration 11 (approx. per word bound = -7.400, relative change = 4.183e-04) 
## ....................................................................................................
## Completed E-Step (37 seconds). 
## Completed M-Step. 
## Completing Iteration 12 (approx. per word bound = -7.397, relative change = 4.044e-04) 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 13 (approx. per word bound = -7.395, relative change = 3.890e-04) 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 14 (approx. per word bound = -7.392, relative change = 3.671e-04) 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 15 (approx. per word bound = -7.389, relative change = 3.387e-04) 
## Topic 1: like, place, just, one, littl 
##  Topic 2: great, place, friend, drink, love 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: look, get, store, shop, need 
##  Topic 5: pizza, salad, steak, delici, dessert 
##  Topic 6: wait, ask, back, said, minut 
##  Topic 7: also, best, vega, price, amaz 
##  Topic 8: year, staff, time, work, day 
##  Topic 9: order, good, chicken, fri, food 
##  Topic 10: food, place, time, tri, good 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 16 (approx. per word bound = -7.387, relative change = 3.084e-04) 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 17 (approx. per word bound = -7.385, relative change = 2.790e-04) 
## ....................................................................................................
## Completed E-Step (38 seconds). 
## Completed M-Step. 
## Completing Iteration 18 (approx. per word bound = -7.383, relative change = 2.525e-04) 
## ....................................................................................................
## Completed E-Step (37 seconds). 
## Completed M-Step. 
## Completing Iteration 19 (approx. per word bound = -7.381, relative change = 2.288e-04) 
## ....................................................................................................
## Completed E-Step (37 seconds). 
## Completed M-Step. 
## Completing Iteration 20 (approx. per word bound = -7.380, relative change = 2.083e-04) 
## Topic 1: like, place, just, one, littl 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: look, get, store, shop, need 
##  Topic 5: pizza, salad, steak, dessert, delici 
##  Topic 6: wait, ask, back, got, said 
##  Topic 7: also, best, vega, amaz, price 
##  Topic 8: year, time, staff, work, day 
##  Topic 9: order, chicken, good, fri, burger 
##  Topic 10: food, place, time, good, tri 
## ....................................................................................................
## Completed E-Step (38 seconds). 
## Completed M-Step. 
## Completing Iteration 21 (approx. per word bound = -7.379, relative change = 1.910e-04) 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 22 (approx. per word bound = -7.377, relative change = 1.756e-04) 
## ....................................................................................................
## Completed E-Step (33 seconds). 
## Completed M-Step. 
## Completing Iteration 23 (approx. per word bound = -7.376, relative change = 1.620e-04) 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 24 (approx. per word bound = -7.375, relative change = 1.500e-04) 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 25 (approx. per word bound = -7.374, relative change = 1.395e-04) 
## Topic 1: like, just, place, one, littl 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: store, look, get, shop, need 
##  Topic 5: pizza, dessert, salad, steak, delici 
##  Topic 6: wait, ask, back, got, servic 
##  Topic 7: best, also, vega, amaz, top 
##  Topic 8: year, time, work, staff, will 
##  Topic 9: order, chicken, fri, good, burger 
##  Topic 10: food, place, good, time, tri 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 26 (approx. per word bound = -7.373, relative change = 1.300e-04) 
## ....................................................................................................
## Completed E-Step (381 seconds). 
## Completed M-Step. 
## Completing Iteration 27 (approx. per word bound = -7.372, relative change = 1.217e-04) 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 28 (approx. per word bound = -7.371, relative change = 1.143e-04) 
## ....................................................................................................
## Completed E-Step (33 seconds). 
## Completed M-Step. 
## Completing Iteration 29 (approx. per word bound = -7.370, relative change = 1.076e-04) 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 30 (approx. per word bound = -7.370, relative change = 1.014e-04) 
## Topic 1: like, just, place, one, littl 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: store, look, shop, get, need 
##  Topic 5: pizza, dessert, salad, steak, delici 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, also, vega, amaz, top 
##  Topic 8: year, time, work, staff, will 
##  Topic 9: order, chicken, fri, burger, good 
##  Topic 10: food, place, good, time, tri 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 31 (approx. per word bound = -7.369, relative change = 9.572e-05) 
## ....................................................................................................
## Completed E-Step (33 seconds). 
## Completed M-Step. 
## Completing Iteration 32 (approx. per word bound = -7.368, relative change = 9.067e-05) 
## ....................................................................................................
## Completed E-Step (33 seconds). 
## Completed M-Step. 
## Completing Iteration 33 (approx. per word bound = -7.368, relative change = 8.616e-05) 
## ....................................................................................................
## Completed E-Step (33 seconds). 
## Completed M-Step. 
## Completing Iteration 34 (approx. per word bound = -7.367, relative change = 8.203e-05) 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 35 (approx. per word bound = -7.366, relative change = 7.812e-05) 
## Topic 1: like, just, place, one, littl 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, salad, steak, delici 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, also, amaz, vega, top 
##  Topic 8: year, time, work, staff, will 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, time, tri 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 36 (approx. per word bound = -7.366, relative change = 7.470e-05) 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 37 (approx. per word bound = -7.365, relative change = 7.178e-05) 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 38 (approx. per word bound = -7.365, relative change = 6.929e-05) 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 39 (approx. per word bound = -7.364, relative change = 6.701e-05) 
## ....................................................................................................
## Completed E-Step (33 seconds). 
## Completed M-Step. 
## Completing Iteration 40 (approx. per word bound = -7.364, relative change = 6.486e-05) 
## Topic 1: like, just, place, one, littl 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, salad, steak, delici 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, top 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (36 seconds). 
## Completed M-Step. 
## Completing Iteration 41 (approx. per word bound = -7.363, relative change = 6.265e-05) 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 42 (approx. per word bound = -7.363, relative change = 6.037e-05) 
## ....................................................................................................
## Completed E-Step (34 seconds). 
## Completed M-Step. 
## Completing Iteration 43 (approx. per word bound = -7.363, relative change = 5.805e-05) 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 44 (approx. per word bound = -7.362, relative change = 5.570e-05) 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 45 (approx. per word bound = -7.362, relative change = 5.328e-05) 
## Topic 1: like, just, place, one, littl 
##  Topic 2: great, place, love, friend, drink 
##  Topic 3: room, stay, show, hotel, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, salad, steak, delici 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, top 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 46 (approx. per word bound = -7.361, relative change = 5.076e-05) 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 47 (approx. per word bound = -7.361, relative change = 4.818e-05) 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 48 (approx. per word bound = -7.361, relative change = 4.556e-05) 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 49 (approx. per word bound = -7.360, relative change = 4.310e-05) 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 50 (approx. per word bound = -7.360, relative change = 4.084e-05) 
## Topic 1: like, just, place, one, littl 
##  Topic 2: great, place, love, friend, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, salad, steak, cream 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, top 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 51 (approx. per word bound = -7.360, relative change = 3.879e-05) 
## ....................................................................................................
## Completed E-Step (31 seconds). 
## Completed M-Step. 
## Completing Iteration 52 (approx. per word bound = -7.360, relative change = 3.709e-05) 
## ....................................................................................................
## Completed E-Step (32 seconds). 
## Completed M-Step. 
## Completing Iteration 53 (approx. per word bound = -7.359, relative change = 3.568e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 54 (approx. per word bound = -7.359, relative change = 3.460e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 55 (approx. per word bound = -7.359, relative change = 3.361e-05) 
## Topic 1: like, just, place, one, can 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, salad, steak, cream 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, top 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 56 (approx. per word bound = -7.359, relative change = 3.268e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 57 (approx. per word bound = -7.358, relative change = 3.186e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 58 (approx. per word bound = -7.358, relative change = 3.126e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 59 (approx. per word bound = -7.358, relative change = 3.076e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 60 (approx. per word bound = -7.358, relative change = 3.036e-05) 
## Topic 1: like, just, place, one, can 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, salad, steak, cream 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, worth 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 61 (approx. per word bound = -7.357, relative change = 2.986e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 62 (approx. per word bound = -7.357, relative change = 2.944e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 63 (approx. per word bound = -7.357, relative change = 2.894e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 64 (approx. per word bound = -7.357, relative change = 2.837e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 65 (approx. per word bound = -7.357, relative change = 2.752e-05) 
## Topic 1: like, just, place, one, can 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, get, look, car 
##  Topic 5: pizza, dessert, steak, salad, cream 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, worth 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 66 (approx. per word bound = -7.356, relative change = 2.664e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 67 (approx. per word bound = -7.356, relative change = 2.580e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 68 (approx. per word bound = -7.356, relative change = 2.500e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 69 (approx. per word bound = -7.356, relative change = 2.428e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 70 (approx. per word bound = -7.356, relative change = 2.368e-05) 
## Topic 1: like, just, place, one, can 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, car, get, look 
##  Topic 5: pizza, dessert, steak, cream, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, worth 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 71 (approx. per word bound = -7.355, relative change = 2.323e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 72 (approx. per word bound = -7.355, relative change = 2.291e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 73 (approx. per word bound = -7.355, relative change = 2.261e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 74 (approx. per word bound = -7.355, relative change = 2.228e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 75 (approx. per word bound = -7.355, relative change = 2.181e-05) 
## Topic 1: like, just, place, can, one 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, car, price, look 
##  Topic 5: pizza, dessert, steak, cream, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, worth 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 76 (approx. per word bound = -7.355, relative change = 2.118e-05) 
## ....................................................................................................
## Completed E-Step (28 seconds). 
## Completed M-Step. 
## Completing Iteration 77 (approx. per word bound = -7.354, relative change = 2.042e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 78 (approx. per word bound = -7.354, relative change = 1.969e-05) 
## ....................................................................................................
## Completed E-Step (28 seconds). 
## Completed M-Step. 
## Completing Iteration 79 (approx. per word bound = -7.354, relative change = 1.895e-05) 
## ....................................................................................................
## Completed E-Step (28 seconds). 
## Completed M-Step. 
## Completing Iteration 80 (approx. per word bound = -7.354, relative change = 1.822e-05) 
## Topic 1: like, just, place, can, one 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, car, price, look 
##  Topic 5: pizza, dessert, cream, steak, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, well 
##  Topic 8: year, time, work, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (28 seconds). 
## Completed M-Step. 
## Completing Iteration 81 (approx. per word bound = -7.354, relative change = 1.761e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 82 (approx. per word bound = -7.354, relative change = 1.702e-05) 
## ....................................................................................................
## Completed E-Step (28 seconds). 
## Completed M-Step. 
## Completing Iteration 83 (approx. per word bound = -7.354, relative change = 1.656e-05) 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 84 (approx. per word bound = -7.354, relative change = 1.624e-05) 
## ....................................................................................................
## Completed E-Step (30 seconds). 
## Completed M-Step. 
## Completing Iteration 85 (approx. per word bound = -7.353, relative change = 1.598e-05) 
## Topic 1: like, just, place, can, one 
##  Topic 2: great, place, friend, love, drink 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, car, price, look 
##  Topic 5: pizza, dessert, cream, steak, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, well 
##  Topic 8: work, time, year, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, tri, time 
## ....................................................................................................
## Completed E-Step (29 seconds). 
## Completed M-Step. 
## Completing Iteration 86 (approx. per word bound = -7.353, relative change = 1.572e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 87 (approx. per word bound = -7.353, relative change = 1.537e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 88 (approx. per word bound = -7.353, relative change = 1.503e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 89 (approx. per word bound = -7.353, relative change = 1.454e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 90 (approx. per word bound = -7.353, relative change = 1.388e-05) 
## Topic 1: like, just, place, can, get 
##  Topic 2: great, place, friend, drink, love 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, price, car, look 
##  Topic 5: pizza, dessert, cream, steak, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, well 
##  Topic 8: work, time, year, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, time, tri 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 91 (approx. per word bound = -7.353, relative change = 1.339e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 92 (approx. per word bound = -7.353, relative change = 1.304e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 93 (approx. per word bound = -7.353, relative change = 1.276e-05) 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Completing Iteration 94 (approx. per word bound = -7.353, relative change = 1.249e-05) 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Completing Iteration 95 (approx. per word bound = -7.352, relative change = 1.217e-05) 
## Topic 1: like, just, place, can, get 
##  Topic 2: great, place, friend, drink, bar 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, price, car, look 
##  Topic 5: pizza, dessert, cream, steak, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, well 
##  Topic 8: work, time, year, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, servic, time 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Completing Iteration 96 (approx. per word bound = -7.352, relative change = 1.176e-05) 
## ....................................................................................................
## Completed E-Step (27 seconds). 
## Completed M-Step. 
## Completing Iteration 97 (approx. per word bound = -7.352, relative change = 1.130e-05) 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Completing Iteration 98 (approx. per word bound = -7.352, relative change = 1.091e-05) 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Completing Iteration 99 (approx. per word bound = -7.352, relative change = 1.054e-05) 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Completing Iteration 100 (approx. per word bound = -7.352, relative change = 1.007e-05) 
## Topic 1: like, just, place, get, can 
##  Topic 2: great, place, friend, drink, bar 
##  Topic 3: room, stay, hotel, show, vega 
##  Topic 4: store, shop, price, car, look 
##  Topic 5: pizza, dessert, cream, steak, salad 
##  Topic 6: wait, ask, back, got, order 
##  Topic 7: best, amaz, also, vega, well 
##  Topic 8: work, time, year, will, staff 
##  Topic 9: order, chicken, fri, burger, sandwich 
##  Topic 10: food, good, place, servic, time 
## ....................................................................................................
## Completed E-Step (26 seconds). 
## Completed M-Step. 
## Model Converged
save(aModel, file='review_sample_topic_model.RData')
load(file='review_sample_topic_model.RData')


labelTopics(aModel)
## Topic 1 Top Words:
##       Highest Prob: like, just, place, get, can, one, realli 
##       FREX: donut, coffe, sometim, east, might, perhap, chain 
##       Lift: bosa, abolut, aboslut, accessbas, accommid, aeroport, afaik 
##       Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
## Topic 2 Top Words:
##       Highest Prob: great, place, friend, drink, bar, love, night 
##       FREX: beer, patio, tap, brew, draft, und, irish 
##       Lift: aaaall, aan, aangenaam, abbay, abdh, abendessen, abendstunden 
##       Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
## Topic 3 Top Words:
##       Highest Prob: room, stay, hotel, show, vega, get, see 
##       FREX: room, hotel, pool, club, casino, bathroom, danc 
##       Lift: luxor, aaaaaaaad, aahh, aall, aawwweeessoommeee, abba, abccom 
##       Score: abond, accéder, acompañada, affamé, affiché, âgés, agrad 
## Topic 4 Top Words:
##       Highest Prob: store, shop, price, car, look, need, get 
##       FREX: store, buy, hair, massag, cloth, brand, groceri 
##       Lift: abarth, acura, alamo, alpaca, ambush, ammunit, anklet 
##       Score: agaç, amoureux, angenehmeren, angeordnet, ärgerlich, aubain, aufgeteilt 
## Topic 5 Top Words:
##       Highest Prob: pizza, dessert, cream, steak, salad, chocol, ice 
##       FREX: pasta, crust, pie, cupcak, oliv, filet, scallop 
##       Lift: balsam, clam, était, foi, marinara, mussel, pie 
##       Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
## Topic 6 Top Words:
##       Highest Prob: wait, ask, back, got, order, time, get 
##       FREX: rude, apolog, horribl, worst, manag, said, upset 
##       Lift: jetblu, paramed, rude, upset, aaaaaalright, aahhh, abajo 
##       Score: abimé, abond, accéder, accompagn, accompagné, accueilli, accueillir 
## Topic 7 Top Words:
##       Highest Prob: best, amaz, also, vega, love, well, favorit 
##       FREX: buffet, dim, varieti, yogurt, pricey, smoothi, froyo 
##       Lift: blini, blynk, bsbc, cfu, dlite, ducass, fatfre 
##       Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
## Topic 8 Top Words:
##       Highest Prob: work, time, year, will, staff, day, help 
##       FREX: offic, doctor, pet, vet, yoga, dentist, instructor 
##       Lift: abdomin, abdul, accutemp, achill, acp, administ, adopt 
##       Score: abond, accéder, accompagn, accompagné, accueillir, achet, achèt 
## Topic 9 Top Words:
##       Highest Prob: order, chicken, fri, burger, sandwich, sauc, flavor 
##       FREX: chicken, fri, burger, taco, egg, pork, bbq 
##       Lift: abodaba, adobo, alfalfa, atkin, authentico, bab, bap 
##       Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
## Topic 10 Top Words:
##       Highest Prob: food, good, place, servic, time, tri, restaur 
##       FREX: sushi, ayc, boba, roll, thai, food, eel 
##       Lift: amaebi, cartman, chanpen, chinees, cyclo, eew, filipina 
##       Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli
plot.STM(aModel, type='perspectives', topics=c(1, 10))

plot.STM(aModel, type='labels')

sageLabels(aModel)
## Topic 1: 
##       Marginal Highest Prob: like, just, place, get, can, one, realli 
##       Marginal FREX: donut, coffe, sometim, east, might, perhap, chain 
##       Marginal Lift: bosa, abolut, aboslut, accessbas, accommid, aeroport, afaik 
##       Marginal Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 2: 
##       Marginal Highest Prob: great, place, friend, drink, bar, love, night 
##       Marginal FREX: beer, patio, tap, brew, draft, und, irish 
##       Marginal Lift: aaaall, aan, aangenaam, abbay, abdh, abendessen, abendstunden 
##       Marginal Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 3: 
##       Marginal Highest Prob: room, stay, hotel, show, vega, get, see 
##       Marginal FREX: room, hotel, pool, club, casino, bathroom, danc 
##       Marginal Lift: luxor, aaaaaaaad, aahh, aall, aawwweeessoommeee, abba, abccom 
##       Marginal Score: abond, accéder, acompañada, affamé, affiché, âgés, agrad 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 4: 
##       Marginal Highest Prob: store, shop, price, car, look, need, get 
##       Marginal FREX: store, buy, hair, massag, cloth, brand, groceri 
##       Marginal Lift: abarth, acura, alamo, alpaca, ambush, ammunit, anklet 
##       Marginal Score: agaç, amoureux, angenehmeren, angeordnet, ärgerlich, aubain, aufgeteilt 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 5: 
##       Marginal Highest Prob: pizza, dessert, cream, steak, salad, chocol, ice 
##       Marginal FREX: pasta, crust, pie, cupcak, oliv, filet, scallop 
##       Marginal Lift: balsam, clam, était, foi, marinara, mussel, pie 
##       Marginal Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 6: 
##       Marginal Highest Prob: wait, ask, back, got, order, time, get 
##       Marginal FREX: rude, apolog, horribl, worst, manag, said, upset 
##       Marginal Lift: jetblu, paramed, rude, upset, aaaaaalright, aahhh, abajo 
##       Marginal Score: abimé, abond, accéder, accompagn, accompagné, accueilli, accueillir 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 7: 
##       Marginal Highest Prob: best, amaz, also, vega, love, well, favorit 
##       Marginal FREX: buffet, dim, varieti, yogurt, pricey, smoothi, froyo 
##       Marginal Lift: blini, blynk, bsbc, cfu, dlite, ducass, fatfre 
##       Marginal Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 8: 
##       Marginal Highest Prob: work, time, year, will, staff, day, help 
##       Marginal FREX: offic, doctor, pet, vet, yoga, dentist, instructor 
##       Marginal Lift: abdomin, abdul, accutemp, achill, acp, administ, adopt 
##       Marginal Score: abond, accéder, accompagn, accompagné, accueillir, achet, achèt 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 9: 
##       Marginal Highest Prob: order, chicken, fri, burger, sandwich, sauc, flavor 
##       Marginal FREX: chicken, fri, burger, taco, egg, pork, bbq 
##       Marginal Lift: abodaba, adobo, alfalfa, atkin, authentico, bab, bap 
##       Marginal Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
##  
## Topic 10: 
##       Marginal Highest Prob: food, good, place, servic, time, tri, restaur 
##       Marginal FREX: sushi, ayc, boba, roll, thai, food, eel 
##       Marginal Lift: amaebi, cartman, chanpen, chinees, cyclo, eew, filipina 
##       Marginal Score: abimé, abond, abondant, accéder, accompagn, accompagné, accueilli 
##  
##       Topic Kappa:  
##       Kappa with Baseline:  
## 

- [ ] List of every category used, so we can pull business ids.
- [ ] How to get records that match on a list of ids?
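Both to-do items can be sketched on a toy example. This is only a sketch under assumptions: the `business` and `reviews` data frames and their columns are hypothetical stand-ins for the Yelp tables, and each business is given a single category (the real data lists several per business, so it would need unnesting first).

```r
library(dplyr)

# Hypothetical stand-ins for the Yelp business and review tables
business <- data.frame(
  business_id = c("b1", "b2", "b3", "b4"),
  categories  = c("Pizza", "Hotels", "Pizza", "Bars"),
  stringsAsFactors = FALSE
)
reviews <- data.frame(
  review_id   = c("r1", "r2", "r3", "r4", "r5"),
  business_id = c("b1", "b2", "b3", "b3", "b4"),
  stars       = c(5, 3, 4, 2, 1),
  stringsAsFactors = FALSE
)

# To-do 1: every category used, so we can pick one and pull its business ids
all_categories <- sort(unique(business$categories))
pizza_ids <- business$business_id[business$categories == "Pizza"]

# To-do 2: keep only the records whose id is in the list
pizza_reviews <- filter(reviews, business_id %in% pizza_ids)

# Same result as a filtering join, which avoids building the id vector by hand
pizza_reviews_join <- semi_join(
  reviews,
  filter(business, categories == "Pizza"),
  by = "business_id"
)
```

`%in%` is the simplest option when the ids are already in a vector. `semi_join()` may be more convenient when the ids live in another table.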

Other Database

This post has a really good example of how to use dplyr with RSQLite.
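As a rough idea of what the dplyr plus RSQLite combination looks like, here is a minimal sketch (not the example from the linked post). It assumes the DBI, RSQLite, dplyr, and dbplyr packages are installed. The `review` table, its columns, and `wanted_ids` are hypothetical.

```r
library(DBI)
library(dplyr)

# In-memory SQLite database, so nothing is written to disk
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical review table standing in for the Yelp data
dbWriteTable(con, "review", data.frame(
  review_id   = c("r1", "r2", "r3", "r4", "r5"),
  business_id = c("b1", "b2", "b3", "b3", "b4"),
  stars       = c(5, 3, 4, 2, 1),
  stringsAsFactors = FALSE
))

# tbl() creates a lazy reference; the dplyr verbs are translated to SQL
review_tbl <- tbl(con, "review")
wanted_ids <- c("b1", "b3")

matched <- review_tbl %>%
  filter(business_id %in% !!wanted_ids) %>%  # !! inlines the local vector
  arrange(review_id) %>%
  collect()                                  # run the query, return a data frame

dbDisconnect(con)
```

Nothing is pulled into R until `collect()` is called, which is the main appeal for a data set too large to load whole.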



Discovering MongoDB, together

We continue to gain traction on the Yelp Challenge data this week, exploring how to use MongoDB to easily import the massive JSON data set. This should ultimately culminate in a longer post, but for now check out this guide over at Joy of Data.

If you need any help in R, please drop by (with minor questions) or submit a request (for small, medium, and large questions).

Use your R skills for good, earn cash

The time commitment is doable — 7 hours per week — and the pay, at $25/hr, is great. There’s also the opportunity to author 2 publications and possibly even help humanity in some small way while you’re at it. Check out the job posting over at the College of Education. If you’re lucky enough to land the job, you’ll be working with Joe Nese, who told us he’s happy to field any of your questions about the project.

Job opportunity for R Clubbers + show and tell

Joe Nese, a Research Assistant Professor with Behavioral Research and Teaching at the College of Education, will visit to tell us about a really fantastic job opportunity for R Club-type grad students. If you like stats, and extra cash, definitely don’t miss it. Here’s the job posting.

Also, remember to bring examples of R workflows from the wild so we can compare all the messy yet fantastic ways we use R on a daily basis.

Wintertime R Fun!

R Club continues this quarter in the basement of Staub, room 006, Tuesday 3:00-4:20p. This quarter we’ll return to the workshop/consultation format. If you have a problem you want to workshop at R Club, or something cool you want to share, fill out this form.

Here’s our current schedule:

Week 2 – Bring your past projects in R for show and tell
Week 3 – Creating a database for Yelp Challenge data set
Ongoing – Let’s play with this enormous data set from Yelp!

Requests

Dynamic documents!

For our meeting today, I’ll be sharing tales of my adventures in creating dynamic documents  — building R code right into my papers and presentations, so all of the figures, tables, statistics, etc. are generated automatically by the code — and tips and tricks for folks interested in trying this out.

I’ve tried two different methods for weaving my code in with my text: R Markdown and Sweave (R plus LaTeX). Both involve writing the document in plain text (with either Markdown or LaTeX formatting), with chunks of code throughout. The main difference is that Markdown is MUCH easier to learn, and LaTeX is MUCH more powerful. I’ll focus on LaTeX during my R Club presentation since you all can probably figure out R Markdown pretty well on your own and don’t need me talking at you about it.
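To make the "chunks of code throughout" idea concrete, here is a minimal sketch of the LaTeX route, assuming the knitr package is installed. The file names and the toy vector `x` are made up for illustration. The script writes a tiny `.Rnw` file and knits it to `.tex`, a step that does not need a LaTeX installation.

```r
library(knitr)

# A tiny Sweave-style document: LaTeX text, one R chunk, one inline result
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<setup, echo=FALSE>>=",
  "x <- c(2, 4, 6)",
  "@",
  "The mean of x is \\Sexpr{mean(x)}.",
  "\\end{document}"
)

rnw_path <- file.path(tempdir(), "example.Rnw")
writeLines(rnw, rnw_path)

# knit() runs the chunks and replaces \Sexpr{} with the computed value
tex_path <- knit(rnw_path,
                 output = file.path(tempdir(), "example.tex"),
                 quiet = TRUE)
tex <- readLines(tex_path)
```

The resulting `.tex` file contains the computed number in place of `\Sexpr{mean(x)}`, so changing the data and re-knitting updates the sentence automatically. Compiling it to PDF is a separate step that does need LaTeX.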

To whet your appetite, here are some latex resources:
http://www.tug.org/pracjourn/2008-1/zahn/zahn.pdf
http://mirror.jmu.edu/pub/CTAN/macros/latex/contrib/apa6/apa6.pdf
http://merkel.zoneo.net/Latex/natbib.php
http://yihui.name/knitr/demo/sweave/
https://support.rstudio.com/hc/en-us/articles/200552056-Using-Sweave-and-knitr

Here are my latest dynamic documents:

https://github.com/rosemm/context_word_seg

https://www.dropbox.com/s/5jm99cy17xfuetx/workshop1_slides.Rnw?dl=1