Monday, May 25, 2015

Scott Janish's "Relationship of ABV to Beer Scores"

Scott Janish loves beer and statistics and blogging (a man after my own heart). His blog discusses home brewing as well as data related to beer. One of his statsy blog posts took a look at the relationship between the average alcohol by volume (ABV) for a style of beer (below, on the x-axis) and the average rating for that style (y-axis). He found, perhaps intuitively, that there is a positive correlation between the average review score for a type of beer and the average alcohol content for that type of beer. Scott was kind enough to provide us with his data set, turning this into a most teachable moment.
How to use in class:
1) Scott provides his data. The r is .418, which isn't mightily impressive. However, I think you could teach your students about influential observations/outliers in regression/correlation by asking them to return to the original data, eliminate the 9 data points that are inconsistent with the larger pattern, and reanalyze the data to see the effect on r/p. Heck, just remove one or two inconsistent data points and let your students see what that does to the results.
2) Linear relationships. Correlations. Regressions. Design an experiment to test the assumption that beer snobs just really like getting drunk (and, hence, this relationship).
3) For more beer figures, see here.
4) Take a look at that sample size (shown at the top of the figure). How does this make the data more reliable?
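
If you want to demo idea #1 before handing students the real file, here is a minimal Python sketch of the "remove the oddballs and recompute r" exercise. Everything in it is an assumption for illustration: the (ABV, rating) pairs are made up and are NOT Scott's data, the two "flagged" points are hypothetical, and the helper names pearson_r and r_with_and_without are my own. It only computes r; getting the p-value would need a stats package.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up (average ABV %, average rating) pairs, one per beer style.
# These are illustrative only; swap in Scott's real data for class.
consistent = [
    (4.0, 3.3), (4.5, 3.4), (5.0, 3.5), (5.5, 3.6), (6.0, 3.7),
    (6.5, 3.8), (7.0, 3.9), (8.0, 4.0), (9.0, 4.1),
]

# Hypothetical styles that buck the trend: a weak beer with a high
# rating and a strong beer with a low rating.
flagged = [(4.2, 4.2), (10.0, 3.0)]


def r_with_and_without(points, outliers):
    """Return (r using every point, r with the flagged points removed)."""
    everything = points + outliers
    r_all = pearson_r(*zip(*everything))
    r_trimmed = pearson_r(*zip(*points))
    return r_all, r_trimmed


r_all, r_trimmed = r_with_and_without(consistent, flagged)
print(f"r with all styles:        {r_all:.3f}")
print(f"r without flagged styles: {r_trimmed:.3f}")
```

With the real data, students would decide for themselves which points to flag, which is a nice opening for a conversation about when dropping observations is defensible and when it is just fishing for a prettier r.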

Why, no, I'm not above pandering to undergraduates.