Scott Janish loves beer, statistics, and blogging (a man after my own heart). His blog discusses home brewing as well as data related to beer. One of his statsy blog posts looked at the relationship between the average alcohol by volume (ABV) for a beer style (below, on the x-axis) and that style's average rating (from beeradvocate.com, on the y-axis). He found, perhaps intuitively, a positive correlation between a beer style's average rating and its average alcohol content. Scott was kind enough to provide us with his data set, turning this into a most teachable moment.
How to use it in class:
1) Scott provides his data. The r is .418, which isn't mighty impressive. However, you could teach your students about influential observations/outliers in regression/correlation by asking them to return to the original data, eliminate the 9 data points that are inconsistent with the larger pattern, and reanalyze the data to see the effect on r and p. Or start smaller: remove just one or two of those points and let your students see what that does to the correlation.
2) Linear relationships. Correlations. Regressions. Have students design an experiment to test the hypothesis that beer snobs just like getting drunk (which would explain this relationship).
3) For more beer figures, see here.
4) Look at that sample size (see the top of the figure). How does a sample that large make the estimated correlation more reliable?
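If you want students to try ideas 1 and 4 in code before touching the real file, here is a minimal Python sketch. Everything in it is an assumption for illustration: the ten ABV/rating pairs are made up (they are not Scott's data), the helper names `pearson_r`, `r_without`, and `fisher_ci` are hypothetical, and the sample sizes passed to `fisher_ci` are arbitrary. Only the r of .418 comes from the post. The confidence interval uses the standard Fisher z approximation, which is one common way to show how sample size affects the precision of r.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_without(xs, ys, drop):
    """Recompute r after dropping the observations at the given indices."""
    dropped = set(drop)
    kept_x = [x for i, x in enumerate(xs) if i not in dropped]
    kept_y = [y for i, y in enumerate(ys) if i not in dropped]
    return pearson_r(kept_x, kept_y)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for r via the Fisher z transform."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Made-up illustrative data, NOT Scott's data set.
# The last point is a strong beer with a middling rating (an influential point).
abv = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 12.0]
rating = [3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 3.4]

r_all = pearson_r(abv, rating)
r_trimmed = r_without(abv, rating, drop=[9])
print(f"r with all points:        {r_all:.3f}")
print(f"r without the last point: {r_trimmed:.3f}")

# Same r as in the post (.418), two hypothetical sample sizes.
for n in (20, 100):
    low, high = fisher_ci(0.418, n)
    print(f"n = {n:3d}: 95% CI for r is roughly ({low:.2f}, {high:.2f})")
```

With the real CSV, students would swap the two lists for the ABV and rating columns and choose which rows to drop. It is worth having them justify each removal, since dropping points only because they weaken r is its own teachable moment.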
Why, no, I'm not above pandering to undergraduates.
http://scottjanish.com/relationship-of-abv-to-beer-scores/
4.23.24 update: Here is the actual data in .CSV format.