Monday, September 15, 2014

minimaxir's "Distribution of Yelp ratings for businesses, by business category"

Yelp distribution visualization, posted by redditor minimaxir

This data distribution example comes from the subreddit r/dataisbeautiful (more on what a reddit is here). This specific posting (started by minimaxir) features several histograms illustrating customer ratings for various business categories on Yelp (a customer review website), as well as a lively reddit discussion in which users attempt to explain why different categories of services have such different distribution shapes and means.

At a basic level, you can use this data to illustrate skew, histograms, and the normal distribution. As a more advanced critical thinking activity, you could challenge your students to think of reasons why the ratings for some categories, like auto repair, are skewed. From a psychometric or industrial/organizational psychology perspective, you could discuss how customers use rating scales and whether people really understand what "average" means when providing customer feedback.
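If you want students to put numbers on these shapes, a short script can help. The sketch below is only an illustration: the rating counts are made up for classroom use and are not taken from minimaxir's charts, and the function names `describe_ratings` and `text_histogram` are hypothetical. It computes the mean, standard deviation, and (population) skewness from a table of 1 to 5 star counts and draws a rough text histogram.

```python
from math import sqrt


def describe_ratings(counts):
    """Return (mean, sd, skewness) for a dict mapping star rating -> review count.

    Skewness here is the population moment coefficient:
    the third central moment divided by the cubed standard deviation.
    """
    n = sum(counts.values())
    mean = sum(r * c for r, c in counts.items()) / n
    variance = sum(c * (r - mean) ** 2 for r, c in counts.items()) / n
    sd = sqrt(variance)
    third_moment = sum(c * (r - mean) ** 3 for r, c in counts.items()) / n
    return mean, sd, third_moment / sd ** 3


def text_histogram(counts, width=40):
    """Build a simple text histogram, one line per star rating."""
    peak = max(counts.values())
    lines = []
    for rating in sorted(counts):
        bar = "#" * round(counts[rating] / peak * width)
        lines.append(f"{rating} stars | {bar} {counts[rating]}")
    return "\n".join(lines)


# Hypothetical counts, invented for illustration only.
symmetric = {1: 5, 2: 20, 3: 50, 4: 20, 5: 5}      # roughly bell-shaped
left_skewed = {1: 5, 2: 5, 3: 10, 4: 30, 5: 50}    # piled up at the high end

for name, counts in [("symmetric", symmetric), ("left_skewed", left_skewed)]:
    mean, sd, skew = describe_ratings(counts)
    print(name)
    print(text_histogram(counts))
    print(f"mean={mean:.2f}  sd={sd:.2f}  skewness={skew:.2f}")
    print()
```

A skewness near zero indicates a roughly symmetric distribution, while a negative value indicates a long tail toward the low ratings. Students could replace the made-up counts with values they read off the actual histograms and compare categories.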

