Reddit for Statistics Class

I love reddit. I really love the sub-reddit r/dataisbeautiful. Various redditors contribute interesting graphs and charts from all over the interwebz.

I leave it to you to figure out how to use these data visualizations in class. If nothing else, they are highly interesting examples of a wide variety of graphing techniques applicable to different sorts of data sets. In addition to the visualizations themselves, there are usually good discussions (yes, good discussion on the internet!) among redditors about what is driving the presented findings.

Another facet of these posts is the sources of the data. There are many examples using archival data, like this chart that used social media to estimate sports franchise popularity.

Users also share interesting data from more traditional sources, like APA data on the rates of Masters/Doctorates awarded over time and user rating data generated by IMDB (here, look at the gender/age bias in ratings of the movie Fifty Shades of Grey).


Other statsy subreddits:

r/samplesize: Here, you can post your own online research projects in order to (hopefully!) up your sample size. You can also work on your own research participation karma by participating in other people's research.

r/HITsWorthTurkingFor: Redditors share links for Amazon's mTurk tasks that are particularly profitable. There is all manner of work available here (not all of it research based), but as mTurk is increasingly popular with academic researchers, I'm adding it to this list. I think this may be a valuable discussion piece if you talk about mTurk in your research methods classes and want to discuss the variables that influence participation (and how this may affect your research) as well as whether or not mTurk workers are really a random sample of humanity.

r/statistics: A place for all of your statistics questions.

r/dataisugly: The opposite of r/dataisbeautiful. Can be used to teach students how to create good graphs by showing them how not to create good graphs.
