Posts

Incorporating Hamilton: An American Musical into your stats class.

While I was attending the Teaching Institute at APS, I attended Wind Goodfriend's talk about using case studies in the classroom. Which got me thinking about fun case studies for statistics. But not, like, the classic story about Guinness Brewery and the t-test. I want case studies that feature a regular person in a regular job who used their personal expertise to draw conclusions from data and do something great. An example popped into my head while I was walking my dog and listening to the Hamilton soundtrack: Hercules Mulligan (portrayed in Hamilton by Okieriete Onaodowan). He was a spy for America during the American Revolution. He was a tailor and did a lot of work for British military officers. This gave him access to data that he shared through a spy network to infer the timing of British military operations. Here is a better summary, from the CIA: I like this example because he wasn't George Washington. And he wasn't Alexander Hamilton. He had t...

The Good Place and Replication

NOTE: SPOILERS ABOUND. The Good Place is on NBC. And I love it. At the heart of the show is one demon's (Michael) efforts to create a new version of hell that is only hellish because every person is already being punished by who and what they are. Right, I know. Anyway, in Season 3, Episode 11, Michael's bosses argue that this hell isn't working because it actually leads to self-improvement and fulfillment for everyone who is supposed to be tormented. And the bosses argue that the self-improvement is a fluke. So one of the other characters, a philosopher named Chidi, suggests...SCIENTIFIC REPLICATION!! The whole episode is great, but here are some screenshots to get started.

Diversity in Tech by InformationIsBeautiful

I am a fan of explaining the heart of a statistical analysis conceptually with words and examples, not with math. Information Is Beautiful has a gorgeous new interactive, Diversity in Tech, that uses data visualization to present gender and ethnic representation among employees at various big-name internet firms. I think this example explains why we might use Chi-Square Goodness of Fit. I think it could also be used in an I-O class. So, what this interactive gives you is a list of the main, big online firms. And then the proportions of different sorts of people who fall into each category. See below: When I look at that US Population baseline information, I see a bunch of Expected data. And then when I see the data for different firms, I see Observed data. So, I see a bunch of conceptual examples for Chi-Square Goodness of Fit. For example, look at gender. 51% of the population is female. That is your Expected data. Compare that to data for Indiegogo. They have 50% female e...

Snake Oil Superfoods by InformationIsBeautiful

In my stats classes, we discuss popular claims that have been proven/disproven by research. So, learning styles. Vitamins. One topic we dig into is the wide array of claims made about the health benefits of different foods and folk beliefs about nutrition. But how to get into it? That is such a big field, looking at different foods used for different conditions. Send your students to InformationIsBeautiful's Snake Oil Superfoods, which sorted through all of the good studies and created an interactive data viz to summarize them. For instance, these are three foods, backed by science, for very specific issues: BUT GET THIS: If you scroll over any of them, you get a quick summary of the findings AND a link to the research article. See below for Oats. NOICE. The information isn't limited to slam dunks, either; it fleshes out promising foods and weak links as well. AND...this is great...below the visualization there is all sorts of information on their methodo...

A big metaphor for effect sizes, featuring malaria.

TL;DR: Effect size interpretation requires more than numeric interpretation of the effect size. You need to think about what would be considered a big deal, a real-life change worth pursuing, given the real-world implications for your data. For example, there is a malaria vaccine with a 30% success rate undergoing a large-scale trial in Malawi. If you consider that many other vaccines have much higher success rates, 30% seems like a relatively small "real world" impact, right? However, two million people are diagnosed with malaria every year. If science could help 30% of two million, the relatively small effect of 30% is a big deal. Hell, a 10% reduction would be wonderful. So, a small practical effect, like "just" 30%, is actually a big deal, given the issue's scale. How to use this news story: a) Interpreting effect sizes beyond Cohen's numeric recommendations. b) A primer on large-scale medical trials and their ridiculously large n-sizes and tra...
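To put numbers on the scale argument above, here is a minimal Python sketch. It uses only the figures quoted in the post (two million diagnoses per year, 30% and 10% success rates). The function name `people_helped` is made up for illustration, and treating "success rate" as the share of yearly cases helped is a simplifying assumption, not how the trial defines efficacy.

```python
def people_helped(annual_cases, success_rate_pct):
    """Number of cases helped if success_rate_pct percent of cases benefit."""
    # Integer arithmetic keeps the head count exact.
    return annual_cases * success_rate_pct // 100


ANNUAL_CASES = 2_000_000  # figure quoted in the post

for rate in (10, 30):
    helped = people_helped(ANNUAL_CASES, rate)
    print(f"A {rate}% success rate applied to {ANNUAL_CASES:,} cases helps {helped:,} people")
```

The percentage looks modest on its own, but multiplied by the number of people affected it becomes hundreds of thousands of people, which is the point of the metaphor.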

McBee's "Sampling distribution under H0 and critical values"

I think that interactive visualizations are better than lengthy, wordy textbooks when it comes to illustrating statistical principles. One little GIF or interactive website can do a far better job than text or words. For example: Everything that Kristoffer Magnusson has given us (effect sizes, correlations, etc.). Here is a new tool for explaining critical regions in Intro Stats. Matthew McBee created an interactive in shinyapps that shows how critical regions change depending on a) the test and b) the sample size, along with the change in the shape of the distribution. With the ol' t-test, you can show how the critical values move around with degrees of freedom: what your t-test critical values look like at df = 3 versus how those critical values look at df = 80. Also, you can do the same thing but with F curves. Andplusalso: Matt has also created shiny apps to adjust p-values for multiple comparisons, AND another one for calculating p-values based on a test statistic ...
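If you want students to see the df = 3 versus df = 80 comparison as numbers too, here is a rough Python sketch. This is not McBee's code; it is a standard-library approximation that assumes a two-tailed test at alpha = .05, integrates the t density with Simpson's rule, and finds the cutoff by bisection. The function names are made up for illustration.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def upper_tail_area(c, df, steps=2000):
    """Area under the t curve to the right of c (c >= 0), via Simpson's rule."""
    h = c / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the distribution lies above 0; subtract the part between 0 and c.
    return 0.5 - total * h / 3


def t_critical(df, alpha=0.05):
    """Two-tailed critical value: the c with alpha / 2 in each tail."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection: shrink [lo, hi] around the cutoff
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > alpha / 2:
            lo = mid  # too much area in the tail, move the cutoff outward
        else:
            hi = mid
    return (lo + hi) / 2


for df in (3, 80):
    print(f"df = {df}: critical values are +/- {t_critical(df):.3f}")
```

The cutoffs come out at roughly 3.18 for df = 3 and 1.99 for df = 80, matching a printed t table: with more degrees of freedom, the critical values slide in toward the familiar 1.96 of the normal curve.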

Using manly beards to explain repeated measure/within subject design, interactions.

There are a lot of lessons in this one study (Craig, Nelson, & Dixson, 2019): Within subject design, factorial ANOVA and interactions, and data is available via OSF. Let's begin: TL;DR: The original study looked at the presence or absence of beards and whether or not this affected participants' ability to decode the emotional expression on a man's face. Or, more eloquently: TL;DR: Their stimuli were pictures of the same dudes with and without beards. And those weren't just any dudes, they had been trained in the Ekman facial coding system so as to make distinct expressions. Or... One participant, rating the same man in Bearded vs. Non-bearded condition, provides a clear example of within subject research design. This article also provides examples of interactions and two-way ANOVA. Here, look at aggression ratings for expression (happy v. angry) and face hairiness (clean-shaven v. beard). Look at that bearded face interaction! Bearded guy...

Use global climate change as a conceptual introduction to multiple regression

Eric Roston and Blacki Migliozzi put together some great data visualizations illustrating different factors that may or may not contribute to global climate change ("What's Really Warming the World?"). I couldn't capture it in this blog post, but the data is animated and interactive so as to highlight change over time. Very slick. This got me thinking about multiple regression, which studies different variables (X1, X2, ...) that may or may not contribute to some outcome (Y), and how we can use this website as a conceptual example of multiple regression. Here, the graph features multiple "predictor"/X1, X2, X3, X4 variables (greenhouse gases, ozone, land use, aerosols) as well as the predicted/Y variable (global temperature). We can see that aerosols are likely a very poor predictor while greenhouse gases are likely a good predictor. This page can also be used to explain plain old linear regression. This example compares one predictor/X...

xkcd comics and statistical thinking.

Xkcd is a gift to statistics instructors. Author Randall Munroe shares his humor and statistics knowledge. I think that many of his comics can be used as extra credit points, in that you don't get the joke unless you get the conceptual statistical knowledge behind the joke. NOTE: I have included images here, but you really, really should go to the original comics and cursor over them to view the alternative text. NOTE TWO: This is not a comprehensive list but I will try to update it as Munroe shares more comics. To teach APA formatting: https://xkcd.com/833/ To explain sufficient sample size in research: https://xkcd.com/507/ To explain good statistics manners/how to appropriately ask for stats help: https://m.xkcd.com/2116/ To explain error bars: https://xkcd.com/2110/ T-test and the t-curve: https://xkcd.com/2110/ Linear relationships: https://xkcd.com/605/ The Normal Curve: https://xkcd.com/2118/ Cherry picking, p-...

Elizabeth Page-Gould's PSY305: Treatment of Psychological Data

Two things this week: 1) Open Science Framework can be used to share teaching materials and 2) Dr. Page-Gould shows us how to do just that, and how to do it very well. Most people who would visit this blog have heard of the Open Science Framework. You probably know that it is a popular place to share research projects/data/pre-register your jam/share materials, but did you know that it is also a popular place to share teaching resources? Dr. Page-Gould recently shared her whole stinking upper level Stats/RM class, Treatment of Psychological Data. And it is beautiful and good and makes me feel like an entirely inadequate statistics instructor. Like...she shared EVERYTHING and it is beautiful and a great example of how to fold the "new statistics" into undergraduate stats. Lectures, example data, and lab resources (and rubrics for grading her labs) are available. This is an upper level course but it covers topics that should be included in Introduction to Statistics. I ...

Likelihood of Null Effects of large

This example provides evidence of data funny business beyond psychology, shows why pre-registration is a good thing, AND uses a chi-square. Bonus points for being couched in medicine and prominently featuring randomized controlled trials (RCT). Basically, Kaplan and Irvin's research checked out the results for RCTs funded with grants from the National Heart, Lung, and Blood Institute. See below for how they selected their studies: And what did they find? When folks started registering their outcomes, folks started to get fewer "beneficial" results. Which probably REALLY means that some of those previous "beneficial" results were not so beneficial, or the result of some data massaging. See below: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132382 Another reason to love this example: It is a real life chi-square that is easy to understand! I feel like I don't have enough great chi-square examples in my lif...

My Stats Snacks Pinterest Board

Y'all. I love collecting examples of awesome cookies and cakes and cupcakes beautified with statistics and data and graphs. Here is my Pinterest Board. My goal is to SOMEDAY have the time and skills to make some of my own. Until then, I will gush over other people's accomplishments:

Taco Bell and Chi-Square. Because of course this moment was coming.

Do you know what we need as statistics instructors? A) More chi-square examples that b) are rooted in Taco Bell condiments and c) are null. So here you go, as inspired by this tweet: This data did not achieve statistical significance, X^2 (3, N = 33) = 0.33, p = 0.95. The data suggests that these Taco Bell packets are randomly distributed. If you do this analysis by hand, here is your data: Diablo = 8, Hot = 9, Fire = 7, Mild = 9. If you do this analysis via software, here is the .csv version of the data, here is the .jasp version of the data, and here is a version of the data you can just copy and paste. Sauce Diablo Diablo Diablo Diablo Diablo Diablo Diablo Diablo Fire Fire Fire Fire Fire Fire Fire Fire Fire Hot Hot Hot Hot Hot Hot Hot Mild Mild Mild Mild Mild Mild Mild ...
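For anyone who would rather check the by-hand analysis in code, here is a small Python sketch of the chi-square goodness-of-fit test. It uses the four hand counts exactly as listed above (Diablo = 8, Hot = 9, Fire = 7, Mild = 9, which total 33 packets) and assumes the null hypothesis is equal proportions across the four sauces. It sticks to the standard library, so the p-value comes from the closed-form tail formula that only holds for df = 3; the helper name `chi_square_sf_df3` is made up for illustration.

```python
import math

# Hand counts as listed in the post
observed = {"Diablo": 8, "Hot": 9, "Fire": 7, "Mild": 9}

n = sum(observed.values())
expected = n / len(observed)  # equal share per sauce under the null
df = len(observed) - 1

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((count - expected) ** 2 / expected
                 for count in observed.values())


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


p_value = chi_square_sf_df3(chi_square)

print(f"X^2({df}, N = {n}) = {chi_square:.2f}, p = {p_value:.2f}")
```

Each sauce is expected to show up 8.25 times, and the observed counts barely stray from that, which is why the statistic is tiny and the p-value is so large. For other degrees of freedom, a stats package is the easier route than a hand-rolled tail formula.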