Not awful and boring ideas for teaching statistics

Showing posts from 2017

Stein's, "Could probiotics protect kids from a downside of antibiotics?"

Your students have heard of probiotics. In pill form, in yogurt, and if you are a psychology major, there is even rumbling that probiotics and gut health are linked to mental health. But this is still an emerging area of research. And NPR did a news story about a clinical trial that seeks to understand how probiotics may or may not help eliminate GI problems in children who are on antibiotics. Ask any parent, and they can tell you how antibiotics, which are wonderful, can mess with a kid's belly. When they are already sick. Science is trying to provide some insight into the health benefits of probiotics in this specific situation. They spell out the methodology: How to use in class: 1) What I love about this example is that the research is happening now, and very officially as an FDA clinical trial. So talk to your students about clinical trials, which I think you can then relate back to why it is good to pre-register your non-FDA research, with explicit research m...

'Nowhere To Sleep': Los Angeles Sees Increase In Young Homeless

Anna Scott, reporting for NPR, described changes to the homeless census in LA. It applies to stats/RM because an improvement in survey methodology led to a big change in the city's estimate of the number of homeless young adults. I also think this is a good piece for teaching because the story keeps coming back to Japheth Greg Dyer, a homeless college student who aged out of foster care and was sort of tossed into the world on his own. Straight from NPR: Homelessness hasn't necessarily increased dramatically. Instead, these findings seem to indicate that they finally have a reliable way to count young adult homelessness due to a better understanding of young adults. The dramatic increase is methodological.

Roeder's What If God Were A Giant Game Of Plinko?

Roeder, writing for fivethirtyeight.com, has come up with a new way to illustrate the Central Limit Theorem. And it uses Plinko, the beloved The Price is Right game! http://www.businessinsider.com/price-is-right-contestant-plinko-record-2017-5 Well, a variation upon Plinko, featured on the NBC game show The Wall. Their Plinko is much larger and more dramatic and the slots at the bottom go up to $1 million. See below. http://selenahughes.com/1553-2/ How does the CLT come into play? Well, the ball is randomly thrown down The Wall. And people jump around and hope for certain outcomes. But what outcome is most likely over time? For the pattern of ending positions to conform to the normal curve. Which it did. See below. https://fivethirtyeight.com/features/what-if-god-were-a-giant-game-of-plinko/ The article itself gets pretty spiritual as the author starts talking about randomness, of the show and of life. You can steal, but cite, the game show as an example of the CLT as he ...
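If you want students to watch the bell shape emerge for themselves, here is a minimal simulation sketch. It is not Roeder's code and it does not model The Wall's actual board: the 12 rows of pegs, the 50/50 bounce at every peg, the 10,000 balls, and the function names are all assumptions made for illustration. Each ball's final slot is just the number of times it bounced right, so the slot counts follow a binomial distribution, which looks more and more like the normal curve as the number of rows grows.

```python
import random
from collections import Counter


def drop_ball(rows, rng):
    """Return the slot index (0..rows): one point for every bounce to the right."""
    return sum(rng.random() < 0.5 for _ in range(rows))


def simulate(n_balls=10_000, rows=12, seed=538):
    """Drop n_balls down an idealized board and count how many land in each slot."""
    rng = random.Random(seed)  # fixed seed so the class demo is repeatable
    landed = Counter(drop_ball(rows, rng) for _ in range(n_balls))
    return [landed.get(slot, 0) for slot in range(rows + 1)]


counts = simulate()

# Crude text histogram: one '#' per 50 balls in a slot.
for slot, count in enumerate(counts):
    print(f"slot {slot:2d} | {'#' * (count // 50)}")
```

Changing `rows` from 2 to 12 to 30 in front of the class makes a nice before-and-after: with only a couple of rows the histogram is blocky, and with many rows it smooths into the familiar curve.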

Math With Bad Drawings' "Why Not to Trust Statistics"

Math With Bad Drawings has graced us with statistical funnies before (scroll down for the causality coefficient). Here is another one, a quick guide pointing out how easy it is to lie with descriptive statistics. Here are two of the examples; there are plenty more at Math With Bad Drawings. https://mathwithbaddrawings.com/2016/07/13/why-not-to-trust-statistics/

Using The Onion to teach t-tests

In the past, I've used fake data based on real research to create stats class examples. Baby names, NICUs, and paired t-test. Pain, surgical recovery, and ANOVA. Today, I've decided to use fake data and fake research to create a real example for teaching the one-sample t-test. It uses this research report from The Onion: https://www.theonion.com/toddler-scientists-finally-determine-number-of-peas-tha-1820347088 In this press release, the baby scientists claim that the belief that a baby could only smash four peas into their ear canal was false. Based upon new research recommendations, that number has been revised to six. Which sure sounds like a one-sample t-test to me. Four is the mu assumed true based upon previous baby ear research. And the sample data had a mean of 6, and this was statistically significant. Here is some dummy data that I created that replicates these findings, when mu/test value is set to 4: 5.00 6.00 7.00 6.00 5.00 6.00 6.00 5.0...
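For anyone who wants to show the arithmetic behind the one-sample t-test rather than just the software output, here is a small sketch. The ten pea counts below are made up for this illustration (the dummy data in the post are cut off above, so these are not those values); they were simply chosen to have a mean of 6. The function name is also hypothetical. The test value of 4 comes from the Onion story, and 2.262 is the standard two-tailed critical t for 9 degrees of freedom at alpha = .05.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu):
    """Return (t, df) for a one-sample t-test of `sample` against test value `mu`."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev() uses the n - 1 denominator
    return (mean(sample) - mu) / standard_error, n - 1


# Hypothetical pea counts (NOT the post's dummy data), chosen so the mean is 6.
peas = [6, 5, 7, 6, 6, 5, 7, 6, 5, 7]

t_stat, df = one_sample_t(peas, mu=4)
T_CRIT = 2.262  # two-tailed critical value for df = 9, alpha = .05

print(f"M = {mean(peas):.2f}, t({df}) = {t_stat:.2f}")
print("Reject the null" if abs(t_stat) > T_CRIT else "Fail to reject the null")
```

Students can change `mu` to 6 and watch t collapse to zero, which makes the point that the test is about the distance between the sample mean and the assumed population value.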

Yau's Real Chart Rules to Follow

The sum of the parts is greater than the whole? Nathan Yau's article on creating readable, useful graphs is a perfectly reasonable list of how to create a proper graph. The content is sound. So, that's good. However, the accompanying images and captions are hilarious. They will show your students how to make not-awful charts.

Wilson's "Why Are There So Many Conflicting Numbers on Mass Shootings?"

This example gets students thinking about how we operationalize variables. Psychologists operationalize a lot of abstract stuff. Intelligence. Grit. But what about something that seems more firmly grounded and countable, like whether or not a crime meets the criteria for a mass shooting? How do we define mass shooting? As shared in this article by Chris Wilson for Time Magazine, the official definition is 1) three or more people 2) killed in a public setting. That is per the current federal definition of a mass shooting. But that isn't universally accepted by media outlets. The article shares different metrics used for identifying a mass shooting, depending on what source is being used. Whether or not to include a dead shooter towards the total number killed. Whether or not the victims were randomly selected. I think the most glaring example from the article has to do with the difference that this definition makes on mass shooting counts: You could also discuss wi...

Logical Fallacy Ref Meme

So, I love me some good statsy memes. They make a brief, important point that sticks in the heads of students. I've recently learned of the Logical Fallacy Ref meme. Here are a couple that apply to stats class:

Climate Central's The First Frost is Coming Later

So, this checks off a couple of my favorite requisites for a good teaching example: You can personalize it, it is contemporary and applicable, and it illustrates a few different sorts of statistics. Climate Central wrote this article about first frost dates, and how those dates, and an increasing number of frost-free days, create longer growing seasons. The overall article is about how much less frosty the US is becoming as the Earth warms. They provide data about the first frost in a number of US cities. It even lists my childhood hometown of Altoona, PA, so I think there is a pretty large selection of cities to choose from. Below, I've included the screen grab for my current home, and the home of Gannon University, Erie, PA. The first frost date is illustrated with a line chart, but the chart also includes the regression line. (Data for frosty, chilly Erie, PA.) The article also presents a chart that shows how frost is related to the length of the growing season in t...

Using the Global Terrorism Database's code book to teach levels of measurement, variable types

A database code book is the documentation of all of the data entry rules and coding schemes used in a given database. And code books usually contain examples of every kind of variable and level of measurement you need to teach your students during the first two weeks of Intro Stats. You can use any code book from any database relevant to your own scholarship as an example in class. Or perhaps you can find a code book particularly relevant to the students or majors you are teaching. Here, I will describe how to use the Global Terrorism Database's code book for this purpose. The Global Terrorism Database is housed at the University of Maryland, has been tracking national and international terrorism since 1970, and has collected information on over 170,000 attacks. So, the database in and of itself could be useful in class. But, I will focus on just the code book for now, as I think this example cuts across disciplines and interests as all of our students are aware of terroris...

Compound Interest's "A Rough Guide to Spotting Bad Science"

I love good graphic design and lists. This guide to spotting bad science embraces both. And many of the signs of bad science are statistical in nature, or involve sketchy methods. Honestly, this could be easily turned into a homework assignment for research evaluation. This comes from Compound Interest (@compoundchem), which has all sorts of beautiful visualizations of chemistry topics, if that is your jam.

Izadi's "Black Lives Matter and America’s long history of resisting civil rights protesters"

Elahe Izadi, writing for The Washington Post, shared polling data from the 1960s. The data focused on public opinion about different aspects of the civil rights movement (March on Washington, freedom riders, etc.). The old data was used to draw parallels between the mixed support for the Civil Rights Movement of the 1960s and the mixed support for current civil rights protests, specifically, Black Lives Matter. Here is the Washington Post story on the polling data, the civil rights movement, and Black Lives Matter. The story is the source of all the visualizations contained below. Here is the original polling data. https://img.washingtonpost.com/wp-apps/imrs.php?src=https://img.washingtonpost.com/blogs/the-fix/files/2016/04/2300-galluppoll1961-1024x983.jpg&w=1484 https://img.washingtonpost.com/wp-apps/imrs.php?src=https://img.washingtonpost.com/blogs/the-fix/files/2016/04/2300-galluppoll1963-1024x528.jpg&w=1484 I think this is timely data. And...

Yau's "Divorce and Occupation"

Nathan Yau, writing for Flowing Data, provides a good example of correlation, median, and correlation not equaling causation in his story, "Divorce and Occupation". Yau looked at the relationship between occupation and divorce in a few ways. He used a variation upon the violin plot to illustrate how each occupation's divorce rate falls around the median divorce rate. Who has the lowest rate? Actuaries. They really do know how to mitigate risk. You could also discuss why median divorce rate is provided instead of mean divorce rate. Again, the actuaries deserve attention as they probably would throw off the mean. https://flowingdata.com/2017/07/25/divorce-and-occupation/ He also looked at how salary was related to divorce, and this can be used as a good example of a linear relationship: The more money you make, the lower your chances for divorce. And an intuitive exception to that trend? Clergy members. https://flowingdata.com/2017/07/25/divorce...
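To make the median-versus-mean discussion concrete, here is a tiny sketch. The divorce rates below are invented for illustration and are not Yau's numbers; the only feature borrowed from the story is that one occupation (standing in for the actuaries) sits well below everyone else.

```python
from statistics import mean, median

# Hypothetical divorce rates by occupation (NOT Yau's data).
# The first value plays the role of the unusually low actuaries.
rates = [0.17, 0.35, 0.36, 0.38, 0.40, 0.41, 0.43]

with_outlier = (mean(rates), median(rates))
without_outlier = (mean(rates[1:]), median(rates[1:]))

print(f"With the low outlier:    mean = {with_outlier[0]:.3f}, median = {with_outlier[1]:.3f}")
print(f"Without the low outlier: mean = {without_outlier[0]:.3f}, median = {without_outlier[1]:.3f}")
```

In this made-up set, dropping the one extreme occupation moves the mean about three times as far as it moves the median, which is the usual argument for reporting the median when a distribution has a long tail.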

Teach t-tests via "Waiting to pick your baby's name raises the risk for medical mistakes"

So, I am very pro-science, but I have a soft spot in my heart for medical research that improves medical outcomes without actually requiring medicine, expensive interventions, etc. And after spending a week in the NICU with my youngest, I'm doubly fond of a way of helping the littlest and most vulnerable among us. One example of such was published in the journal Pediatrics and written up by NPR. In this case, they found that fewer mistakes are made when not-yet-named NICU babies are given more distinct rather than less distinct temporary names. The unnamed-baby issue is a problem in the NICU, as babies can be born very early or under challenging circumstances, and the babies' parents aren't ready to name their kids yet. Traditionally, hospitals would use the naming convention "BabyBoy Hartnett" but several started using "JessicasBoy Hartnett" as part of this intervention. So, distinct first and last names instead of just last names. They measured patie...

Hedonometer.org

The Hedonometer measures the overall happiness of Tweets on Twitter. It provides a simple, engaging example for Intro Stats since the data is graphed over time, color-coded for the day of the week, and interactive. I think it could also be a much deeper example for a Research Methods class as the "About" section of the website reads like a journal article methods section, insomuch as the Hedonometer creators describe their entire process for rating Tweets. This is what the basic table looks like. You can drill into the data by picking a year or a day of the week to highlight. You can also use the sliding scale along the bottom to specify a time period. The website is also kept very, very up to date, so it is also a very topical resource. (Data for white supremacy attack in VA.) In the page's "About" section, they address many methodological questions your students might raise about this tool. It is a good example for the process researchers go ...

Sonnad and Collin's "10,000 words ranked according to their Trumpiness"

I finally have an example of Spearman's rank correlation to share. This is a political example, looking at how Twitter language usage differs in US counties based upon the proportion of votes that Trump received. This example was created by Jack Grieves, a linguist who uses archival Twitter data to study how we speak. Previously, I blogged about his work that analyzed what kind of obscenities are used in different zip codes in the US. And he created maps of his findings, and the maps are color coded by the z-score for frequency of each word. So, z-score example. Southerners really like to say "damn". On Twitter, at least. But on to the Spearman's example. More recently, he conducted a similar analysis, this time looking for trends in word usage based on the proportion of votes Trump received in each county in the US. NOTE: The screen shots below don't do justice to the interactive graph. You can cursor over any dot to view the word as well as the cor...
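If you would like to show students what Spearman's rank correlation is actually doing, here is a minimal sketch in plain Python. It is not the linguist's analysis: the six "counties", their vote shares, and the word rates are all invented, and the function names are mine. The idea it illustrates is the standard one, though: convert each variable to ranks (ties share the average rank), then compute an ordinary Pearson correlation on the ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values all get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sum((a - mx) ** 2 for a in x) ** 0.5
    spread_y = sum((b - my) ** 2 for b in y) ** 0.5
    return cross / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical counties: Trump vote share and one word's rate per 10,000 tweets.
trump_share = [0.22, 0.35, 0.48, 0.55, 0.63, 0.71]
word_rate = [1.0, 1.4, 1.3, 2.9, 3.0, 9.5]

rho = spearman(trump_share, word_rate)
print(f"Spearman's rho = {rho:.3f}")
```

Notice that the huge last value (9.5) does not dominate the result the way it would in a Pearson correlation on the raw numbers, because once the data are ranked it is simply "6th out of 6". That is a handy talking point for why rank correlations suit skewed word-frequency data.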

The Economist's "Ride-hailing apps may help to curb drunk driving"

I think this is a good first day of class example. It shows how data can make a powerful argument, that argument can be persuasively illustrated via data visualization, AND, maybe, it is a soft sell of a way to keep your students from drunk driving. It also touches on issues of public health, criminal justice, and health psychology. This article from The Economist succinctly illustrates the decrease in drunk driving incidents over time using graphs. This article is based on a working paper by PhD student Jessica Lynn (name twin!) Peck. https://cdn.static-economist.com/sites/default/files/imagecache/640-width/20170408_WOC328_2.png Also, maybe your students could brainstorm third variables that could explain the change. Also, New Yorkers: What's the deal with Staten Island? Did they outlaw Uber? Love drunk driving?

Kim Kardashian-West, Buzzfeed, and Validity

So, I recently shared a post detailing how to use the Cha-Cha Slide in your Intro Stats class. Today? Today, I will provide you with an example of how to use Kim Kardashian to explain test validity. So. Kim Kardashian-West stumbled upon a Buzzfeed quiz that will determine if you are more of a Kim Kardashian-West or more of a Chrissy Teigen. She Tweeted about it, see below. https://twitter.com/KimKardashian/status/887881898805952514 And she went and took the test, BUT SHE DIDN'T SCORE AS A KIM!! SHE SCORED AS A CHRISSY! See below. https://twitter.com/KimKardashian/status/887882791488061441 So, this test purports to assess one's Kim Kardashian-West-ness or one's Chrissy Teigen-ness. And it failed to measure what it claimed to measure as Kim didn't score as a Kim. So, not a valid measure. No word on how Chrissy scored. And if you teach people in their 30s, you could always use this example of the time Garbage's Shirley Manson...

Hickey's "The Ultimate Playlist Of Banned Wedding Songs"

I think this blog just peaked. Why? I'm giving you a way to use the Cha-Cha Slide ("Everybody clap your hands!") as a tool to teach basic descriptive statistics. Here is a list of the most frequently banned-from-wedding songs: Most Intro Stats teachers could use this within the first week of class, to describe rank order data, interval data, qualitative data, quantitative data, and the author's choice of percentage frequency data instead of straight frequency. Additionally, Hickey, writing for fivethirtyeight, surveyed two dozen wedding DJs about banned songs at 200 weddings. So, you can chat about research methodology as well. Finally, as a Pennsylvanian, it makes me so sad that people ban the Chicken Dance! How can you possibly dislike the Chicken Dance enough to ban it? Is this a class thing?

de Frieze's "‘Replication grants’ will allow researchers to repeat nine influential studies that still raise questions"

In my stats classes, we talk about the replication crisis. When introducing the topic, I use this reading from NOBA. I think it is also important for my students to think about how science could create an environment where replication is more valued. And the Dutch Organization for Scientific Research has come up with a solution: It is providing grants to nine groups to either 1) replicate famous findings or 2) reanalyze famous findings. This piece from Science details their efforts. The Dutch Organization for Scientific Research provides more details on the grant recipients, which include several researchers replicating psychology findings: How to use in class: Again, talk about the replication crisis. Ask your students to generate ways to make replication more valued. Then, give them a bit of faith in psychology/science by sharing this information on how science is on it. From a broader view, this could introduce the idea of grants to your undergraduates or get yo...

Harris's "Scientists Are Not So Hot At Predicting Which Cancer Studies Will Succeed"

This NPR story is about reproducibility in science that ISN'T psychology, the limitations of expert intuition, and the story is a summary of a recent research article from PLOS Biology (so open science that isn't psychology, too!). Thrust of the story: Cancer researchers may be having a similar problem to psychologists in terms of replication. I've blogged about this issue before. In particular, concerns with replication in cancer research, possibly due to the variability with which lab rats are housed and fed. So, this story is about a study in which 200 cancer researchers, post-docs, and graduate students took a look at six pre-registered cancer study replications and guessed which studies would successfully replicate. And the participants systematically overestimated the likelihood of replication. However, researchers with high h-indices were more accurate than the general sample. I wonder if the high h-indices uncover super-experts or super-researchers who have be...

Domonoske's "50 Years Ago, Sugar Industry Quietly Paid Scientists To Point Blame At Fat"

This NPR story discusses research detective work published in JAMA. The JAMA article looked at a very influential NEJM review article that investigated the link between diet and Coronary Heart Disease. Specifically, whether sugar or fat contributes more to CHD. The article, written by Harvard researchers decades ago, pinned CHD on fatty diets. But the researchers took money from Big Sugar (which sounds like...a drag queen or CB handle) and communicated with Big Sugar while writing the review article. This piece discusses how conflict of interest shaped food research and our beliefs about the causes of CHD for decades. And how conflict of interest and institutional/journal prestige shaped this narrative. It also touches on how industry, namely sugar interests, discounted research that finds a sugar:CHD link while promoting and funding research that finds a fat:CHD link. How to use in a Research Methods class: -Conflict of interest. The funding received by the researchers from th...

Chris Wilson's "The Ultimate Harry Potter Quiz: Find Out Which House You Truly Belong In"

Full disclosure: I have no chill when it comes to Harry Potter. Despite my great bias, I still think this psychometrically-created (with help from psychologists and Time Magazine's Chris Wilson!) Hogwarts House Sorter is a great example for scale building, validity, descriptive statistics, electronic consent, etc. for stats and research methods. How to use in a Research Methods class: 1) The article details how the test drew upon the Big Five inventory. And it talks smack about the Myers-Briggs. 2) The article also uses simple language to give a rough sketch of how they used statistics to pair you with your house. The "standard statistical model" is a regression line, the "affinity for each House is measured independently", etc. While you are taking the quiz itself, there are some RM/statsy lessons: 3) At the end of the quiz, you are asked to contribute some more information. It is a great example of a leading response options ...