
History of Data Science's Regression Game

There are already some pretty cool games for guessing linear relationships and regression lines: Dr. Hill's Eyeball Regression Game and the old, reliable Guess the Correlation game. However, I found a new one with a particularly gorgeous interface and a few extra features to help your learners. History of Data Science created the Regression game. It presents the player with a scatter plot, and the player guesses the y-intercept and slope. See that regression line? It is generated from the entered a and b values and updates as they change, which makes it a good learning tool. At the "easy-peasy" level, the player can even change those numbers multiple times over the course of 30 seconds and watch the corresponding line update. I think this game is a nice way to break up the ol' regression lecture, and it lets students see the relationship between a scatter plot and its regression line.
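If you want students to see the same idea in code, here is a toy sketch (not from the game itself, and with made-up data) of what the player is doing: guessing an intercept a and slope b, then comparing the guess against the least-squares fit.

```python
# Toy sketch of the regression-game idea: fit y = a + b*x by least squares,
# then score a player's guessed intercept and slope against the fit.
# All data here are simulated, not from the actual game.
import random

random.seed(1)
x = list(range(20))
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]  # true line: y = 2 + 0.5x plus noise

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
# Least-squares slope and intercept from the usual formulas.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

guess_a, guess_b = 2.5, 0.4  # a hypothetical player's guess
score = abs(a - guess_a) + abs(b - guess_b)  # one simple way to score a guess
print(f"fit: y = {a:.2f} + {b:.2f}x, guess error: {score:.2f}")
```

With enough points, the fitted b lands close to the true slope of 0.5, which is exactly the intuition the game builds by redrawing the line as the entered values change.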

Stats nerd gift list

This isn't a post full of teaching resources. Instead, it is a post of gifts and treats for stats nerds, who might also teach stats, so it still falls under the purview of this blog. Bonus points because many of these suggestions put money into creators' pockets.

Statsy Etsy Shops

NausicaaDistribution is a great shop on Etsy. I own multiple products, including the ABC's of Statistics Poster shown below. It is beautiful and framed in my office.

Another Etsy maker I like is TheChemistTree. I have a set of the coasters, and they've held up well. https://www.etsy.com/shop/TheChemistTree?ref=simple-shop-header-name&listing_id=501955501&search_query=statistics

Stats expert Chelsea Parlett shares her stats knowledge on Twitter and on Etsy, via her stickers. https://www.etsy.com/shop/ChelseaParlettDesign

DataSwagCo is a newer shop with some funny, punny stats goods. https:/...

Dirty Data: Share the data in a way that is functionally inaccessible

In my intro stats class, we discuss shady data practices that aren't lying, because they report actual numbers, but are still shady because good data is presented in a way that misleads or confuses. These topics include:

- Truncating the y-axis
- Collecting measures of central tendency under ideal circumstances
- Manipulating online ratings (I haven't written the blog post about this yet, but it is coming)
- Relative vs. absolute risk

AND HERE IS ANOTHER ONE: Insurance companies were asked to provide price data per the Transparency in Coverage Rule in the Consolidated Appropriations Act of 2021. Google that if you want to know more; I'm not going into it. Not my lane. That said, it is an appealing idea. Let's have some transparency in our jacked-up healthcare system. And the insurance companies provided the data, but in a way that is inaccessible to most people. Like, all people, maybe? Because they just splurted out 100 TB of data. So, they totally com...

Assessing an intervention: A quick exercise for your classes, specialized to your own university.

Here is a quick RM review I created for my Psych Stats students. We were preparing for the first exam, which covered the very basics of research methodology, including IVs and DVs. We also talk about data visualizations and how they can quickly convey information. California is dealing with an energy crisis and a heatwave. The state tried a relatively inexpensive intervention to reduce the likelihood of overwhelming the energy grid: sending out text messages during periods of extremely high energy usage. See: https://www.bloomberg.com/news/articles/2022-09-07/a-text-alert-may-have-saved-california-from-power-blackouts And what happened? People reduced their electricity usage. For the class review, I asked my students to think of the emergency alerts they receive from their university via our campus safety app. I challenged them to think of a c...

Missing data leads to conspiracy theory

This is a funny, small example for anyone who discusses managing missing data in a database. It also touches on what can go wrong when you use someone else's data or merge datasets. So, this piece of information made the rounds in August: This isn't a lie. The voter rolls in Racine had over 20,000 voters with the same phone number. That led to measured responses from voting rights experts on Twitter. Redhibiscus was so close to the truth! I assure you, if you have ever dealt with complicated databases, especially ones that have been merged and go back decades, it isn't unusual to fill in missing data with the same specific number over and over. Here is a fact check from the A.P.: https://apnews.com/article/fact-checking-612360682016?utm_source=Twitter&utm_campaign=SocialFlow&utm_medium=APFactCheck This isn't a big lesson for a statistics class, but it is a funny and horrifying example of how database management practices fueled a conspiracy theory. It is al...
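If you want to show the mechanics in class, here is a toy illustration (all data invented, not the Racine voter rolls) of how filling missing values with one placeholder makes a "shared" value appear thousands of times when someone later counts the column.

```python
# Toy illustration of a sentinel value for missing data. The records and
# placeholder below are hypothetical, not from any real voter file.
from collections import Counter

PLACEHOLDER = "000-000-0000"  # sentinel used when the phone field is missing

voters = [
    {"name": "A", "phone": "555-123-4567"},
    {"name": "B", "phone": None},
    {"name": "C", "phone": None},
    {"name": "D", "phone": "555-987-6543"},
]

# Legacy-style cleanup: fill every missing phone with the same sentinel.
for v in voters:
    if v["phone"] is None:
        v["phone"] = PLACEHOLDER

# Later, an analyst counts phone numbers and "discovers" a shared number.
counts = Counter(v["phone"] for v in voters)
print(counts.most_common(1))  # the sentinel dominates, not a real shared line
```

Scale the toy file up to a decades-old merged database and you get tens of thousands of voters "sharing" one phone number, which is database hygiene, not fraud.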

Bella the Waitress: A fun hypothesis testing example.

Waitress Bella is on TikTok. She shares her beach looks and hauls, like plenty of other influencers. Recently, though, she shared a series of TikToks that have a home in our statistics and research methods classes. Bella had a hypothesis: she suspected that certain hairstyles influenced her customers to tip her more. So Bella tested her hypothesis over a series of within-subject, n = 1 experiments at work (Bella, 2022a; Bella, 2022b; Bella, 2022c). This isn't a pre-registered paper with open data, but I think it could be a good discussion piece in a research methods or statistics class. I swear that Kate isn't my burner account. If you really, really wanted to test this hypothesis properly, what would that research look like?

1) What external factors influence tips (day of the week, time of day, etc.)?
2) What factors influence reactions to waitstaff (gender, attractiveness, alcohol)?
3) Would you use a within or between research design to study this (different waitstaff wit...
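For a stats class, the discussion could even end in a small analysis. Here is a hedged sketch, with made-up tip percentages (not Bella's actual numbers), of comparing two hairstyle conditions with a simple permutation test on the difference in mean tips.

```python
# Toy analysis of Bella-style data: hypothetical tip percentages by shift
# under two hairstyle conditions, compared with a permutation test.
import random
from statistics import mean

random.seed(0)
pigtails = [22.1, 25.4, 19.8, 24.0, 23.5]  # invented tip % per shift
bun = [18.0, 20.2, 17.5, 21.1, 19.4]       # invented tip % per shift

observed = mean(pigtails) - mean(bun)  # observed difference in mean tip %

# Permutation test: shuffle all shifts, resplit into two groups, and count
# how often a difference at least as extreme as the observed one appears.
pooled = pigtails + bun
n = len(pigtails)
reps = 5000
extreme = 0
for _ in range(reps):
    random.shuffle(pooled)
    diff = mean(pooled[:n]) - mean(pooled[n:])
    if abs(diff) >= abs(observed):
        extreme += 1
p_value = extreme / reps

print(f"observed diff = {observed:.2f} percentage points, p ≈ {p_value:.3f}")
```

A permutation test is a nice fit for a classroom n-of-one example like this: no distributional assumptions, and the shuffling logic makes the null hypothesis concrete.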

How to investigate click-bait survey claims

Michael Hobbes shared a Tweet from Nick Gillespie. That Tweet was about an essay from The Bulwark, and the essay plays fast and loose with Likert-type scale interpretation. The way Hobbes and his Twitter followers break down the issues with this headline provides a lesson on how to examine suspicious research clickbait that doesn't pass the sniff test. First off, who says "close to one in four"? And why are they evoking the attempt on Salman Rushdie's life, which did not happen on a college campus and is unrelated to high-profile campus protests of controversial speakers? Hobbes dug into the survey cited in the Bulwark piece. The author of the Bulwark piece interpreted the data by collapsing across response options on a Likert-type response scale. That can be done responsibly, I think. "Very satisfied" and "satisfied" are both happy customers, right? But this is suspicious. Other Twitter users questioned the question and how it may leave room for i...
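To make the collapsing issue concrete for students, here is a toy example with invented percentages (not the actual survey's numbers) showing how folding a lukewarm middle option into the top category can manufacture a "close to one in four" headline.

```python
# Toy Likert-collapsing example. The response counts are invented to show
# the mechanic, not taken from the survey discussed in the Bulwark piece.
responses = {
    "strongly agree": 6,
    "somewhat agree": 17,
    "neither": 30,
    "somewhat disagree": 22,
    "strongly disagree": 25,
}
total = sum(responses.values())

# The cautious reading: only the strongest category.
strong_only = responses["strongly agree"] / total

# The headline reading: collapse "somewhat agree" into "agree".
collapsed = (responses["strongly agree"] + responses["somewhat agree"]) / total

print(f"strongly agree alone: {strong_only:.0%}")
print(f"collapsed 'agree': {collapsed:.0%}")  # suddenly "close to one in four"
```

Same data, two headlines: 6% versus 23%. Whether the collapse is responsible depends on whether "somewhat agree" really belongs with "strongly agree" for the claim being made, which is exactly the question Hobbes's thread raises.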