
Posts

Absolute vs. relative risk reporting: Lake effect snow edition

I maintain that relative versus absolute risk is a concept that we absolutely must teach in intro stats. I have given some examples of this before (murder! COVID!), but here is another one that hits home for my fellow Great Lakers. In particular, this one is for my friends in Detroit, Toledo, Cleveland, and Buffalo. Up here in Erie, a common point of discussion is how frozen Lake Erie is, because once it freezes, the lake's moisture no longer feeds Dread Lake Effect Snow. I like this example because you can easily perform the math in front of your class, demonstrating that 26.14% is a 103.45% increase over 12.85%. At the same time, you have the visual to demonstrate that the vast majority of the lake was still unfrozen even with a 103.45% increase.
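If you want to do the arithmetic on screen, here is a minimal sketch. It assumes the two percentages quoted above are the frozen share of the lake before and after; the variable names are mine. With these rounded figures the relative increase comes out near 103.4%, so any small gap from the reported 103.45% is probably rounding in the source numbers.

```python
# Figures quoted in the post (assumed to be the frozen share of Lake Erie).
before_pct = 12.85
after_pct = 26.14

# Absolute change, in percentage points.
point_change = after_pct - before_pct

# Relative change: the increase expressed as a percentage of the starting value.
relative_increase = 100 * (after_pct - before_pct) / before_pct

# "Percent of" is a different quantity: the new value as a share of the old one.
percent_of_original = 100 * after_pct / before_pct

print(f"Absolute change: {point_change:.2f} percentage points")
print(f"Relative increase: {relative_increase:.1f}%")
print(f"New value as a percent of the old: {percent_of_original:.1f}%")
print(f"Still unfrozen: {100 - after_pct:.2f}%")
```

The last line is the punchline for students: a triple-digit relative increase can still leave almost three quarters of the lake open.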

Full Discussion Board Idea #1: Repurposing gently-used, second-hand data during times of crisis

I can't be the only one teaching online statistics this Spring. Last fall, I refreshed ALL of my discussion boards for my online version of Psychological Statistics. I hadn't done so since 2020, and my students responded well to my new discussion topics, all of which are centered on statistical literacy and improving problem-solving with data. My first one is based on this old blog post about how residents of Houston used a Whataburger location map to figure out which parts of Houston were without electricity following Hurricane Beryl. Here is how I presented it to my students: You never know where valuable data visualizations will come from! For instance, following Hurricane Beryl, Texans used the Whataburger app to track power outages across Houston. Whataburger is a popular restaurant chain in the South. Its app has a feature where users can quickly see open or closed locations. Normally, this is used by hungry people to find the closest open location. HOWEVER: There are S...

Data can be equity: Merging of Major League Baseball and Negro League Baseball data.

I know it is January 2025, but I want to write about something that happened during the Spring of 2024. I think it is a story about how it is never too late to do the right thing, making it a great thing to think about here at the New Year. Data can't undo the past, but the way we manage it moving forward can provide the opportunity for some measure of equity. Back in May, professional baseball decided to include stats from Negro League (NL) baseball, which existed from 1920 to 1948, as part of Major League Baseball (MLB) stats. This was done to allow for proper recognition of talented NL players. This changed some storied records for the league: https://www.mlb.com/news/stats-leaderboard-changes-negro-leagues-mlb This was a lot more than merging a couple of spreadsheets. As such, this story also serves as a lesson in data management and making disparate datasets the same, one that is a lot more moving than your typical story of data-cleaning. The following screenshots are from:  ...

Truncated Y-axis, but with female celebrities.

Why did I find this after my textbook was published? Damn it. I have a whole section about how Y-axis manipulation can make small differences look huge and then...I find this. Damn it. Source:   https://www.reddit.com/r/dataisugly/comments/1hjr01o/height_of_female_popstars/

Modal religions by county in the U.S.

I love my more elaborate examples, but this is a short, sweet, and interesting way to refresh your measures of central tendency lecture when you explain mode. I present you with the modal religions in each U.S. county: Found on Reddit:   https://www.reddit.com/r/dataisbeautiful/comments/1hejglm/most_common_religion_in_every_us_county_oc/ Source of Original Data: https://www.thearda.com/?gad_source=1&gclid=CjwKCAiA9vS6BhA9EiwAJpnXw7IpjxFvuiS3UvLycZrZ2ggtEzS2JDR-ow0mksK-9rD06G8Lgq6mlhoC1nwQAvD_BwE

An interactive that gets your students thinking about medians, percentiles, and their own sleeping habits.

My students struggle with sleeping and are distracted by electronics. This interactive activity allows them to think about their sleep relative to norms regarding age and sex. It also dives deeply into how sleep changes over a person's lifespan, which is a topic suitable for non-stats classes like Health or Developmental. https://www.washingtonpost.com/wellness/interactive/2024/sleep-data-survey-americans/ *You need a WaPo subscription or paywall buster to get to this interactive. Like this one! https://www.removepaywall.com/search?url=https://www.washingtonpost.com/wellness/interactive/2024/sleep-data-survey-americans/ Here is a quick interactive that lets your students a) see how well they sleep in comparison to their demographic and b) think about median data and percentile data. 1. Repurposed, gently used data is really everywhere. This interactive uses data from the Census Bureau, which is one way to measure sleep, but not the only way. 2. Median and percentil...

Uncrustables consumption rates by NFL teams 1) do not vary by league, 2) do not correlate with 2023 wins

Many thanks to Dr. Sara Appleby for sharing this data with me!! I really enjoy silly data, like this one from Jayson Jenks, writing for The Athletic, which shows how many Uncrustables each team eats per week. Well, data from the teams that elected to participate and/or didn't make their own PB and Js. The whole article is fun, so give it a read. It makes sense that hungry athletes would go for a quick, calorie-dense, nostalgic snack containing protein. Here is the data visualization: Damn, Denver. I entered this data into a spreadsheet for all of us. Spoiler alert: The number of Uncrustables eaten per week does not vary by league (independent t-test example), and the number of wins in 2023 does not correlate with the number of Uncrustables eaten per week in 2023 (correlation/regression example). Also, for my own curiosity, I re-ran the data after deleting Denver, and it wasn't enough of a difference to achieve significance.
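For anyone who wants students to see what sits under those two analyses, here is a rough sketch of the test statistics in plain Python. The function names are mine, and the numbers at the bottom are made-up placeholders, NOT the Uncrustables data; swap in the columns from the spreadsheet. It computes the statistics only, so you would still look up the p-values (or use your usual stats software).

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's independent-samples t (pooled variance) and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# PLACEHOLDER values, invented purely to show the function calls.
toy_group_a = [1, 2, 3]
toy_group_b = [2, 3, 4]
toy_wins = [4, 9, 11]

t_stat, df = independent_t(toy_group_a, toy_group_b)
r = pearson_r(toy_group_a, toy_wins)
print(f"t({df}) = {t_stat:.3f}")
print(f"r = {r:.3f}")
```

Re-running after dropping an outlier like Denver is just a matter of removing that row from both lists before calling the functions again.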

Subways! Murder! Absolute vs. relative risk!

When I teach the basics of probability in Intro Stats, I always emphasize absolute vs. relative risk. I am delighted to have a brand new example. Thanks to Dr. Sy Islam for sending it my way. Here is the headline from The New York Post: So, one murder is too many murders. A 60% increase feels very scary. Because relative risk is always the scary risk. Since this reporting concerns murder, the reporting itself should be serious, right? Well, what were the absolute values for subway murders? I mean, The New York Post would never, ever want to instill fear in people, right? Well, despite this headline, The New York Post, much to its credit, did include the actual data in the actual article: Eight murders, versus five in the previous years. Which is terrible, but not nearly as frightening as a (checks notes) SOARING 60% increase. Anyway. Ta-da! Use this in your class.
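A quick sketch you could run in class to show why small baselines make relative change look dramatic. The 5 and 8 come from the article as described above; the 500-to-503 comparison is a hypothetical I invented for contrast, and the function name is mine.

```python
def relative_change(old, new):
    """Percent change from old to new."""
    return 100 * (new - old) / old


# Counts reported in the article.
post_old, post_new = 5, 8

# HYPOTHETICAL larger baseline with the same absolute change, for contrast only.
hypo_old, hypo_new = 500, 503

for label, old, new in [("Article", post_old, post_new),
                        ("Hypothetical", hypo_old, hypo_new)]:
    print(f"{label}: +{new - old} absolute, "
          f"{relative_change(old, new):.1f}% relative")
```

Same three additional cases, wildly different percentages. The headline number depends on the denominator as much as on the change itself.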

Turn your data into a GIF with Google

Google will let you make a GIF of your data. I made this GIF using YouGov data. So far, it lets you make four different kinds of GIFs. This is a small tool, but it is an excellent alternative/supplement when you are teaching students how to present their data in a PowerPoint. It isn't very showy, which is the point, I think. You wouldn't want to be distracted from your data, but it adds some motion. ALSO: I personally love GIFs.

Paris Olympics 2024: I'm here for the dank memes

 

Whataburger Index: Operationalizing power outages in hurricane ravaged Texas.

As a stats nerd, I love it when clever people make lives easier by finding clever, easy, indirect ways to estimate the thing they want to measure. As a statistics instructor, I find such examples engaging, as they encourage students to think critically and nurture their statistical literacy. Like the Waffle House Index. TL;DR: During weather emergencies, the federal government tracks whether or not Waffle Houses are open as a proxy for the severity of damage in a community. Waffle Houses are tough as hell, and if they close, a community needs help. Below is a map of Waffle Houses. https://www.scrapehero.com/store/wp-content/uploads/maps/Waffle_House_USA.png Due to Hurricane Beryl, the people of Houston, Texas discovered an even more accurate measure of the severity of electricity outages: The Whataburger Index: https://www.facebook.com/photo/?fbid=8242206945824619&set=gm.2698315720337038&idorvanity=1416658058502817 Certainly, Waffle House exists in Texas. 126...

Predictions are only as good as the regularity of the event

Weather prediction is data. This makes weather data-related stories and examples highly relatable. The Washington Post published an interactive article that shows how accurate weather predictions are for a given city in the United States. This means that we, stats instructors, can use this page to provide a geographically personalized lesson on weather prediction, the limitations of data, and why predictions about the future are only as good as the consistency of the past. I also like this example because it isn't terribly mathy and encourages statistical literacy. Kommenda and Stevens, writing for the Washington Post, recently shared a story on the accuracy of weather predictions based on time away from the target day. Here, the DV is prediction accuracy, operationalized using the difference between predicted and actual high temperature. You could always ask your students how they would operationalize weather...or maybe some weather matters more than others? Folks in Erie...

Statistical thinking: What data would you need to collect to disprove the predictive power of astrological signs?

Okay. I haven't used this in class yet because it is July, and I just found it. However, I will open the Fall 2024 semester with this example. It is fun and accessible, showing how research can be used to study whether personality varies based on astrological signs. I will start by showing them a bunch of funny astrology memes (see above). Then, I'll ask them to think of ways to design a study to prove that astrology is/is not bunk. What sort of data would they need to collect to do this? Then, I'm going to show them this study (Joshanloo, 2024): https://onlinelibrary.wiley.com/doi/epdf/10.1111/kykl.12395?domain=author&token=BKSRDREAX9F3BKAWGVBD Statsy things to share with your students: 1. Archival data: The author used repurposed, vintage, federal data. The General Social Survey, specifically. Data scientists are trained to see the potential of random data sets. The horoscope sign was simple to determine since the GSS collects birthday data. The author was able to p...