Posts

Showing posts from 2025

Does unusually heavy traffic at pizzerias near the Pentagon predict global military activity?

While most of my class time is dedicated to the specifics of performing and interpreting inferential tests, basic statistical literacy and thinking are equally important lessons. Here are some of the big-picture literacy ideas I want my students to think about in my stats classes: 1. How can we use data to understand patterns to make predictions? 2. How can we separate the signal from the noise? 3. How can data actually inform real life and current events? 4. How can we repurpose existing data in a world where data is everywhere? Here is an example I JUST found that addresses all of these ideas. The Pentagon Pizza Report is an X account that monitors Google "Popular times" data in pizzerias near the Pentagon to predict military activity. The X account asserts that unusually high, later-than-normal foot traffic at pizzerias near the Pentagon (x) may indicate that Pentagon military staff are working late and need to grab take-out for dinner (y). Most recently, the...

An ode to Western Pennsylvania, in chi-square form

I've been writing this blog, statistics pedagogy articles, chapters, and a whole statistics textbook for over ten years. I'm at the point where I see silly stuff on the internet, and it automatically translates to a statistics example. Like this recent Tweet from Sheetz about the Pirates/Philly series this weekend. https://x.com/sheetz/status/1923397811778785489 This is an unapologetically Western PA tweet. I will be using it as a chi-square goodness-of-fit example with my Western PA students at Gannon University this Fall. I even created a data file that mimics the findings (Methods:  n  = 380, Results: p < .001. Conclusion: Sheetz followers on Twitter love some curly fry). If you are a poor, unfortunate soul who has never enjoyed treatz from Sheetz, I feel bad for you. Look up your favorite regional brands on Twitter and translate one of their polls into a chi-square example. Or travel to your nearest Sheetz to experience some damn joy. 
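For readers who want to see the goodness-of-fit arithmetic, here is a minimal sketch in Python. The counts below are hypothetical (they are not the real Sheetz poll results or the author's data file), the four-option layout is an assumption, and the null hypothesis is assumed to be equal preference across options. Only the sample size, n = 380, comes from the post.

```python
# Chi-square goodness-of-fit, computed by hand.
# ASSUMPTION: these counts are made up for illustration; they only match
# the post's sample size (n = 380), not the actual poll results.
observed = {
    "curly fries": 190,      # hypothetical count
    "other option 1": 80,    # hypothetical count
    "other option 2": 60,    # hypothetical count
    "other option 3": 50,    # hypothetical count
}

def chi_square_gof(counts):
    """Return (chi-square statistic, df) against equal expected counts."""
    n = sum(counts)
    k = len(counts)
    expected = n / k  # null hypothesis: every option is equally popular
    stat = sum((o - expected) ** 2 / expected for o in counts)
    return stat, k - 1

stat, df = chi_square_gof(list(observed.values()))

# Critical value of the chi-square distribution for alpha = .001 and df = 3.
CRITICAL_001_DF3 = 16.266

print(f"chi-square({df}) = {stat:.2f}")
print("p < .001" if stat > CRITICAL_001_DF3 else "p >= .001")
```

With these invented counts the statistic lands far above the critical value, which mirrors the p < .001 result described above. Swapping in counts from your own regional brand's poll only requires changing the `observed` dictionary (and the critical value, if the number of options changes).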

Full Discussion Board Idea #3: Deer-related car accidents by state.

State Farm, a prominent American insurance provider, shared data that ranked American states based on the number of animal-related (mostly deer) car accident claims filed per state. I blogged about this data previously, and I am returning to it now as part of my semi-regular Discussion Board Ideas series on this blog. I have been using this prompt in my online stats class in NW PA for about a year now. I'm going to share some of that success here. Note: PA is #4 for deer-related car accident claims, so this data resonates with my students. I use this for the fifth of seven weeks in my online class, so the students are comfortable with the class format and one another by then. Here is the exact prompt I use: I have a weird question for you: How do you think Pennsylvania ranks when it comes to the number of car accident insurance claims involving colliding with animals? Yes, I am on my soapbox about safe night-time driving in PA. Once you have your guess, check against...

PWA data visualizations on YouTube

A clever YouTuber, PWA, built a channel with nearly a million followers based on animated videos that compare nations based on data. Every nation is a sassy sphere. Each grows and shrinks in size, in comparison with other nations, as the data is presented. Like this image, illustrating national debt as a portion of GDP... I swear, it is funny and engaging without trying too hard. Also, for better or worse, framing data sharing and visualization as a thing that can make you a successful influencer WILL grab your students' attention. I think these videos would make good bell-ringers (TM Janet Peters) for the start of your class. This influencer makes a ton of videos, and they aren't all related to data, FYI. Here are a few good examples for Stats class: National Debt: In this clip, the countries compare their national debt. This video discusses some of the choices statisticians make with their data. For example, they compare their national debt in USD and then compare...

Dima Yarovinsky's "I Agree": Data visualization meets installation art piece.

Look at how Dima Yarovinsky turned the Terms and Conditions documents for several social media platforms into foreboding and beautiful art/bar graphs illustrating how much we sign away without reading. Note: He even uses the X axis to describe the length of and reading time for each T&C statement! I think data is beautiful. This example does a good job of showing the beauty and impact of good data visualizations to my students. This isn't a huge example to use in class, but I will use it next time I discuss bar graphs. For more from the artist, in his own words, visit his webpage. For a thoughtful review of this art, see this article by Emma Taggart.

Leo DiCaprio Romantic Age Gap Data: UPDATE

Does anyone else teach correlation and regression together at the end of the semester? Here is a treat for you: Updated data on Leonardo DiCaprio, his age, and his romantic partner's age when they started dating. A few years ago, there was a dust-up when a clever Redditor r/TrustLittleBrother realized that DiCaprio had never dated anyone over 25. I blogged about this when it happened. But the old data was from 2022. Inspired by this sleuthing,  I created a wee data set, including up-to-date information on his current relationship with Vittoria Ceretti, so your students can suss out the patterns that exist in this data.

A wee bit of Positive Psychology data related to money and death.

One of my favorite upper-level elective courses to teach is Positive Psychology. I recently came across a comprehensive account of various facets of how positive psychology can be assessed in nations: https://ourworldindata.org/happiness-and-life-satisfaction . Like, the website is just great. Below is an example of the data you can explore in various formats and with animation options, and you can download the data. It is great! From this website, I downloaded and compiled two data sets that capture GDP, Cantril Ladder Score, and life span data for hella countries. You can perform a variety of significant and non-significant correlations and regressions using this data. Additionally, the countries are divided into six regions, allowing you to conduct some one-way ANOVAs with your students. Here is the data, compiled by my awesome RA, Maddie: https://docs.google.com/spreadsheets/d/129NQcPdFwZjyzZAJdX6odKC7KiFk_Q1Lqa-SD4kk5FQ/edit?usp=sharing

Full Discussion Board Idea #2: Trends in love songs, as illustrated by The Pudding

You aren't a proper stats nerd if you have not scrolled for an hour through all of The Pudding's content. Thank goodness for The Pudding, which helped me spice up the discussion boards in my online stats class. For a long time, I emphasized rigor over wonder. In my stats class, I had functionally reasonable but not terribly engaging topics for class discussion. That changed last semester. I spiced up my discussion board with some of my favorite data visualizations, like this one about using a fast food app to track power outages after a natural disaster and this one that illustrates data on the efficacy of nutritional supplements in a beautiful and functional way. Here is another that lets students look at trends in art and wonder about how this may reflect on cultural shifts in courting and romantic relationships. TL;DR The Pudding recently shared a post about trends in love songs from 1958 through 2023. The whole interactive is very engaging and lets yo...

r/DataIsUgly

I have found plenty of class inspiration on Reddit. Various subs have provided a new way to explain mode and median and great, intuitive data to teach correlation. However, much as a reverse-coded item on a scale can be used to get at the opposite of what you are asking about, r/DataIsUgly is rife with examples of how NOT to present data, which can be used to teach how to create good data visualizations. Very recently, I shared this example from r/DataIsUgly to illustrate why NOT to truncate the Y axis. And...this sub is filled with people like us. People who love to proofread and notice data crimes. For example: How to use it in class? Can your students figure out why these data visualizations are...less than optimal? Can they fix them? They could be a fun prompt for extra credit points or a discussion board.

Annual snow fall moderates the relationship between daily snow fall and the likelihood of canceling school

Moderation isn't one of those things that we typically teach in Intro Stats. But it is a statistical tool your advanced undergraduates will likely encounter in an upper-level course. I'm not going to teach you how to teach your students how to do one. I am, however, going to share an example of what moderation is doing, inspired by living in the city in the US that has received the most snow this season (Erie, PA, with 93.9 inches for the season as of 1.30.25). About a year ago, CNN shared data on how much snow it takes to cancel school in various parts of the country. I assure you, Erie and the rest of Northwest PA (see red outline) get hella snow but no snow days. https://www.cnn.com/2024/02/12/us/how-much-snow-kids-school-snow-day-across-us-dg/index.html However, our lack of snow days isn't due to lack of snow. The annual amount of snow moderates the likelihood of canceling school, such that if you are used to a lot of snow (and have the infrastructure to handle it) you d...

Absolute vs. relative risk reporting: Lake effect snow edition

I maintain that relative versus absolute risk is a concept that we absolutely must teach in intro stats. I have given some examples of this before (murder! COVID!) but here is another one that hits home for my fellow Great Lakers. In particular, this one is for my friends in Detroit, Toledo, Cleveland, and Buffalo. Up here in Erie, a common point of discussion is how frozen Lake Erie is. Because once it freezes, the lake's moisture no longer feeds Dread Lake Effect Snow. I like this example because you can easily perform the math in front of your class, demonstrating that 26.14% is a 103.45% increase over 12.85%. At the same time, you have the visual to demonstrate that the vast majority of the lake was still unfrozen even with a 103.45% increase.
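If you would rather show the arithmetic on screen than on the whiteboard, here is a minimal sketch in Python. It assumes the two figures are the percentage of Lake Erie covered by ice at an earlier and a later time point; the variable names are my own. Note that with the rounded inputs quoted above, the relative increase works out to roughly 103.4%, a hair off the reported 103.45%, presumably because the source calculated from unrounded values.

```python
# Absolute vs. relative change in ice cover.
# ASSUMPTION: both numbers are percent of the lake frozen, as quoted in the post.
ice_before = 12.85  # percent of the lake frozen, earlier measurement
ice_after = 26.14   # percent of the lake frozen, later measurement

# Absolute change: simple difference, in percentage points.
absolute_change = ice_after - ice_before

# Relative change: the difference as a percent of the starting value.
relative_change = (ice_after - ice_before) / ice_before * 100

# Ratio: the later value as a percent of the earlier value.
# This is always exactly 100 points higher than the relative change.
ratio = ice_after / ice_before * 100

print(f"Absolute change: {absolute_change:.2f} percentage points")
print(f"Relative change: {relative_change:.1f}% increase")
print(f"Later value as a percent of earlier value: {ratio:.1f}%")
print(f"Still unfrozen: {100 - ice_after:.2f}% of the lake")
```

The contrast is the teaching point: a relative increase of more than 100% sounds dramatic, while the absolute change is about 13 percentage points and nearly three quarters of the lake is still open water.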

Full Discussion Board Idea #1: Repurposing gently-used, second-hand data during times of crisis

I can't be the only one teaching online statistics this Spring. Last fall, I refreshed ALL of my discussion boards for my online version of Psychological Statistics. I haven't done so since 2020, and my students responded well to my new discussion topics, all of which are centered on statistical literacy and improving problem-solving with data. My first one is based on this old blog post about how residents of Houston used a Whataburger location map to figure out which parts of Houston were without electricity following Hurricane Beryl. Here is how I presented it to my students: You never know where valuable data visualizations will come from! For instance, following Hurricane Beryl, Texans used the Whataburger app to track power outages across Houston. Whataburger is a popular restaurant chain in the South. Its app has a feature where users can quickly see open or closed locations. Normally, this is used by hungry people to find the closest, open location. HOWEVER: There are S...

Data can be equity: Merging of Major League Baseball and Negro League Baseball data.

I know it is January 2025, but I want to write about something that happened during the Spring of 2024. I think it is a story about how it is never too late to do the right thing, making it a great thing to think about here at the New Year. Data can't undo the past, but the way we manage it moving forward can provide the opportunity for some measure of equity. Back in May, professional baseball decided to include Negro League (NL) baseball stats, covering the leagues that existed from 1920 to 1948, as part of Major League Baseball (MLB) stats. This was done to allow for proper recognition of talented NL players. This changed some storied records for the league: https://www.mlb.com/news/stats-leaderboard-changes-negro-leagues-mlb This was a lot more than merging a couple of spreadsheets. As such, this story also serves as a lesson in data management and making disparate datasets the same. One that is a lot more moving than your typical story of data-cleaning. The following screenshots are from:  ...