Posts

Predictions are only as good as the regularity of the event

Weather prediction is data. This makes weather-related data stories and examples highly relatable. The Washington Post published an interactive article that shows how accurate weather predictions are for a given city in the United States. This means that we, stats instructors, can use this page to provide a geographically personalized lesson on weather prediction, the limitations of data, and why predictions about the future are only as good as the consistency of the past. I also like this example because it isn't terribly mathy and encourages statistical literacy. Kommenda and Stevens, writing for the Washington Post, recently shared a story on the accuracy of weather predictions based on time away from the target day. Here, the DV is prediction accuracy, operationalized using the difference between predicted and actual high temperature. You could always ask your students how they would operationalize weather...or maybe some weather matters more than others? Folks in Erie...
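To make that operationalization concrete for students, here is a minimal Python sketch that averages the absolute gap between predicted and actual highs, grouped by how many days ahead the forecast was made. The forecast numbers are made up purely for illustration (they are not the Post's data), and the function name is hypothetical.

```python
from statistics import mean

# Made-up illustration data, NOT from the Washington Post piece:
# (days_ahead, predicted_high_F, actual_high_F)
forecasts = [
    (1, 71, 70), (1, 64, 66), (1, 80, 79),
    (4, 73, 70), (4, 60, 66), (4, 84, 79),
    (7, 77, 70), (7, 58, 66), (7, 70, 79),
]


def mean_absolute_error_by_lead(rows):
    """Average |predicted - actual| for each forecast lead time."""
    errors = {}
    for days_ahead, predicted, actual in rows:
        errors.setdefault(days_ahead, []).append(abs(predicted - actual))
    return {lead: mean(errs) for lead, errs in sorted(errors.items())}


mae = mean_absolute_error_by_lead(forecasts)
for lead, err in mae.items():
    print(f"{lead} day(s) out: off by {err:.1f} degrees F on average")
```

In this toy data the error grows with lead time, which is the pattern the DV is designed to capture. Students could swap in a different operationalization (say, precipitation hit or miss) and compare.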

Statistical thinking: What data would you need to collect to disprove the predictive power of astrological signs?

Okay. I haven't used this in class yet because it is July, and I just found it. However, I will open the Fall 2024 semester with this example. It is fun and accessible, showing how research can be used to study whether personality varies based on astrological signs. I will start by showing them a bunch of funny astrology memes (see above). Then, I'll ask them to think of ways to design a study to prove that astrology is/is not bunk. What sort of data would they need to collect to do this? Then, I'm going to show them this study (Joshanloo, 2024): https://onlinelibrary.wiley.com/doi/epdf/10.1111/kykl.12395?domain=author&token=BKSRDREAX9F3BKAWGVBD Statsy things to share with your students: 1. Archival data: The author used repurposed, vintage, federal data. The General Social Survey, specifically. Data scientists are trained to see the potential of random data sets. The horoscope sign was simple to determine since the GSS collects birthday data. The author was able to p...

Not a particularly statsy example, but still delightful.

I mean. This is the most entertaining research methodology I have ever seen. What did this look like? This is what it looked like. So, this is barely a statsy example, but it does include data outcomes: n = 175, with some snakes striking the boot (n = 6) and some coiling (n = 3). While PIs might try, no IRB would let you get away with asking your graduate student to step on snakes. Mostly, this is funny. I found his research, too. While I think the fake leg is highly amusing, I think it is great that Morris is a passionate advocate for snake education and teaching people to be tolerant of snakes they find in the wild. Finally, I heard about this research on an NPR story about snake handling classes (taught by Morris) in Arizona. A WHOLE CLASS.

Caffeine, calories, correlation

We need more nonsignificant but readily understood examples in our classes. This correlation/regression example from Information is Beautiful demonstrates that the caffeine in delicious caffeinated drinks does not correlate with the calories in the drink. Caffeine has zero calories. The things that make our drinks creamy and sweet may have calories. Easy peasy, readily understood, and this example gives your students a chance to think about and interpret non-significant, itty-bitty effect size findings. Click here for the data. Aside: Watch your language when using this example. We need calories to stay alive and none of these drinks, in and of themselves, are good or bad. Our students are exposed to way too much of that sort of language and thinking about food and their bodies. What they choose to drink or eat is none of our business. When I share this visual, I omit the information on the far right (exercise) and far left (calorically equivalent foods). It distracts from the...

Law of large numbers, via M&Ms and a GIF.

A quick, accessible example of the Law of Large Numbers. Using candy. Reddit user Jeffrowl counted the proportions of M&Ms across multiple bags, and you can see the proportions of colors reflect the true underlying population as the number of bags increases. Here is the link, and a screenshot of the GIF can be seen here: I don't use the M&M probability example in class, but many of you do. This is a nice addition to that example, but it also serves as a brief, standalone example. ALSO, to my nerdy delight, the author's responses include a Methods section: ...as well as information on baseline data:
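If you want students to watch the Law of Large Numbers happen without buying candy, a small simulation does the job. This is a minimal Python sketch, not a reconstruction of Jeffrowl's counts: it assumes six equally likely colors and 55 candies per bag, both of which are placeholder assumptions rather than the real M&M color mix or bag size.

```python
import random

# Assumptions for illustration only: equal color shares and a fixed bag size.
COLORS = ["blue", "orange", "green", "yellow", "red", "brown"]
CANDIES_PER_BAG = 55


def running_share(color, n_bags, seed=0):
    """Cumulative proportion of `color` after each successive simulated bag."""
    rng = random.Random(seed)  # seeded so the demo is repeatable in class
    seen = hits = 0
    shares = []
    for _ in range(n_bags):
        bag = rng.choices(COLORS, k=CANDIES_PER_BAG)
        seen += len(bag)
        hits += bag.count(color)
        shares.append(hits / seen)
    return shares


shares = running_share("blue", n_bags=500)
for n in (1, 10, 100, 500):
    print(f"after {n:>3} bags: blue share = {shares[n - 1]:.3f} "
          f"(true share = {1 / 6:.3f})")
```

The early bags bounce around, and the running share settles toward the true proportion as bags accumulate. Changing the seed gives a different wobble but the same destination.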

How the USAF collects hurricane data with big, big airplanes.

I am an Air Force Brat. Growing up, my dad used to talk about all of the services the USAF provides to our country and the world. It employs many musicians, advances airplane safety for civilians, and conducts and sponsors plenty of research. This post will focus on the USAF's unique position to advance weather and climate science via data collection in big, honkin' airplanes that can fly through hurricanes. Weather forecasting requires data. As reported by Debbie Elliott for NPR, the Air Force collects data that, specifically, will help us better predict severe weather and save lives. Aside: This whole mission started on a bet: HOW TO USE IN CLASS: -I tell my students repeatedly that I'm not trying to turn them into the world's best statisticians. I'm trying to help them learn how to be themselves, with their interests and abilities, but fluent in statistical literacy. This lesson goes better when I can have examples of data jobs that aren't traditi...

My other favorite stats newsletter: The Washington Post's How to Read This Chart

Like the Chartr newsletter, I love this as it feeds my fascination with data and provides interesting examples for the class. As I sit here writing (5/11/24), I am enjoying my other favorite stats newsletter, How to Read This Chart. The current newsletter discusses data visualizations used on the front page of the Post. Such as: Philip Bump lovingly curates this newsletter. One time, he found historic, unlabeled charts and asked readers for help interpreting them. I also found this one, which compared the margin of error and sample sizes used by major national polling firms, fascinating.

If you like this blog, you will love my new podcast...

My friends. I started a podcast. I've created the Not Awful Data podcast with the help of Garth Neufeld and Eric Landrum at the Psych Sessions podcast empire. Why? I try to keep my blog posts brief and to the point, but I also love to discuss exactly how I use my favorite data sets in the classroom. This podcast will let me discuss and highlight some of the data sets I've shared on my blog and provide more information on exactly how you could use them in class. Anyway. Every episode is five minutes long. I plan on posting a new episode once a week. My hope is that this will re-introduce you to some of my older resources and provide you with some out-of-the-box resources you can use in your own teaching. Here is a link to my first episode, which recaps the horror movie/heartbeat data I shared on the blog recently. The podcast is also available on Spotify.

One of my favorite stats mailing lists: Chartr

Chartr | Data Storytelling. Just subscribe. It is entertaining. I mean, look at this: Like, there is a part of my brain that can just doom scroll stats content. Stats scroll? That sounds like an R function. Anyway, that part of my brain loves Chartr.

Citizen Scientists, Unite! The Merlin App, Machine Learning, and Bird Calls

Every spring and summer, I become obsessed with the Merlin app. This app allows you to record bird songs using your phone and then uses machine learning to identify the bird call. The app can also do visual IDs if your phone has a much better camera than mine. It is like Pokémon Go. I have to catch them all. But no augmented reality, just reality reality. Here is my "life list" of all the birds I've identified in about a year of using the app: This app brings joy. It is also a quick example of how citizens can become scientists, how apps can generate data from citizen scientists, and how machine learning makes it work. So, this isn't a lengthy example for class, but it is an accessible example that shows how apps and phones can be harnessed for the greater good. And science is super fun. How this app gathers data from users: But how? Via machine learning: Here is even more info on how their machine learning works: AND THEN, the data can be used for scientific research...

Deer-related insurance claims from State Farm

We should teach with data sets representing ALL of our students. Why? You never know what example will stick in a student's head. One way to get information to stick is by employing the self-reference effect. For example, students who grew up in the country might relate to examples that evoke rural life. Like getting the first day of buck season off from school and learning how to watch out for deer on the tree line when you are going 55 MPH on a rural highway. Enter State Farm's data on the likelihood, per state, of a car accident claim due to collision with an animal (not specifically deer, but implicitly deer). Indeed, my home state of Pennsylvania is the #3 most likely place to hit a deer with your car. State Farm shares its data per state: https://www.statefarm.com/simple-insights/auto-and-vehicles/how-likely-are-you-to-have-an-animal-collision I am also happy to share my version of the data, in which I turned all probability fractions (1 out of 522) into probabili...
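Turning a "1 out of N" figure into a probability is a one-step calculation, and it is worth letting students do it themselves. Here is a minimal Python sketch; the function name is hypothetical, and 522 is simply the example figure quoted above, not tied to any particular state.

```python
def one_in_n_to_probability(n):
    """Convert odds stated as '1 out of n' into a probability between 0 and 1."""
    if n <= 0:
        raise ValueError("n must be a positive number")
    return 1 / n


# Example figure taken from the text above ("1 out of 522").
p = one_in_n_to_probability(522)
print(f"1 out of 522 = {p:.5f}, or about {p:.2%}")
```

Once every state is on the same probability scale, the values can go straight into a bar chart or a correlation with, say, a state's rural population share.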

Using pulse rates to determine the scariest of scary movies

The Science of Scare project, conducted by MoneySuperMarket.com, recorded heart rates in participants watching fifty horror movies to determine the scariest of scary movies. Below is a screenshot of the original variables and data for 12 of the 50 movies provided by MoneySuperMarket.com: https://www.moneysupermarket.com/broadband/features/science-of-scare/ Here is my version of the data in Excel format. It includes the original data plus four additional columns (so you can run more analyses on the data): -Year of Release -Rotten Tomatoes rating -Does this movie have a sequel (yes or no)? -Is this movie a sequel (yes or no)? Here are some ways you could use this in class: 1. Correlation: Rotten Tomatoes rating does not correlate with the overall scare score (r = 0.13, p = 0.36). 2. Within-subject research design: Baseline, average, and maximum heart rates are reported for each film. 3. ...
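For students who wonder why a correlation of 0.13 is not significant, a rough check helps. This Python sketch uses the Fisher z approximation, which is not necessarily the test behind the reported p-value, and it assumes n = 50 (one row per film). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_p_for_r(r, n):
    """Approximate two-sided p-value for a Pearson r via the Fisher z transform."""
    # atanh(r) is roughly normal with standard error 1 / sqrt(n - 3) under H0.
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# r comes from the text above; n = 50 is an assumption (fifty movies).
p_value = approx_p_for_r(r=0.13, n=50)
print(f"approximate two-sided p = {p_value:.2f}")
```

Under those assumptions the approximation lands in the same neighborhood as the reported p, well above .05. Students can raise n in the call to see how large a sample it would take before an r this small crossed the line.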

Correlation =/= causation, featuring positive psychology, hygge, and no math.

I have shared AMPLE examples for teaching correlations. Because I've got you, boo. Like, I have shared days' worth of lecture material with you, my people. I am adding one more example. I have used this example in my positive psychology course for years, and it really illustrates what can happen en masse when marketing departments and less-savory pop-psych elements try to establish causal relationships with features (stereotypes?) of happy countries and individuals' subjective well-being. I like this one because it is math-free, UG-accessible, and not terribly long. Joe Pinsker, writing for The Atlantic, argues that... https://www.theatlantic.com/family/archive/2021/06/worlds-happiest-countries-denmark-finland-norway/619299/ TL;DR: Just because Northern European nations consistently score the highest on global happiness data doesn't mean that haphazardly adopting practices from those countries will make you happy. Correlation doesn't equal causation. Here is the ...