Posts

Citizen Scientists, Unite! The Merlin App, Machine Learning, and Bird Calls

Every spring and summer, I become obsessed with the Merlin app. This app allows you to record bird songs using your phone and then uses machine learning to identify the bird call. The app can also do visual IDs if your phone has a much better camera than mine. It is like Pokémon GO. I have to catch them all. But no augmented reality, just reality reality. Here is my "life list" of all the birds I've identified in about a year of using the app: This app brings joy. It is also a quick example of how citizens can become scientists, how apps can generate data from citizen scientists, and how machine learning makes it work. So, this isn't a lengthy example for class, but it is an accessible example that shows how apps and phones can be harnessed for the greater good. And science is super fun. How this app gathers data from users: But how? Via machine learning: Here is even more info on how their machine learning works: AND THEN, the data can be used for scientific research...

Deer related insurance claims from State Farm

We should teach with data sets representing ALL of our students. Why? You never know what example will stick in a student's head. One way to get information to stick is by employing the self-reference effect. For example, students who grew up in the country might relate to examples that evoke rural life. Like getting the first day of buck season off from school and learning how to watch out for deer on the tree line when you are going 55 MPH on a rural highway. Enter State Farm's data on the likelihood, per state, of a car accident claim due to collision with an animal (not specifically deer, but implicitly deer). Indeed, my home state of Pennsylvania is the #3 most likely place to hit a deer with your car. State Farm shares its data per state: https://www.statefarm.com/simple-insights/auto-and-vehicles/how-likely-are-you-to-have-an-animal-collision I am also happy to share my version of the data, in which I turned all probability fractions (1 out of 522) into probabili...
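If you want students to do that conversion themselves, here is a minimal sketch of turning a "1 out of N" statement into a decimal probability. The function name is made up for illustration, and the only value used is the "1 out of 522" format quoted above; it is not tied to any particular state.

```python
from fractions import Fraction


def one_in_n_to_probability(text: str) -> float:
    """Convert a string like '1 out of 522' into a decimal probability."""
    numerator, denominator = text.lower().split(" out of ")
    # Strip thousands separators so '1 out of 1,000' also works.
    return float(Fraction(int(numerator), int(denominator.replace(",", ""))))


example = "1 out of 522"  # the format quoted in the post
p = one_in_n_to_probability(example)
print(f"{example} -> {p:.5f} ({p:.3%})")
```

Decimals are easier to sort, graph, and feed into software than "1 out of N" strings, which is the whole point of the conversion.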

Using pulse rates to determine the scariest of scary movies

The Science of Scare project, conducted by MoneySuperMarket.com, recorded heart rates in participants watching fifty horror movies to determine the scariest of scary movies. Below is a screenshot of the original variables and data for 12 of the 50 movies provided by MoneySuperMarket.com: https://www.moneysupermarket.com/broadband/features/science-of-scare/ Here is my version of the data in Excel format. It includes the original data plus four additional columns (so you can run more analyses on the data): -Year of Release -Rotten Tomatoes rating -Does this movie have a sequel (yes or no)? -Is this movie a sequel (yes or no)? Here are some ways you could use this in class: 1. Correlation: Rotten Tomatoes rating does not correlate with the overall scare score (r = 0.13, p = 0.36). 2. Within-subject research design: Baseline, average, and maximum heart rates are reported for each film. 3. ...
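For the correlation example, students can check how an r turns into a p-value. This is a rough sketch, not the analysis from the post: it assumes all 50 movies went into the correlation (so n = 50 is my assumption), converts r to a t statistic with n - 2 degrees of freedom, and gets the two-tailed p by numerically integrating the t-curve. The helper names are made up for illustration. Under that assumption, the result should land close to the reported p.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p: 1 minus the area between -|t| and +|t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps
    area = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = area * h / 3  # area under the curve from 0 to |t|
    return 1 - 2 * half_area


r = 0.13  # correlation reported in the post
n = 50    # ASSUMPTION: one data point per movie, all 50 movies included

df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
p_value = two_tailed_p(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, two-tailed p = {p_value:.2f}")
```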

Correlation =/= causation, featuring positive psychology, hygge, and no math.

I have shared AMPLE examples for teaching correlations. Because I've got you, boo. Like, I have shared days' worth of lecture material with you, my people. I am adding one more example. I have used this example in my positive psychology course for years, and it really illustrates what can happen en masse when marketing departments and less-savory pop-psych elements try to establish causal relationships with features (stereotypes?) of happy countries and individuals' subjective well-being. I like this one because it is math-free, undergrad-accessible, and not terribly long. Joe Pinsker, writing for The Atlantic, argues that... https://www.theatlantic.com/family/archive/2021/06/worlds-happiest-countries-denmark-finland-norway/619299/ TL;DR: Just because Northern European nations consistently score the highest on global happiness data doesn't mean that haphazardly adopting practices from those countries will make you happy. Correlation doesn't equal causation. Here is the ...

The limitations of regression...a mega remix

I enjoy fun ways to illustrate the fact that regression trends can't be extrapolated forever. Like, trends have to stop, right? Here is a v. recent one: Thank you, @ronburke! Thank you, @RomanFolw · https://www.nature.com/articles/431525a/figures/1
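If you want students to break a regression themselves, here is a small sketch. The years and times are completely made up for illustration (they are not from the figure linked above), and `fit_line` is just a hand-rolled least-squares helper. The fit looks perfectly sensible inside the range of the data and turns absurd once you push it far enough.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL data: a "winning time" in seconds that drifts down over the decades.
years = [1960, 1970, 1980, 1990, 2000, 2010]
times = [10.32, 10.21, 10.12, 9.98, 9.91, 9.79]

slope, intercept = fit_line(years, times)
zero_year = -intercept / slope  # where the fitted line hits zero seconds

for year in (2020, 2500, 3000, 3500):
    print(year, round(intercept + slope * year, 2))
print(f"The fitted line reaches zero seconds around the year {zero_year:.0f}")
```

Nobody finishes a race in zero (or negative) seconds, so the line has to stop describing reality somewhere past the data.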

Factorial ANOVA, Tai Chi, and the importance of base rates

I love JAMA Visual Abstracts. I have blogged about them before. They are great ways to illustrate 1) basic, intro stats topics, 2) excellent sci-comm, and 3) psych-adjacent medical examples. I learned about a recent JAMA publication on NPR (which you could play for your students). It compared blood pressure in people who were in a Tai Chi exercise condition versus an aerobic exercise condition: https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2814872 Here are some ways you could use it in class: 1. Simple factorial ANOVA research design. Two groups with a repeated-measures design makes me think "factorial ANOVA." I have not, but it would be easy to make a 2 x 2 bar graph with this data (the actual data is embargoed until December). 2. Active control group: The control group wasn't sitting on a couch. The control group was doing aerobic activities. 3. Lots of outcomes and potential for significance (and Type I error): The main thrust of this pap...

Explaining the median using a German game show.

This is a very brief example to spice up the measures of central tendency lecture. There is a game show in Germany, and one of the rounds of the game show is performing a perfect median split on food. OF COURSE, IT IS A BAVARIAN HOT PRETZEL. The "splitting championship" game is part of a larger video game. Here is the YouTube version and here is the Reddit version, with more deets on the game show. To be clear, we aren't talking about eyeballing here. The median split is an exact split by weight. Just as a statistical median split is an exact splitting of a data set. Here is a more exact screen grab: ALSO: Because I love a good internet rabbit hole, the Reddit source I found actually goes into detail about the German game show. Have fun.

Teaching Pre-Conference at SPSP 2024

Hey, all- Here is today's (2.8.24) presentation about working more statistics into your social psychology course. I'm mostly posting this for the folks who went to the conference because I told them I would, but feel free to use this advice to add some novel stats examples to your social psychology classes.

Social Comparison Theory: T-test, ANOVA, and a very common way to trichotomize data.

Hey! I'm giving a keynote at the February annual teaching pre-conference at the Society for Personality and Social Psychology conference. It's all about social psychology stats examples. Like this one! This one demonstrates social comparison theory without ever saying social comparison theory. YouGov published data (here is the full data source) that asked participants to rate their own, close-other, and far-others on several factors related to modern life (see below). In doing so, they unknowingly trigger social comparison theory, and in particular, downward social comparison. TL;DR: We know ourselves and how well we are doing compared to other people. And people are motivated to feel good about themselves. https://today.yougov.com/society/articles/48400-americans-compare-own-outlook-with-country-poll These findings smack of downward social comparison, right? Instead of having a specific target we are comparing ourselves to, like a co-worker or a neighbor,...

In which I compare t-curves with Brazilian butt lifts.

OK. This wasn't my original idea, but I love it so much that I'm blogging about it. The original idea came from Dr. Andrea Sell, who, in turn, got this idea from one of her brilliant students, Johanna Perez. How t-distributions are like Brazilian Butt Lifts: A treatise. First, familiarize yourself with the Brazilian Butt Lift: The fat doesn't leave. As illustrated below, the fat just moves...into the tail. https://ariamedtour.com/blogs/why-is-bbl-popular/ Is this not what William Gosset did when he created the t-curve? Instead of moving around fat, he moved around probability under the normal curve. He moved that probability into the tails. Both Ivo Pitanguy (inventor of the Brazilian Butt Lift) and William Gosset (inventor of the t-test) moved things around so as to...CREATE A THICKER (thiccer?) TAIL. THIS IS SUCH A PERFECT METAPHOR. See:
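For anyone who wants numbers to go with the metaphor, here is a minimal sketch that compares how much probability sits beyond ±2 under the normal curve versus under t-curves with a few degrees of freedom. The cutoff of 2 and the df values are arbitrary choices for illustration, and the helper names are made up. The total area under every curve is still 1; only its location changes.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_tail(cutoff: float, df: int, steps: int = 2000) -> float:
    """P(|T| > cutoff), found by integrating the middle with Simpson's rule."""
    h = cutoff / steps
    area = t_pdf(0.0, df) + t_pdf(cutoff, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    middle = 2 * area * h / 3  # area between -cutoff and +cutoff
    return 1 - middle


cutoff = 2.0  # arbitrary cutoff, chosen for illustration
normal_tail = 2 * (1 - NormalDist().cdf(cutoff))
print(f"normal curve: P(|Z| > {cutoff}) = {normal_tail:.4f}")
for df in (2, 5, 30):
    print(f"t-curve, df = {df:>2}: P(|T| > {cutoff}) = {t_two_tail(cutoff, df):.4f}")
```

The smaller the df, the more probability ends up in the tails, and as df grows the t-curve slims back down toward the normal curve.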

Update: Using baby name popularity to illustrate unimodal and bimodal data

I love internet-based teaching ideas. They are free and current. At least they were current when I first posted them, but some of my posts are ten years old. Such is the case for my old post about the Baby Name Voyager and how to use it to illustrate unimodal and bimodal distributions. Instead, please go to NameGrapher to show your students how flash-in-the-pan trendy baby names, like my own, have a unimodal distribution: As opposed to bimodal distributions, which flag a name as a more classical name that enjoyed a resurgence, like Emma: When I use this in class, I frame it between names that were trendy once and names that were trendy one hundred years ago and are again trendy. As a mom to grade-school-aged kids, I have certainly noticed this as a trend in kid names. So many Lilies and Noras! I also make sure my students understand that this information is gathered via Social Security Administration applications from the federal government, to back up another clai...

A recording of a statsy talk I gave at Murray State University.

Hey. Most of you have never met me and only read my words on this blog, so I thought it would be fun to share a recording of a talk I gave at Murray State University in October of this year. Not only do you get to see/hear me in action, I think this talk does a great job of summing up my approach to statistics and what I want my students to get out of my class. If you agree with my approach, may I gently suggest that you sign yourself up to get updates on my forthcoming WW Norton Psychological Statistics textbook: https://seagull.wwnorton.com/l/710463/2023-10-26/2tp3nt

Generate highly personalized music data using Exportify

Spotify generates gobs of data about music. Most people have seen the end-of-the-year data Spotify generates for each user about their listening patterns. Most people don't know that Spotify also generates a lot of data about individual songs. Some of it is straightforward: tempo, genre, length. However, Spotify also has its own niche way of quantifying songs: Danceability. Acousticness. Here is a whole list of their variables and descriptions from researchers at CMU: https://www.stat.cmu.edu/capstoneresearch/315files_s23/team23.html What does this mean for a stats teacher? You have access to highly personalizable data sets, rooted in music, with gobs and gobs of variables for each song...or artist...or album...or year of release...or genre (like, so many ways to divide up your data). For instance, I created a data set with Spotify data for 1989 and 1989 (Taylor's Version) to teach paired t-tests. How do Taylor's re-recordings compare to the originals?...
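To show what the paired t-test is doing with data like this, here is a bare-bones sketch. The numbers are completely made up (they are NOT real Spotify values or real tracks); the point is only that each song appears twice, so the test runs on the within-song differences.

```python
import math
from statistics import mean, stdev

# HYPOTHETICAL "danceability" scores for five songs, original vs. re-recording.
original = [0.62, 0.55, 0.71, 0.48, 0.66]
rerecorded = [0.60, 0.58, 0.69, 0.50, 0.70]

# A paired t-test is a one-sample t-test on the per-song differences.
differences = [new - old for old, new in zip(original, rerecorded)]
n = len(differences)
mean_diff = mean(differences)
sd_diff = stdev(differences)        # sample SD (n - 1 in the denominator)
standard_error = sd_diff / math.sqrt(n)

t_stat = mean_diff / standard_error
df = n - 1
print(f"mean difference = {mean_diff:.3f}, t({df}) = {t_stat:.2f}")
```

Students can then compare the t statistic to a critical value from a t table for df = n - 1, or hand the same two columns to their stats software and check that the t matches.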