Posts

Showing posts with the label interactive

Predictions are only as good as the regularity of the event

Weather prediction is data. This makes weather data-related stories and examples highly relatable. The Washington Post published an interactive article that shows how accurate weather predictions are for a given city in the United States. This means that we, stats instructors, can use this page to provide a geographically personalized lesson on weather prediction, the limitations of data, and why predictions about the future are only as good as the consistency of the past. I also like this example because it isn't terribly mathy and encourages statistical literacy. Kommenda and Stevens, writing for the Washington Post, recently shared a story on the accuracy of weather predictions based on time away from the target day. Here, the DV is prediction accuracy, operationalized using the difference between predicted and actual high temperature. You could always ask your students how they would operationalize weather...or maybe some weather matters more than others? Folks in Erie...
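If you want to turn that accuracy measure into something students can poke at, here is a minimal Python sketch. It assumes accuracy is the average absolute difference between predicted and actual high temperature, grouped by how many days ahead the forecast was made. The forecast numbers are invented for illustration and do not come from the Washington Post article, and the function name is hypothetical.

```python
from statistics import mean

# Hypothetical forecasts: (days_ahead, predicted_high_F, actual_high_F).
# These values are made up for classroom illustration only.
forecasts = [
    (1, 71, 70),
    (1, 65, 67),
    (3, 74, 70),
    (3, 60, 65),
    (7, 80, 70),
    (7, 58, 66),
]


def mean_absolute_error_by_lead(rows):
    """Average |predicted - actual| for each forecast lead time."""
    errors = {}
    for days_ahead, predicted, actual in rows:
        errors.setdefault(days_ahead, []).append(abs(predicted - actual))
    return {lead: mean(errs) for lead, errs in sorted(errors.items())}


mae = mean_absolute_error_by_lead(forecasts)
for lead, err in mae.items():
    print(f"{lead} day(s) ahead: off by {err:.1f} degrees on average")
```

Students could swap in absolute error for rain totals or snowfall to see how the choice of DV changes the story.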

Update: Using baby name popularity to illustrate unimodal and bimodal data

I love internet-based teaching ideas. They are free and current. At least they were current when I first posted them, but some of my posts are ten years old. Such is the case for my old post about the Baby Name Voyager and how to use it to illustrate unimodal and bimodal distributions. Instead, please go to NameGrapher to show your students how flash-in-the-pan trendy baby names, like my own, have a unimodal distribution: As opposed to bimodal distributions, which flag a name as a more classical name that enjoyed a resurgence, like Emma: When I use this in class, I frame it between names that were trendy once and names that were trendy one hundred years ago and are again trendy. As a mom to grade-school-aged kids, I have certainly noticed this as a trend in kid names. So many Lilies and Noras! I also make sure my students understand that this information is gathered via Social Security Administration applications from the federal government, to back up another clai...

Multiverse = multiple correlation and regression examples!

I love InformationIsBeautiful. They created my favorite data visualization of all time. They also created an interactive scatterplot with all sorts of information about Marvel Cinematic Universe films. How to use in class:
1. Experiment with the outcome variables you can add to the X and Y axes: Critical response, budget, box office receipts, year of release, etc. There are more than that; you can add them to either the X or Y axes. So, it is one website, but there are many ways to assess the various films.
2. Because of interactive axes, there are various correlation and regression examples. And these visualizations aren't just available as a quick visual example of linear relationships...see item 3...
3. You can ask your students to conduct the actual data analyses you can visualize because the hecking data is available.
4. The website offers exciting analyses, encouraging your students to think critically about what the data tells them.
5. You could also squeeze Simp...

A fast, interactive example for explaining what we mean when we talk about "training" AI/ML

When I teach regression, I touch on AI/Machine Learning because it is fancy regression and ties classroom lessons to real life. During discussions about AI/ML, we often talk about "training" computers to look for something by feeding computers data. That is slightly abstract, and a bit boring if you are just talking about a ton of spreadsheets. As an alternative to boring, I propose you ask your students to help train Google's computers to recognize doodles. Visit this website, and a prompt flashes on your screen: You draw the prompt (I used my touchscreen), and Google tries to guess what you drew. Here is my half-done wine glass. Google guessed what it was. The website includes additional information on the data that has already been collected. For every one of the doodles above, you can click through and look at all the ones created in response to each prompt. SO MUCH INFORMATION. If you would like, you can also show your students this explainer video.

Organizations sharing data in a way that is very accessible

A few weeks ago, I posted about how you can share data in such a terrible way that one is not breaking the law, but the data is completely unusable. This makes me think of all the times I am irked when someone states a problem but doesn't offer a solution to the problem. Instead, they just talk about what is wrong and not how it could be. So, as a counter piece, let's cheer on organizations that ARE sharing data in a way that is readily accessible. You could use this in class as a palate cleanser if you teach your students about data obfuscation. You could also use it as a way of helping your students understand how data really is everywhere. Or even challenge them to brainstorm an app that uses readily accessible data in a new way to help folks. ProPublica: This website lets you check how often salmonella is found at different chicken processing plants. All you need to do is enter the p-number, company, or location listed on your package of chicken: https://projects.propubli...

History of Data Science's Regression Game

There are already some pretty cool games for guessing linear relationships/regression lines, such as Dr. Hill's Eyeball Regression Game and the old, reliable Guess the Correlation game. However, I found a new one that has a particularly gorgeous interface and a few extra features to help your learners. History of Data Science created the Regression game. It provides the player with a scatter plot, then the player needs to guess the y-intercept and slope. See that regression line? It is generated and changes as the entered a and b values change, which is a good learning tool. If played at the "easy-peasy" level, the player can even change those numbers multiple times over the course of 30 seconds and watch as the corresponding line changes. I think this game is a nice way to break up the ol' regression lecture, and it allows students to see the relationship between the scatter plot and the regression line.

Chi-square and interaction via TV data

This excellent interactive tool provides good examples of chi-square thinking and interactions. For more commentary, this write-up of Orth's research is an easy read. Also, it is an example that doesn't involve politics or COVID. There is a time and place for both in the statistics classroom, but I've missed good, accessible data that doesn't have ANYTHING to do with divisive issues. The data is from research that asked people how they watch TV: Streaming services, cable TV, binge-watching, etc. All sorts of information on how we watch TV, and with this easy-to-use website, you can create tables from this data. Tables that help you teach all kinds of ideas. You can turn a chi-square goodness of fit for whether or not people binge their TV into a chi-square test of independence by investigating binging by age (categorized). You can also change the data visualization so you can see the data in a very traditional chi-square table. You can also use these c...
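To show students what changes when you move from goodness of fit to a test of independence, here is a minimal Python sketch that computes both chi-square statistics by hand. The counts are invented for illustration and are not from the TV-watching data, and the function names are hypothetical. It only computes the test statistic; you would still look up the p-value in a table or in your stats software.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Chi-square statistic for a two-way table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: 60 of 100 people say they binge, tested against a 50/50 split.
gof_stat = chi_square_gof([60, 40], [50, 50])

# Hypothetical counts split by age group: rows are younger/older,
# columns are binge/don't binge.
binge_by_age = [
    [40, 10],
    [20, 30],
]
independence_stat = chi_square_independence(binge_by_age)

print(f"Goodness of fit: chi-square = {gof_stat:.2f}, df = 1")
print(f"Independence:    chi-square = {independence_stat:.2f}, df = 1")
```

The same 100 people appear in both analyses; the only thing that changes is adding the second categorical variable.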

An interactive description of scientific replication

TL;DR: This cool, interactive website asks you to participate in a replication. It also explains how a researcher's decision on how to define "randomness" may have driven the main effect of the whole study. There is also a scatter plot and a regression line, talk of probability, and replication of a cognitive example. Long Version: This example is equal parts stats and RM. I imagine that it can be used in several different ways:
- Introduce the replication crisis by participating in a wee replication
- Introduce a respectful replication based on the interpretation of the outcome variable
- Data visualization and scatterplots
- Probability
- Aging research
Okay, so this interactive story from The Pudding is a deep dive into how one researcher's decision may be responsible for the study's main effect. Gauvrit et al. (2017) argue that younger people generate more random responses to several probability tasks. From this, the authors conclude that human behavioral complexity...

Eyeball Regression game by Sophie Hill

Sophie Hill created a great game that shows students how to "eyeball" regression lines (or just lines) by guessing the y-intercept and the slope. At the beginning of the game, you get a scatter plot. Then, you need to guess the y-intercept and the slope. Once you make a guess, it will show you the actual line of best fit...and your line, along with residuals and mean squared error. So, this doesn't just let students eyeball the regression line; it also shows them how to assess the fit of a line. P.S.: If you liked this, you'd love the Guess the Correlation game.
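If you want students to see where the residuals and mean squared error come from, here is a minimal Python sketch of the same idea. It is not the game's code: the data points and the guessed line are invented, the function names are hypothetical, and the best-fit line is computed with the ordinary least-squares formulas.

```python
from statistics import mean

# Hypothetical scatter plot points, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def mean_squared_error(intercept, slope):
    """Average squared residual for the line y = intercept + slope * x."""
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return mean([r ** 2 for r in residuals])


def least_squares_line():
    """Ordinary least-squares intercept and slope for the points above."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# A student's eyeballed guess versus the line of best fit.
guess_intercept, guess_slope = 1.0, 1.0
best_intercept, best_slope = least_squares_line()

guess_mse = mean_squared_error(guess_intercept, guess_slope)
best_mse = mean_squared_error(best_intercept, best_slope)

print(f"Guess:    y = {guess_intercept:.2f} + {guess_slope:.2f}x, MSE = {guess_mse:.2f}")
print(f"Best fit: y = {best_intercept:.2f} + {best_slope:.2f}x, MSE = {best_mse:.2f}")
```

No guess can beat the least-squares line on MSE, which is a nice way to introduce what "best fit" actually means.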

Virtual dice and coin flips via Google

Many stats instructors use dice and/or coin flips to teach their students about distributions, probability, and the CLT. Here is an alternative to physical coins and dice, in case you are teaching from a distance. Certainly, there are countless other websites that will roll a die or flip a coin for you, but these simple websites created by Google are intuitive and pretty. Using Google's dice rolling simulator, you can roll a standard six-sided die. Or a DnD 20-sided die. Or multiple dice. I included the link, but all you need to do is Google "Roll dice" to get to the website. Google also lets you flip a coin. This simulator doesn't have any fancy options, and you can get to it simply by Googling "Flip a coin".
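If your students are comfortable with a little code, the same dice-and-coins lesson can be sketched in Python with the standard library. This is a minimal illustration of the CLT idea, not a replacement for Google's simulators: the function names and the sample sizes are my own assumptions, and the seed is fixed only so the class sees the same numbers each run.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so results are reproducible in class


def roll_die(sides=6):
    """Roll one fair die with the given number of sides."""
    return random.randint(1, sides)


def flip_coin():
    """Flip one fair coin."""
    return random.choice(["heads", "tails"])


def sample_means(n_dice, n_samples=2000):
    """Roll n_dice dice, record their mean, and repeat n_samples times."""
    return [mean([roll_die() for _ in range(n_dice)]) for _ in range(n_samples)]


# Spread of the sampling distribution of the mean for 1, 5, and 30 dice.
spread = {}
center = {}
for n_dice in (1, 5, 30):
    means = sample_means(n_dice)
    center[n_dice] = mean(means)
    spread[n_dice] = stdev(means)
    print(f"{n_dice:>2} dice: mean of means = {center[n_dice]:.2f}, "
          f"SD of means = {spread[n_dice]:.2f}")
```

The mean of the sample means should hover near 3.5 no matter how many dice you roll, while the spread shrinks as the number of dice grows.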

Online Day 6: One-way ANOVA example

I hope everyone is hanging in there. Here is a pretty straightforward one-way ANOVA example that is interactive, based on for-real personality psychology research, and interesting. I blogged about this previously but whipped up a Google Slideshow you can download and edit to suit your own teaching. Also, I uploaded data that you can use with your students. TL;DR: A bunch of researchers gave the NEO to 1.5 million Americans to determine if different regions of the US have different personality trends (see the original study here). They do. Then Time magazine reported on the study. And the scicomm was beautiful. They accurately described the research AND created a fun interactive portion in which your students can take the NEO-Short Form and be matched with the state that best matches their personality (Hi, I am West Virginia because I'm high in neuroticism and low in openness to new experiences, which are great qualities to have during a pandem...

NYT American dialect quiz as an example of validity and reliability.

TL;DR: Ameri-centric teaching example ahead: Have your students take this quiz, and the internet will tell them which regions of the US talk the same as them. Use it to teach validity. Longer Version: The NYT created a gorgeous version (https://www.nytimes.com/interactive/2014/upshot/dialect-quiz-map.html) of a previously available quiz (http://www.tekstlab.uio.no/cambridge_survey/) that tells the user what version of American English they speak. The prediction is based upon loads and loads of survey data on how we talk. It takes you through 25 questions that ask you how you pronounce certain words and which regional words you use to describe certain things. Here are my results: Indeed, I spent elementary school in Northern Virginia, my adolescence in rural Central PA, college at PSU, and I now live in the far NW corner of PA. As this test indeed picked up on where I've lived and talked, I would say that this is a valid test based just on my u...

CNN's The most effective ways to curb climate change might surprise you

CNN created an interactive quiz that will teach your students about a) making personal changes to support the environment, b) rank-order data, and c) nominal data. https://www.cnn.com/interactive/2019/04/specials/climate-change-solutions-quiz/ The website leads users through a quiz. For eight categories of environmental crisis solutions, you are asked to rank solutions by their effectiveness. Here are the instructions: Notice the three nominal categories for each solution: What you can do, What industries can do, What policymakers can do. Below, I've highlighted these data points for each of the "Our home and cities" solutions. There are also many, many examples of ordinal data. For each intervention category, the user is presented with several solutions and they must reorder the solutions from most to least effective. How the page looks when you are presented with solutions to rank order: The website then "grades" your respons...

Passion driven statistics

Passion-Driven Statistics is a grant-funded, FREE resource that teaches the basics of statistics, including the basics of all of the stuff you need to know to conduct good research (data management, literature review, etc.). It bills itself as "project-driven" and is super, duper applied, which is an approach I love. You can download the whole stinking book or view it online. And the PDF is concise and short, given the amount of material it covers. Why so short? Because it is lousy with links to YouTube videos, mini-assignments, instructions for reporting different statistical tests, etc. I also love this resource because it contains a lot of good information for novices that I haven't seen packaged this way or in one place: Important lessons pertaining to the research process and data collection: The book is written to take you through a research project, and includes guidance for performing a literature review, writing a sound codebook, data management, etc. ...

The Washington Post, telling the story of the opioid crisis via data

I love dragging on bad science reporting as much as anyone, but I must give All Of The Credit to the Washington Post and its excellent, data-centered reporting on the opioid epidemic. It is a thing of beauty. How to use in class:
1) Broadly, this is a fine example of using data to better understand applied problems, medical problems, drug problems, etc.
2) Specifically, this data can be personalized to your locale via WaPo's beautiful, functional website.
3) After you pull up your localized data, descriptive data abound...# of pills, who provided them, who wrote the scripts (y'all...Frontier Pharmacy is like two miles from my house)...
4) Everyone teaches about frequency tables, right? Here is a good example:
5) In addition to localizing this research via the WaPo website, you can also personalize your class by looking for local reporting that uses this data. For instance, the Erie newspaper reporter David Bruce reported on our local problem ( .pdf of the...