Posts

Showing posts with the label chi-square

An ode to Western Pennsylvania, in chi-square form

I've been writing this blog, statistics pedagogy articles, chapters, and a whole statistics textbook for over ten years. I'm at the point where I see silly stuff on the internet, and it automatically translates to a statistics example. Like this recent Tweet from Sheetz about the Pirates/Philly series this weekend. https://x.com/sheetz/status/1923397811778785489 This is an unapologetically Western PA tweet. I will be using it as a chi-square goodness-of-fit example with my Western PA students at Gannon University this Fall. I even created a data file that mimics the findings (Methods:  n  = 380, Results: p < .001. Conclusion: Sheetz followers on Twitter love some curly fry). If you are a poor, unfortunate soul who has never enjoyed treatz from Sheetz, I feel bad for you. Look up your favorite regional brands on Twitter and translate one of their polls into a chi-square example. Or travel to your nearest Sheetz to experience some damn joy. 

YouGov America's Thanksgiving-themed chi-square examples

YouGov gifts us with seasonal chi-square examples  with data on Thanksgiving food controversies. For example: How do people feel about marshmallows on sweet potato dishes? This doesn't look randomly distributed to me. Which is more beloved: Light or dark turkey meat? If you want examples for the chi-square test of independence, dig into the PDF containing ALL of this survey's data. The distribution of people who like cranberry sauce by age group does not appear random.

How to investigate click-bait survey claims

Michael Hobbes shared a Tweet from Nick Gillespie. That Tweet was about an essay from The Bulwark. That Tweet plays fast and loose with Likert-type scale interpretation. The way Hobbes and his Twitter followers break down the issues with this headline provides a lesson on how to examine suspicious research clickbait that doesn't pass the sniff test. First off, who says "close to one in four"? And why are they invoking the attempt on Salman Rushdie's life, which did not happen on a college campus and is unrelated to high-profile campus protests of controversial speakers? Hobbes dug into the survey cited in the Bulwark piece. The author of the Bulwark piece interpreted the data by collapsing across response options on a Likert-type response scale. Which can be done responsibly, I think. "Very satisfied" and "satisfied" are both happy customers, right? But this is suspicious. Other Twitter users questioned the question and how it may leave room for i...

Chi-square and interaction via TV data

This excellent interactive tool provides good examples of chi-square thinking and interactions. For more commentary, this write-up of Orth's research is an easy read. Also, it is an example that doesn't involve politics or COVID. There is a time and place for both in the statistics classroom, but I've missed good, accessible data that doesn't have ANYTHING to do with divisive issues. The data is from research that asked people how they watch TV: Streaming services, cable TV, binge-watching, etc. All sorts of information on how we watch TV, and with this easy-to-use website, you can create tables from this data. Tables that help you teach all kinds of ideas. You can turn a chi-square goodness of fit for whether or not people binge their TV into a chi-square test of independence by investigating binging by age (categorized). You can also change the data visualization so you can see the data in a very traditional chi-square table. You can also use these c...

Chi-square example involving Americans' beliefs about vampires. Seriously.

 OK. I'm proud of this one. I think these are good chi-square examples. And they are about Americans' beliefs about various supernatural beings. Presented with commentary because I couldn't help myself. 1. Supernatural beliefs by age Already. I fully intended to tease my traditional UGs about their beliefs about vampires. Because I do believe that would probably be a significant chi-square...but... ...IS GEN X OK? This survey asks about ghosts, demons, psychics, vampires, and werewolves. What are the "OTHER" this survey is talking about? Aliens? Dolly Parton? I'm intrigued. 2. Werewolves: The Unity Horse. The Unity Wolf? Both Trump and Biden supporters are united in their belief that werewolves do NOT exist.  Anyway. This survey is intriguing. There is a lot of material to work with.  Go read it here .  PS: Hey! If you like this idea and would love a whole stats textbook from the brain of the person who came up with this idea, sign up for more information abou...

Chi-square Test of Independence using CNN exit polling data

If you are trying to explain the Chi-Square Test of Independence to your students, here are some timely examples that are political and not polarizing. Well, I don't think it is polarizing. I'm sure there are people out there who disagree. Maybe some of the questions are polarizing? Regardless, it is nice to have an example that uses a current event with easy-to-understand data. The example comes from CNN. The network conducted exit polling during the 2020 presidential election. I'm sure they didn't intend to provide us with a bunch of chi-square examples, but here we are. Essentially, CNN divided Biden and Trump voters into many categories with not a parameter to be had. I have included a few of the tables here, but there are many others on the website. They illustrate different designs (2x2, 2x3, 2x4, etc.) and different magnitudes of difference between expected and observed values.

Stand-alone stats lessons you can add to your class, easy-peasy.

I started this blog with the hope of making life easier for my fellow stats instructors. I share examples and ideas that I use in my own classes in hopes that some other stats instructor out there might be able to incorporate these ideas into their classes. As we crash-landed into the online transition last Spring, I took some of the blog posts and made them into lengthier class lessons, including Google Slides and, when applicable, data sets shared via my Google Drive. I ended up with four good lessons about the four big inferential tests typically covered in Psych Stats/Intro Stats: T-test, ANOVA, chi-square, and regression. I think these examples serve as great reviews/homework assignments/an extra example for your students as they try to wrap their brains around statistical thinking. As we are preparing for the Fall, and whatever the Fall brings, I wanted to re-share all of those examples in one spot. Love, Jess ANOVA https://notawfulandboring.blogspot.com/2020/04/online-day-6...

Using Pew Research Center Race and Ethnicity data across your statistics curriculum

In our stats classes, we need MANY examples to convey both the theory behind and the computation of statistics. These examples should be memorable. Sometimes, they can make our students laugh, and sometimes they can be couched in research. They should always make our students think. In this spirit, I've collected three small examples from the Pew Research Center's Race and Ethnicity archive (I hope to update with more examples as time permits). I don't know if any data collection firm is above reproach, but Pew Research is pretty close. They are non-partisan, they share their research methodology, and they ask hard questions about ethnicity and race. If you use these examples in class, I think that it is crucial to present them within context: They illustrate statistical concepts, and they also demonstrate outcomes of racism. 1. "Most Blacks say someone has acted suspicious of them or as if they weren't smart" Lessons: Racism, ANOVA theory: between-group dif...

Online Day 7: Chi-Square Examples

Here are two good review examples for chi-square, one for goodness-of-fit, and one for the test of independence. Here is my Google Slide presentation, which includes links to data sets for the examples. One features Taco Bell. The other features actual Developmental Psychology research, as featured on NPR. When I use these in class, the students have already been introduced to chi-square, have been walked through examples of both chi-squares, and then they analyze the data on their own using JASP.

Mother Jones' mass shooting database

Mother Jones magazine maintains a database of mass shooting events in the United States. MJ collects 25 variables from every shooting. Below, I've included their own description of the purpose of their database: How to use in class: Within this data is an example for every test we teach in Introduction to Statistics.
Correlation/Regression: Fatalities, Injuries, Age of shooter, Year of shooting
Chi-square: Shooter gender, Shooter ethnicity, Mass or Spree shooting, Were the weapons obtained legally?
ANOVA: Shooter ethnicity
T-test: Mass or Spree shooting, Were the weapons obtained legally?
Data Cleaning: Some of these columns need some work before analysis. For instance, there are multiple weapons listed under "Weapon Type". Which is reasonable, but not helpful for descriptive data. You could walk your students through the process of recoding that column into multiple columns. You could also expl...
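If you want to show students what recoding a multi-value column can look like, here is a minimal sketch in plain Python. The rows, the case labels, the weapon strings, and the comma delimiter are all invented for illustration; I am assuming a comma-separated cell, and the real database may format this column differently.

```python
# HYPOTHETICAL rows: the case labels and weapon strings are invented for
# illustration and are NOT taken from the Mother Jones database.
rows = [
    {"case": "Case 1", "Weapon Type": "semiautomatic handgun, rifle"},
    {"case": "Case 2", "Weapon Type": "shotgun"},
    {"case": "Case 3", "Weapon Type": "rifle, shotgun"},
]


def split_weapons(cell, delimiter=","):
    """Split one multi-value cell into a list of cleaned, lowercase labels."""
    return [w.strip().lower() for w in cell.split(delimiter) if w.strip()]


# Every distinct weapon label that appears anywhere in the column.
categories = sorted({w for row in rows for w in split_weapons(row["Weapon Type"])})

# One new 0/1 indicator column per label, plus a count of weapons per case.
recoded = []
for row in rows:
    weapons = set(split_weapons(row["Weapon Type"]))
    new_row = {"case": row["case"], "n_weapons": len(weapons)}
    for cat in categories:
        new_row["weapon_" + cat.replace(" ", "_")] = int(cat in weapons)
    recoded.append(new_row)

for new_row in recoded:
    print(new_row)
```

Each 0/1 column can then be used as a categorical variable in a chi-square, and the count column gives students a simple descriptive to summarize.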

Diversity in Tech by DataIsBeautiful

I am a fan of explaining the heart of a statistical analysis conceptually with words and examples, not with math. Information Is Beautiful has a gorgeous new interactive, Diversity in Tech, that uses data visualization to present gender and ethnic representation among employees at various big-name internet firms. I think this example explains why we might use Chi-Square Goodness of Fit. I think it could also be used in an I-O class. So, what this interactive gives you is a list of the main, big online firms. And then the proportions of different sorts of people who fall into each category. See below: When I look at that US Population baseline information, I see a bunch of expected data. And then when I see the data for different firms, I see Observed data. So, I see a bunch of conceptual examples for chi-square Goodness of Fit. For example, look at gender. 51% of the population is female. That is your Expected data. Compare that to data for Indiegogo. They have 50% female e...

Likelihood of Null Effects of large

This example provides evidence of data funny business beyond psychology, shows why pre-registration is a good thing, AND uses a chi-square. Bonus points for being couched in medicine and prominently featuring randomized controlled trials (RCT). Basically, Kaplan and Irving's  research checked out the results for RCTs funded with grants from the National Heart, Lung and Blood Institute. See below for how they selected their studies: And what did they find? When folks started registering their outcomes, folks started to get fewer "beneficial" results. Which probably REALLY means that some of those previous "beneficial" results were not so beneficial, or the result of some data massaging. See below: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132382           Another reason to love this example: It is a real life chi-square that is easy to understand! I feel like I don't have enough great chi-square examples in my lif...

Taco Bell and Chi-Square. Because of course this moment was coming.

Do you know what we need as statistics instructors? a) More chi-square examples that are b) rooted in Taco Bell condiments and c) are null. So here you go, as inspired by this tweet: This data did not achieve statistical significance, X^2 (3, N = 33) = 0.33, p = 0.95. The data suggests that these Taco Bell packets are randomly distributed. If you do this analysis by hand, here is your data: Diablo = 8, Hot = 9, Fire = 7, Mild = 9. If you do this analysis via software, here is the .csv version of the data, here is the .jasp version of the data, and here is a version of the data you can just copy and paste. Sauce Diablo Diablo Diablo Diablo Diablo Diablo Diablo Diablo Fire Fire Fire Fire Fire Fire Fire Fire Fire Hot Hot Hot Hot Hot Hot Hot Mild Mild Mild Mild Mild Mild Mild ...
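If you would rather check the by-hand version with a few lines of code, here is a minimal sketch in plain Python (no JASP or stats package needed). It uses the four by-hand counts from the post, which sum to 33, and assumes the null hypothesis is an equal split across the four sauces. The function names are my own, and the p-value helper is a closed-form shortcut that only works for 3 degrees of freedom.

```python
import math


def chi_square_gof(observed, expected=None):
    """Return (statistic, df) for a chi-square goodness-of-fit test.

    If no expected counts are given, assume an equal split across categories.
    """
    n = sum(observed)
    if expected is None:
        expected = [n / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def p_value_df3(stat):
    """Upper-tail p-value for a chi-square statistic with exactly 3 df."""
    return (math.erfc(math.sqrt(stat / 2))
            + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))


# By-hand counts from the post.
counts = {"Diablo": 8, "Hot": 9, "Fire": 7, "Mild": 9}

stat, df = chi_square_gof(list(counts.values()))
p = p_value_df3(stat)
print(f"X^2({df}, N = {sum(counts.values())}) = {stat:.2f}, p = {p:.2f}")
```

With an expected count of 8.25 per sauce, the four contributions are tiny, which is what a nice null example should look like.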

Explaining chi-square is easier when your observed data equals 100 (here, the US Senate)

UPDATE: 2020 Data: https://www.catalyst.org/knowledge/women-government When I explain chi-square at a conceptual, no-software, no-formula level, I use the example of gender distribution within the US Senate. There are 100 Senators, so the raw observed data count is the same as the observed data expressed via proportions. I think it makes it easier for junior statisticians to wrap their brains around chi-square. I usually start with a Goodness-of-Fit (or, as I like to call them, "One-sies chi-squares"). For this example, I divide senators into two groups: men and women. And what do you get? For the 115th Congress, there are 23 women and 77 men. There is your observed data, both as a raw count and as a proportion. What is your expected data? A 50/50 breakdown...which would also be 50 men and 50 women. Without doing the actual analysis, it is pretty safe to assume that, due to the great difference between expected and observed values, your chi-square Goodness o...
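For anyone who does want to see the arithmetic behind the Senate example, here is a minimal sketch in plain Python. It uses the 23/77 split and the 50/50 expected breakdown described above; the variable names are my own, and the p-value line is a shortcut that is only valid when there is 1 degree of freedom.

```python
import math

# Observed counts for the 115th Congress, as given in the post.
observed = {"women": 23, "men": 77}

# Expected counts under a 50/50 split of the 100 seats.
n = sum(observed.values())
expected = {group: n * 0.5 for group in observed}

# Chi-square: sum of (observed - expected)^2 / expected over the categories.
stat = 0.0
for group in observed:
    contribution = (observed[group] - expected[group]) ** 2 / expected[group]
    print(f"{group}: observed {observed[group]}, expected {expected[group]:.0f}, "
          f"contribution {contribution:.2f}")
    stat += contribution

df = len(observed) - 1

# With 1 df, the upper-tail p-value can be written with the complementary
# error function.
p = math.erfc(math.sqrt(stat / 2))
print(f"X^2({df}, N = {n}) = {stat:.2f}, p = {p:.2g}")
```

Because N is 100, the expected counts (50 and 50) are also the expected percentages, which is the whole appeal of this example.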

Two great websites that generate data sets for teaching.

You could also use these websites to generate totally unethical data for publication. Don't do it, buddy. Sometimes, when you are teaching a stats class, it is lovely to have data generated for you, so that you know which statistical test the data are meant for and what the results will be. Here are two websites that do just that. One tried and trustworthy resource was created by I/O psychologist Richard Landers. I blogged about this one in 2013, and I've used his data generator for years. My new resource is from social psychologist Andrew Luttrell. Nice things about both:
-Data!
-Both are easy to use.
-Specific data for everything you teach in Intro Stats, like t-tests, ANOVA, correlation, and regression.
-They are both free and help you do your job.
Thanks, Richard and Andrew! The nice thing about Richard's is that it gives you options of several different units (days, money, etc.) AND vignettes that explain why this data was collected. You can generate data ...

Chi-square example via dancing, empathetic babies

Don't you love it when research backs up your lifestyle? My kids LOVE dancing. We have been able to get both kids hooked on OK GO and Queen and Metallica. The big kid's favorite song is "Tell Me Something Good" by Chaka Khan and the little kid prefers "Master of Puppets". We all like to dance together. My kids, husband, and sister dancing. Now, research suggests that our big, loud group activity may increase empathy in our kids. NPR summarized Dr. Laura Cirelli's research looking at 14-month-olds and whether they 1) helped or 2) did not help a stranger who either 1) danced in sync with them or 2) danced, but not in sync, with the child. She found (in multiple studies) that kids offer more assistance after they danced in sync with an adult. How to use in class: 1) Here is fake chi-square test of independence data you can use in class. It IS NOT the data from the research but mimics the findings of the research. "Synced?" re...

Chi-square via The Onion's "Saying ‘Smells Okay’ Precedes 85% Of Foodborne Illnesses Annually"

Once again, The Onion publishes satire research (which should be, like, a submission category for JPSP) claiming to study phrases uttered before food poisoning happens. https://www.theonion.com/report-saying-smells-okay-precedes-85-of-foodborne-1819579726 I've turned this fake research into fake data to conduct an actual chi-square test of goodness of fit. Here is data that will give you a significant chi-square, with 85% of participants falling into the "smells okay" category.
Did sick person say "Smells Okay" before eating leftovers?
No: 19
Yes: 106
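Here is a quick sketch, in plain Python, of what the analysis on these fake counts could look like. I am assuming the null hypothesis is an even 50/50 split between "No" and "Yes", which the post does not spell out, and the p-value line is a shortcut that only works with 1 degree of freedom.

```python
import math

# Fake counts from the post.
observed = {"No": 19, "Yes": 106}

n = sum(observed.values())
share_yes = observed["Yes"] / n  # should land close to the headline's 85%

# ASSUMPTION: the expected counts come from an even split across the two answers.
expected = n / len(observed)

stat = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1
p = math.erfc(math.sqrt(stat / 2))  # upper-tail p-value, valid for df = 1 only

print(f"Share saying 'Smells Okay': {share_yes:.1%}")
print(f"X^2({df}, N = {n}) = {stat:.2f}, p = {p:.2g}")
```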

Dozens of interactive stats demos from @artofstat

This website is associated with Agresti, Franklin, and Klinenberg's text Statistics, The Art and Science of Learning from Data (@artofstat), and there are dozens of great interactives to share with your statistics students. Similar and useful interactives exist elsewhere, but it is nice to have such a thorough, one-stop shop of great visuals. Below, I have included screengrabs of two of their interactive tools. They also explain chi-square distributions, central limit theorem, exploratory data analysis, multivariate relationships, etc. This interactive about linear regression lets you put your own dots in the scatter plot, and returns descriptive data and the regression line: https://istats.shinyapps.io/ExploreLinReg/. This one shows the difference between two populations (of your own creation): https://istats.shinyapps.io/2sample_mean/

CNN, exit polls, and chi-square examples.

CNN posted a whole mess of exit polling data that illustrates how different demographics voted last night. And through my "I teach too many stats classes" lens, I see many examples of chi-square. I think they work at a conceptual level to clearly illustrate how chi-square looks at people falling into different categories, then measures whether the distribution of people is by chance or not. If you actually wanted to test these using chi-square, I would suggest you probably delete the other/no answer column (or else they will all come out as statistically significant, I would wager). EDIT (11/14/16): Daniel Findley made a video demonstrating how to use Excel to conduct chi-square tests on the marital status data. Check it out here.
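If you want to show students what dropping the other/no answer column does to a test of independence, here is a minimal sketch in plain Python. The counts, the row and column meanings, and the function name are all invented for illustration; they are NOT CNN's numbers, so plug in counts from whichever exit poll table you pick.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a chi-square test of independence.

    `table` is a list of rows, each row a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# HYPOTHETICAL counts. Rows: two demographic groups.
# Columns: candidate A, candidate B, other/no answer.
with_other = [
    [120, 100, 10],
    [90, 130, 12],
]

# Same table with the last (other/no answer) column removed.
without_other = [row[:-1] for row in with_other]

for label, table in [("with other/no answer", with_other),
                     ("without other/no answer", without_other)]:
    stat, df = chi_square_independence(table)
    print(f"{label}: X^2({df}) = {stat:.2f}")
```

Dropping the column changes the design (here from 2x3 to 2x2), so the degrees of freedom change along with the statistic, which is worth pointing out in class.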

UCLA's "What statistical analysis should I use?"

This resource from UCLA is, essentially, a decision-making tree for determining what kind of statistical analysis is appropriate based upon your data (see below). Screen shot from "What statistical analysis should I use?" Now, such decision-making trees are available in many statistics textbooks. However, what makes this special is the fact that with each test comes code/syntax as well as output for SAS, Stata, SPSS, and R. Which is helpful to our students (and, let's be honest, us instructors/researchers as well).