Skip to main content

Explaining chi-square is easier when your observed data equals 100 (here, the US Senate)

UPDATE: 2020 Data:







https://www.catalyst.org/knowledge/women-government


When I explain chi-square at a conceptual, no-software, no-formula level, I use the example of gender distribution within the US Senate. There are 100 Senators, so the raw observed data count is the same as the observed data expressed via proportions. I think it makes it easier for junior statisticians to wrap their brains around chi-square. 

I  usually start with an Goodness-of-Fit (or, as I like to call them, "One-sies chi-squares").For this example, I divide senators into two groups: men and women. And what do you get? 

For the 115th Congress, there are 23 women and 77 men. There is your observed data, both as a raw count or as a proportion. What is your expected data? A 50/50 breakdown...which would also be 50 men and 50 women. Without doing the actual analysis, it is pretty safe to assume that, due to the great difference between expected and observed values, your chi-square Goodness of Fit would be significant.

You can this scaffold off of this example to explain the chi-square test of independence ("Two-sies Chi-Square). Still using Senate data, look at Gender x Political Party. Finally, you can use US House of Representatives data the same way, only this time you will challenge your students to think about proportions of men and women out of the total number of 435 Representatives. 

Helpful Links:

The Center for American Women in Politics, not surprisingly, track this data, for both the House and Senate AND divide the data up by gender. NOTE: You will have to update this slide very two years!

This story from FiveThirtyEight also has a few graphs breaking down gender for candidates for office in 2018.

Comments

Popular posts from this blog

Ways to use funny meme scales in your stats classes

Have you ever heard of the theory that there are multiple people worldwide thinking about the same novel thing at the same time? It is the multiple discovery hypothesis of invention . Like, multiple great minds around the world were working on calculus at the same time. Well, I think a bunch of super-duper psychology professors were all thinking about scale memes and pedagogy at the same time. Clearly, this is just as impressive as calculus. Who were some of these great minds? 1) Dr.  Molly Metz maintains a curated list of hilarious "How you doing?" scales.  2) Dr. Esther Lindenström posted about using these scales as student check-ins. 3) I was working on a blog post about using such scales to teach the basics of variables.  So, I decided to create a post about three ways to use these scales in your stats classes:  1) Teaching the basics of variables. 2) Nominal vs. ordinal scales.  3) Daily check-in with your students.  1. Teach your students the basics...

Leo DiCaprio Romantic Age Gap Data: UPDATE

Does anyone else teach correlation and regression together at the end of the semester? Here is a treat for you: Updated data on Leonardo DiCaprio, his age, and his romantic partner's age when they started dating. A few years ago, there was a dust-up when a clever Redditor r/TrustLittleBrother realized that DiCaprio had never dated anyone over 25. I blogged about this when it happened. But the old data was from 2022. Inspired by this sleuthing,  I created a wee data set, including up-to-date information on his current relationship with Vittoria Ceretti, so your students can suss out the patterns that exist in this data.

If your students get the joke, they get statistics.

Gleaned from multiple sources (FB, Pinterest, Twitter, none of these belong to me, etc.). Remember, if your students can explain why a stats funny is funny, they are demonstrating statistical knowledge. I like to ask students to explain the humor in such examples for extra credit points (see below for an example from my FA14 final exam). Using xkcd.com for bonus points/assessing if students understand that correlation =/= causation What are the numerical thresholds for probability?  How does this refer to alpha? What type of error is being described, Type I or Type II? What measure of central tendency is being described? Dilbert: http://search.dilbert.com/comic/Kill%20Anyone Sampling, CLT http://foulmouthedbaker.com/2013/10/03/graphs-belong-on-cakes/ Because control vs. sample, standard deviations, normal curves. Also,"skewed" pun. If you go to the original website , the story behind this cakes has to do w...