Explaining chi-square is easier when your observed data equals 100 (here, the US Senate)

UPDATE: 2020 Data:







https://www.catalyst.org/knowledge/women-government


When I explain chi-square at a conceptual, no-software, no-formula level, I use the example of gender distribution within the US Senate. There are 100 Senators, so each raw observed count is also a percentage: the count and the proportion are the same number. I think that makes it easier for junior statisticians to wrap their brains around chi-square.

I usually start with a Goodness-of-Fit test (or, as I like to call them, "One-sies chi-squares"). For this example, I divide senators into two groups: men and women. And what do you get?

For the 115th Congress, there are 23 women and 77 men. There is your observed data, both as a raw count and as a percentage. What is your expected data? A 50/50 breakdown...which would also be 50 men and 50 women. Without doing the actual analysis, it is pretty safe to assume that, due to the great difference between expected and observed values, your chi-square Goodness of Fit would be significant.
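If you do want to check the arithmetic behind that "pretty safe to assume," here is a minimal Python sketch using only the counts given above. The variable names are illustrative, and the p-value shortcut assumes exactly 1 degree of freedom (two groups), so it is not a general-purpose routine.

```python
import math

# Observed counts from the post: US Senate, 115th Congress.
observed = {"women": 23, "men": 77}

# Expected counts under a 50/50 split of the 100 seats.
total = sum(observed.values())
expected = {group: total / len(observed) for group in observed}

# Chi-square statistic: sum over groups of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)

# Two groups give 1 degree of freedom. For df = 1 only, the upper-tail
# p-value can be computed as erfc(sqrt(chi_square / 2)).
df = len(observed) - 1
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.2g}")
```

The statistic works out to 29.16, far above 3.84, the critical value for 1 degree of freedom at the .05 level, which matches the eyeball conclusion.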

You can then scaffold off of this example to explain the chi-square test of independence ("Two-sies Chi-Square"). Still using Senate data, look at Gender x Political Party. Finally, you can use US House of Representatives data the same way, only this time you will challenge your students to think about proportions of men and women out of the total number of 435 Representatives.
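For the Gender x Political Party version, a rough sketch of the computation might look like the following. The function name `chi_square_independence` is made up for illustration, and the party split in `table` is a HYPOTHETICAL placeholder, not real Senate data: only the row totals (23 women, 77 men) come from this post. Swap in the actual counts from the sources listed below before showing it to students.

```python
def chi_square_independence(table):
    """Return (statistic, expected counts, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count for each cell if the two variables were independent:
    # (row total * column total) / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, expected, df


# Rows: women, men. Columns: "Party A", "Party B".
# HYPOTHETICAL placeholder counts; replace with real data.
table = [
    [15, 8],   # women (row total 23, from the post)
    [35, 42],  # men   (row total 77, from the post)
]

statistic, expected, df = chi_square_independence(table)
print(f"expected counts = {expected}")
print(f"chi-square = {statistic:.2f}, df = {df}")
```

The expected counts are where the conceptual payoff is: with 100 senators, students can see directly that "independence" means each party's seats should split between women and men in the same 23/77 ratio as the Senate overall.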

Helpful Links:

The Center for American Women and Politics, not surprisingly, tracks this data for both the House and Senate AND divides the data up by gender. NOTE: You will have to update this slide every two years!

This story from FiveThirtyEight also has a few graphs breaking down gender for candidates for office in 2018.
