Diversity in Tech by Information Is Beautiful

I am a fan of explaining the heart of a statistical analysis conceptually with words and examples, not with math. Information Is Beautiful has a gorgeous new interactive, Diversity in Tech, that uses data visualization to present gender and ethnic representation among employees at various big-name internet firms.

I think this example explains why we might use Chi-Square Goodness of Fit. I think it could also be used in an I-O class.

So, what this interactive gives you is a list of the big-name online firms, along with the proportion of each firm's employees who fall into each gender and ethnic category. See below:



When I look at that US Population baseline information, I see a bunch of Expected data. And when I look at the data for the different firms, I see Observed data. So, I see a bunch of conceptual examples for chi-square Goodness of Fit.

For example, look at gender. 51% of the population is female. That is your Expected data. Compare that to the data for Indiegogo. They have 50% female employees. That is your Observed data. Eyeballing it, you can guess that wouldn't be a significant chi-square. The two distributions are very similar. Again, this is a CONCEPTUAL introduction to what chi-square looks at, not a computational one.
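If you do want to show students the arithmetic behind that eyeball guess, here is a minimal sketch in Python. The interactive only reports percentages, so the headcount of 1,000 is a made-up number for illustration, and I am assuming the remaining 49% of the baseline falls into a single second category. The real answer depends on the firm's actual number of employees.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: the sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# ASSUMPTION: hypothetical headcount; the interactive only gives percentages.
n = 1000

# Expected proportions come from the US population baseline (51% female).
# ASSUMPTION: the other 49% is treated as one second category.
expected = [0.51 * n, 0.49 * n]

# Observed proportions for Indiegogo (50% female).
observed = [0.50 * n, 0.50 * n]

chi2 = chi_square_gof(observed, expected)

# Standard critical value for alpha = .05 with df = 1 (two categories).
CRITICAL_05_DF1 = 3.841

print(f"chi-square = {chi2:.3f}")
print("significant at .05?", chi2 > CRITICAL_05_DF1)
```

With this hypothetical headcount, the statistic lands well under the critical value, which matches the eyeball guess. It is worth pointing out to students that the same percentages with a much larger headcount would produce a larger statistic.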





Another important conceptual piece of chi-square is that your O and E values need to be pretty far apart in order to get a big test statistic. So, you could also ask your students to compare: Which do you think would have a larger chi-square test statistic for Latino employees, Amazon or eBay? The Expected value is 18%. Since Amazon's 13% is closer to the Expected value of 18% than eBay's 4%, we would expect Amazon to have a smaller χ² value than eBay, all else (like sample size) being equal.
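Here is a rough sketch of that Amazon vs. eBay comparison in Python, in case you want students to check their intuition. To keep it simple, I am assuming a made-up headcount of 1,000 at both firms (the interactive only gives percentages) and collapsing ethnicity into two categories, Latino and not Latino. Neither assumption comes from the interactive itself.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: the sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# ASSUMPTION: the same hypothetical headcount for both firms, so that only
# the gap between Observed and Expected drives the difference.
n = 1000

# Expected: 18% Latino in the US population baseline.
# ASSUMPTION: everyone else is collapsed into one "not Latino" category.
expected = [0.18 * n, 0.82 * n]

# Observed Latino proportions from the interactive.
observed_by_firm = {
    "Amazon": [0.13 * n, 0.87 * n],
    "eBay": [0.04 * n, 0.96 * n],
}

results = {
    firm: chi_square_gof(observed, expected)
    for firm, observed in observed_by_firm.items()
}

for firm, chi2 in results.items():
    print(f"{firm}: chi-square = {chi2:.2f}")
```

Because the difference between O and E is squared, eBay's statistic comes out several times larger than Amazon's under these assumptions, even though its gap from 18% is less than three times as wide.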


I bet you could also use this data while teaching I-O. One way to demonstrate fair hiring practices is to demonstrate that your workforce mimics the ethnic breakdown of America, or your state, or your region. As such, you could ask your students to pretend to be consultants and make recommendations for which firms have the biggest disparities, using this data as evidence.
