Saturday, June 29, 2019

Do Americans spend $18K/year on non-essentials?

This is a fine example of using misleading statistics to try to make an argument.

USA Today tweeted out this graphic, related to some data that was collected by some firm.

There appear to be a number of methodological issues with this data, so there are a number of ways to use this in your class:

1) False Dichotomy: Survey response options should be mutually exclusive. I think there are two types of muddled dichotomies with this data:

a) What is "essential"?

When my kids were younger, I had an online subscription for diapers. Those were absolutely essential, and I received a discount on my order since it was a subscription. However, according to this survey dichotomy, are they an indulgence since they were a subscription that originated online?

b) Many purchases fall into multiple categories.

Did the survey creators "double-dip" so as to pad each mean and push the data towards its $18K conclusion?

Were participants clear that "drinks out with friends" and "eating out at restaurants" were two discrete categories? What category applies if I impulsively buy a curling iron (for personal grooming) online?

2) Data from an established, well-known news source is not perfect data.

This data went viral. Lots of people were exposed to this data, as it was linked by local news channels and local newspapers. But that doesn't make it perfect data.

3) The data assumes that all Americans use all of the products in all of these categories.

Plenty of people don't belong to a gym or use ride-sharing services, but these expenses still count towards the total for all Americans.

4) Conflict of interest.

The original data was collected in order to make an argument in favor of buying life insurance. Specifically, they were arguing that individuals could afford life insurance if they budgeted better, which is true for some people. However, it is problematic to frame certain expenses as optional when they are not.

5) If a person didn't use one of these services, were their "zeroes" counted towards the mean?
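
To show students why the handling of zeroes matters, here is a minimal sketch in Python. The spending figures are made up for illustration and are not from the survey; the variable names are my own.

```python
from statistics import mean

# Hypothetical monthly gym spending for ten respondents (made-up numbers).
# Six respondents do not belong to a gym, so they spend $0.
gym_spending = [0, 0, 0, 0, 0, 0, 40, 50, 60, 50]

# Mean over everyone, zeroes included.
mean_all = mean(gym_spending)

# Mean over only the respondents who actually pay for a gym.
users_only = [x for x in gym_spending if x > 0]
mean_users = mean(users_only)

print(f"Mean with zeroes: ${mean_all:.2f}")
print(f"Mean for users only: ${mean_users:.2f}")
```

The two means describe very different things: one is the average across all respondents, the other is the average among people who actually buy the product. A headline figure built by summing category means needs to say which one it used.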

Monday, June 24, 2019

Pew Research's "Gender and Jobs in Online Image Searches"

You know how every few months, someone Tweets about the stock photos that are generated when you Google "professor"? And those photos mainly depict white dudes? See below. Say "hi" to former President and former law school professor Obama, coming in at #10, several slots after "novelty kid professor in lab coat".

Well, Pew Research decided to quantify this perennial Tweet, and expand it far beyond academia. They used machine learning to search through over 10K images depicting 105 occupations and test whether or not the images showed gender bias.

How you can use this research in your RM class:

1. There are multiple ways to quantify and operationalize your variables. There are different ways to measure phenomena. If you read through the report, you will learn that Pew both a) compared actual gender ratios to the gender ratios they found in the pictures and b) counted how long it took until a search result returned the picture of a woman for a given job.
Quantifying difference by the sheer number of images.
Quantifying the difference by counting how long it takes to find a picture of a woman doing the job.

2. Replication outside of America: This research didn't just look at America but at 18 different countries.

3. Machine learning in research. For more detail on how they trained the machine to identify gender, see their methodology page.

4. Pew used data from the federal government for this project. The Bureau of Labor Statistics provided all of the actual gender break-down data for occupations.
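
The two measures described in point 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not Pew's actual method: the labels, the "actual" share, and the function names are all hypothetical.

```python
def share_of_women(labels):
    """Proportion of images labeled 'W' among the search results."""
    return sum(1 for g in labels if g == "W") / len(labels)


def rank_of_first_woman(labels):
    """1-based position of the first image labeled 'W', or None if absent."""
    for position, g in enumerate(labels, start=1):
        if g == "W":
            return position
    return None


# Made-up gender labels for the top 10 image results for one occupation,
# listed in the order the search engine returned them.
results = ["M", "M", "M", "W", "M", "M", "W", "M", "M", "M"]

# Hypothetical real-world share of women in that occupation.
actual_share = 0.45

image_share = share_of_women(results)      # measure a) share in the images
gap = image_share - actual_share           # negative = under-represented
first = rank_of_first_woman(results)       # measure b) how long until a woman

print(f"Share of women in images: {image_share:.2f}")
print(f"Gap versus actual share: {gap:+.2f}")
print(f"First woman appears at result #{first}")
```

Same phenomenon, two different operationalizations: one compares proportions, the other counts how far down the page you have to scroll.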

As always, Pew provides the full report.

Wednesday, June 19, 2019

The Evolution of Pew Research Center’s Survey Questions About the Origins and Development of Life on Earth

Question-wording matters, friends! This example shows how question order and question-wording can affect participant response. This is a good example for all of your research methods and psychometrics students to chew on.

Pew Research asked people if they believed in evolution. They did so in three different ways, which led to three different response patterns.
1) Prior to asking about evolution, they asked whether or not the participant believed in God.
2) They asked participants if they believed in evolution. If they said "yes", they asked the participant whether or not they believed that a higher power guides evolution.
3) They asked participants if they believed in evolution and gave participants three response options:
    a) Don't believe in evolution.
    b) Believe in evolution due to natural selection.
    c) Believe in evolution guided by a higher power.

Responses to Option 1:

Responses to Options 2 and 3:

Oh, the classroom discussions that could come out of this example!

Damn you, auto-correct: Statistics edition

Legit funny, but also a gentle way to remind our students that Word will not flag a correctly spelled word that is not the word you want.

Saturday, June 8, 2019

Alison Horst: Brilliant data illustrations

As I write this, I am a parent on the first day of summer break, and I have two kids who are very different from one another. So, these hilarious examples of Type I/II error from Alison Horst really speak to me.

Not only are these illustrations beautiful and funny, but I think they really get your students to think about one HUGE underlying issue in all of inferential statistics: Every little sample that we analyze is just one of near-infinite possible samples that could have been drawn from the underlying population (or, the sampling distribution of the sample mean).
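
If you want students to see that idea in action, here is a minimal simulation sketch in Python. The population (mean 100, SD 15), the sample size, and the number of samples are all arbitrary choices of mine for illustration.

```python
import random
from statistics import mean, stdev

random.seed(1)  # arbitrary seed so the run is repeatable

# Hypothetical population: 10,000 scores, roughly normal, mean 100, SD 15.
population = [random.gauss(100, 15) for _ in range(10_000)]

# Draw many samples of the same size and record each sample's mean.
sample_size = 25
sample_means = [
    mean(random.sample(population, sample_size))
    for _ in range(2_000)
]

print(f"Population mean: {mean(population):.2f}")
print(f"Smallest sample mean: {min(sample_means):.2f}")
print(f"Largest sample mean: {max(sample_means):.2f}")
print(f"SD of sample means: {stdev(sample_means):.2f}")
```

Every one of those 2,000 sample means came from the same population, yet they differ from one another. The spread of those means should land near the standard error, 15 / sqrt(25) = 3, which is a nice bridge into why any single sample can mislead us.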

Head over to her GitHub for a funny, normal curve illustration and higher resolution versions of the above pictures. She also has numerous beautiful R and ggplot illustrations.

UPDATE: 11/6/19

Alison made some super cute illustrations for a topic that is simultaneously very boring but also tricky for baby statisticians: Scales of measurement.

Monday, June 3, 2019

NYT's "Stephen Curry has a popcorn problem"

1) I disagree with Marc Stein's title for this article. I don't think NBA great Stephen Curry's devotion to his favorite snack is a problem. I think it is a very, very endearing example of someone who knows themselves, knows what works for them, and embraces it. A quote from the article describing Curry's popcorn devotion:

2) Curry loves popcorn so much that, at the behest of the New York Times, he rated the popcorn served at all of the pro-basketball arenas:

Here is an example of the assessment form:

And here are the results of the NYT's n=1 study. In addition to a statistics class example, I think this could also be used in an I/O class to explain Subject Matter Experts ;)