Posts

Yule Log(arithm) Alternative (Hypothesis) Presentations

My friends. For you, I have compiled all of my funny statistic-Christmas images into one Google Slide presentation. You are welcome:  https://docs.google.com/presentation/d/12nmMw-69Ez71VmzaZ_QbUx4NOXiP17LHnvDjWUrx094/edit?usp=sharing

NBC News' "This algorithm helps catch serial killers"

I don't find many examples of cluster analysis to share, but this example is REALLY engaging (using data to find serial killers), and is simple enough for a baby statistician BUT you can also make it a more advanced lesson as the data's owners freely share their data and code. Short Version: Journalist Thomas Hargrove (and his team) used cluster analysis to find clusters of similar killings within geographic areas. These might be a sign that a serial killer is active in that geographic region. It correctly identified a killer in Indiana. I found this interview from datainnovation.org which most succinctly describes the data analysis: https://www.datainnovation.org/2017/07/5-qs-for-thomas-hargrove-founder-of-the-murder-accountability-project/ Also statsy because the cluster analysis was validated using data from known serial killers. Hargrove's data and code can be accessed  here  and more information on his overall project to solve murders can be found...

Explaining chi-square is easier when your observed data equals 100 (here, the US Senate)

UPDATE: 2020 Data: https://www.catalyst.org/knowledge/women-government When I explain chi-square at a conceptual, no-software, no-formula level, I use the example of gender distribution within the US Senate. There are 100 Senators, so the raw observed data count is the same as the observed data expressed via proportions. I think it makes it easier for junior statisticians to wrap their brains around chi-square. I usually start with a Goodness-of-Fit (or, as I like to call them, "One-sies chi-squares"). For this example, I divide senators into two groups: men and women. And what do you get? For the 115th Congress, there are 23 women and 77 men. There is your observed data, both as a raw count or as a proportion. What is your expected data? A 50/50 breakdown...which would also be 50 men and 50 women. Without doing the actual analysis, it is pretty safe to assume that, due to the great difference between expected and observed values, your chi-square Goodness o...
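For anyone who does want to see the arithmetic behind the Senate example, here is a minimal sketch in Python. The observed counts (23 women, 77 men) and the 50/50 expected split come from the example above; the function name and the p-value shortcut (which only works when there is 1 degree of freedom) are my own choices for illustration, not part of the original lesson.

```python
import math

def chi_square_gof(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Counts from the Senate example: women, men
observed = [23, 77]
# Expected counts under a 50/50 split of 100 senators
expected = [50, 50]

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1  # two categories, so 1 degree of freedom

# Assumption: this shortcut is valid only for df = 1, where the chi-square
# statistic is a squared standard normal, so p = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square({df}) = {chi2:.2f}, p = {p_value:.2g}")
```

Each category contributes 27 squared over 50, so the statistic works out to 29.16, far beyond the usual .05 critical value of 3.84 for 1 degree of freedom.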

A lesson in lying with statistics, as taught by Chrissy Teigen.

We already knew that model/cookbook author Chrissy Teigen is really good at Twitter. We recently learned that, delightfully, she is also good at spotting misrepresented statistics. This came to light when she asked for help understanding the whole Jacob Wohl Debacle. She asked her Twitter followers for a clear, quick explanation of the whole deal. She didn't even @ Jacob, but Jacob got snippy and replied with Google Trends data (how have I not blogged about Google Trends yet?) in an attempt to use data, beautiful data, in order to own Chrissy. And Chrissy was having none of it. Yes, her sweet burn is an inspiration to us all, but it is also a good demonstration of the fact that the exact same data can be interpreted in two different ways. And jerks lie with data, too, and can lie with actual, truthful data. And Chrissy knows her way around a chart.

Free beer (data)!

I am absolutely NOT above pandering to undergraduates. For example, I use beer-related examples to illustrate t-tests, correlation/regression, curvilinear relationships, and data mining/re-purposing. Here is some more. This data was collected to estimate how much more participants would pay for their beer if their beer was created in an environmentally sustainable manner. The answer? $1.30/six pack more. And 59% of respondents said that they would pay more for sustainable beer. NPR talked about it, as well as ways that breweries are going green. Here is a link to the original research. How to use in class: 1) The original research is shared via an open source journal. So, an opportunity to talk about open source research journals. 2) The data was collected via mTurk, another ancillary topic to discuss with your budding research methodologists. 3) The authors of the original study shared their beer survey data! Analyze to your heart's content. 4) How c...

The Waffle House Index, a great example for creative measurement methods.

Alright, this example is a little more abstract, but stick with me. When you perform statistics, you are measuring or counting something. And sometimes the thing you want to measure is pretty straightforward. The number of sick days an employee takes. GPA. Parts per million of some thingy in the water. But sometimes statisticians, especially psychologists, have to get a little creative and indirect with the way we measure a thing. Like the MMPI. IQ tests are our best bet at encompassing someone's intelligence but are still not perfect. Sometimes, a statistician needs to find an approximation or proxy for the actual thing they are measuring. To explain this, show your students how the Federal Emergency Management Agency uses the Waffle House Index to determine how severely damaged a town is following a hurricane or tornado. http://www2.philly.com/philly/news/weather/hurricane-florence-waffle-house-index-20180912.html If you are one of the uninitiated, Waffle Houses are...

The Knot's Real Wedding Study 2017

The Knot, a wedding planning website, collected data on the amount of money that brides and grooms spend on items for their weddings. They shared this information, as well as the average cost of a wedding in 2017. See the infographic below: BUT WAIT! If you dig into this data and the methodology, you'll find out that they only collected price points from couples who ACTUALLY PAID FOR THOSE ITEMS. https://xogroupinc.com/press-releases/the-knot-2017-real-weddings-study-wedding-spend/ Problems with this data to discuss with your students: 1) No one who got stuff for free/traded for stuff would have their $0 counted towards the average. For example, one of my cousins is a tattoo artist and he traded tattoos for use of a drone for photos of their outdoor wedding. 2) AND...if you didn't USE a service, your $0 wasn't added to their ol' mean value. For example, we had our wedding and reception at the same location, so we spent $0 on a ceremony site. 3) As poi...
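To show students why leaving out the $0s matters, here is a minimal sketch in Python. The five dollar amounts are made up purely for illustration (they are NOT from The Knot's study); the point is only to compare a mean computed from couples who paid against a mean computed from everyone.

```python
from statistics import mean

# Hypothetical ceremony-site spending for five couples (invented numbers).
# Two couples spent $0: one traded for the venue, one didn't need a separate site.
ceremony_site_spend = [2000, 0, 1500, 0, 2500]

# Mean among only the couples who actually paid (the approach described above)
paid_only = [amount for amount in ceremony_site_spend if amount > 0]
mean_paid_only = mean(paid_only)

# Mean across every couple, with the $0s counted
mean_everyone = mean(ceremony_site_spend)

print(f"Mean, payers only:  ${mean_paid_only:,.2f}")
print(f"Mean, all couples:  ${mean_everyone:,.2f}")
```

With these invented numbers, the payers-only mean is $2,000 while the everyone mean is $1,200, so dropping the $0s inflates the "average" by two-thirds.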

Two great websites that generate data sets for teaching.

You could also use these websites to generate totally unethical data for publication. Don't do it, buddy. Sometimes, it is lovely to have data generated for you when you are teaching your stats class, so that you know both the data for a particular statistical test and the results. Here are two websites that do just that. One tried and trustworthy resource was created by I/O psychologist Richard Landers. I blogged about this one in 2013, and I've used his data generator for years. My new resource is from social psychologist Andrew Luttrell. Nice things about both: -Data! -Both are easy to use. -Specific data for everything you teach in Intro Stats, like t-tests, ANOVA, correlation, and regression. -They are both free and help you do your job. Thanks, Richard and Andrew! The nice thing about Richard's is that it gives you options of several different units (days, money, etc.) AND vignettes that explain why this data was collected. You can generate data ...

Great Tweets about Statistics

I've shared these on my Twitter feed, and in a previous blog post dedicated to stats funnies. However, I decided it would be useful to have a dedicated, occasionally updated blog post devoted to Twitter Statistics Comedy Gold. How to use in class? If your students get the joke, they get a stats concept. *Aside: I know I could have embedded these Tweets, but I decided to make my life easier by using screenshots. How NOT to write a response option. Real life inter-rater reliability. Scale Development. Alright, technically not Twitter, but I am thrilled to make an exception for this clever, clever costume: This whole thread is awesome... https://twitter.com/EmpiricalDave/status/1067941351478710272 Randomness is tricky! And not random! ...