Posts

Showing posts from 2018

BBC News' "Who is your Olympic Body Match?"

This interactive website from the BBC will match your students, using their height, gender, and weight, to their Rio Olympic body match. You enter your height, weight, and age, and select your gender. It matches you with the athlete who is the most like you. It also provides good examples of distributions, and of where you fall on the distribution for Olympic athletes. I think it also gets students thinking about regression models. After you enter your data, the page returns information about where you fall on the distribution histogram for Olympic athletes by height, weight, and age for your gender. Then, the website returns your top matches: How to use in class: 1) What other IVs could you collect to determine best sport match (DV)? Family income (I had access to soccer growing up, but not dressage horses)? Average temperature of hometown (My high school had a skiing club but not a beach volleyball club)? This gets your students thinking about multiple regression ...

A bunch of pediatricians swallowed Lego heads. You can use their research to teach the basics of research methods and stats.

As a research-parent-nerd joke before Christmas, six doctors swallowed Lego heads and recorded how long it took to pass the Lego heads. Why? To inform parents about the lack of danger associated with your kid swallowing a tiny toy. I encourage you to use it as a class example because it is short, describes its research methodology (a within-subject design) very clearly, and has a couple of means, standard deviations, and even a correlation. TL;DR: https://dontforgetthebubbles.com/dont-forget-the-lego/ In greater detail: Note the use of a within-subject design. They also operationalized their DV via the SHAT (Stool Hardness and Transit) scale. *Yeah. So here is the Bristol Stool Chart mentioned in the above excerpt. Please don't click on the link if you are eating or have a sensitive stomach. Research outcomes, including means and standard deviations: An example of a non-significant correlation, with the SHAT score on the y-axi...

Naro's "Why can't anyone replicate the scientific studies from those eye-grabbing headlines?"

Maki Naro created a terrific comic strip detailing the replication crisis: where it came from, where we are, and possible solutions. You can use it in class to introduce the crisis and solutions. I particularly enjoy the overall tone: Hope is not lost. This is a time of change in statistics and methodology that will ultimately make science better. A few highlights: *History of science, including the very first research journal (and why the pressure to get published has led to bad science) *Illustration of some statsy ways to bend the truth in science *References big moments in the Replication Crisis *Discusses the crisis AND solutions (PLOS, SIPS, COS)

Coolness Graphed by RC Jones

They are bar graphs. And they are funny.

Yule Log(arithm) Alternative (Hypothesis) Presentations

My friends. For you, I have compiled all of my funny statistics-Christmas images into one Google Slides presentation. You are welcome: https://docs.google.com/presentation/d/12nmMw-69Ez71VmzaZ_QbUx4NOXiP17LHnvDjWUrx094/edit?usp=sharing

NBC News' "This algorithm helps catch serial killers"

I don't find many examples of cluster analysis to share, but this example is REALLY engaging (using data to find serial killers), and is simple enough for a baby statistician BUT you can also make it a more advanced lesson as the data's owners freely share their data and code. Short Version: Journalist Thomas Hargrove (and his team) used cluster analysis to find clusters of similar killings within geographic areas. These might be a sign that a serial killer is active in that geographic region. It correctly identified a killer in Indiana. I found this interview from datainnovation.org which most succinctly describes the data analysis: https://www.datainnovation.org/2017/07/5-qs-for-thomas-hargrove-founder-of-the-murder-accountability-project/ Also statsy because the cluster analysis was validated using data from known serial killers. Hargrove's data and code can be accessed  here  and more information on his overall project to solve murders can be found...

Explaining chi-square is easier when your observed data equals 100 (here, the US Senate)

UPDATE: 2020 Data: https://www.catalyst.org/knowledge/women-government When I explain chi-square at a conceptual, no-software, no-formula level, I use the example of gender distribution within the US Senate. There are 100 Senators, so the raw observed data count is the same as the observed data expressed via proportions. I think it makes it easier for junior statisticians to wrap their brains around chi-square. I usually start with a Goodness-of-Fit (or, as I like to call them, "One-sies chi-squares"). For this example, I divide senators into two groups: men and women. And what do you get? For the 115th Congress, there are 23 women and 77 men. There is your observed data, both as a raw count and as a proportion. What is your expected data? A 50/50 breakdown...which would also be 50 men and 50 women. Without doing the actual analysis, it is pretty safe to assume that, due to the great difference between expected and observed values, your chi-square Goodness o...
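
For anyone who does want to see the arithmetic behind that Senate example, here is a minimal sketch in Python. The observed counts (23 women, 77 men) and the 50/50 expectation come from the excerpt above; the variable names are just illustrative, and the p-value shortcut only works because this test has one degree of freedom.

```python
import math

# Observed counts from the post: US Senate, 115th Congress
observed = {"women": 23, "men": 77}

# Expected counts under a 50/50 split of the 100 senators
total = sum(observed.values())
expected = {group: total / 2 for group in observed}

# Goodness-of-fit statistic: sum of (observed - expected)^2 / expected
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
df = len(observed) - 1

# With df = 1, the chi-square p-value has a closed form using the
# complementary error function (this shortcut does not hold for df > 1)
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square({df}) = {chi_square:.2f}, p = {p_value:.2g}")
```

The statistic works out to 29.16, far beyond the 3.84 critical value for one degree of freedom at alpha = .05, which matches the "pretty safe to assume" intuition.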

A lesson in lying with statistics, as taught by Chrissy Teigen.

We already knew that model/cookbook author Chrissy Teigen is really good at Twitter. We recently learned that, delightfully, she is also good at spotting misrepresented statistics. This came to light when she asked for help understanding the whole Jacob Wohl Debacle. She asked her Twitter followers for a clear, quick explanation of the whole deal. She didn't even @ Jacob, but Jacob got snippy and replied back with Google Trends data (how have I not blogged about Google Trends yet?) in an attempt to use data, beautiful data, in order to own Chrissy. And Chrissy was having none of it. Yes, her sweet burn is an inspiration to us all, but it is also a good demonstration of the fact that the exact same data can be interpreted in two different ways. And jerks lie with data, too, and can lie with actual, truthful data. And Chrissy knows her way around a chart.

Free beer (data)!

I am absolutely NOT above pandering to undergraduates. For example, I use beer-related examples to illustrate t-tests, correlation/regression, curvilinear relationships, and data mining/re-purposing. Here is some more. This data was collected to estimate how much more participants would pay for their beer if their beer was created in an environmentally sustainable manner. The answer? $1.30/six pack more. And 59% of respondents said that they would pay more for sustainable beer. NPR talked about it, as well as ways that breweries are going green. Here is a link to the original research. How to use in class: 1) The original research is shared via an open source journal. So, an opportunity to talk about open source research journals. 2) The data was collected via mTurk, another ancillary topic to discuss with your budding research methodologists. 3) The authors of the original study shared their beer survey data! Analyze to your heart's content. 4) How c...

The Waffle House Index, a great example for creative measurement methods.

Alright, this example is a little more abstract, but stick with me. When you perform statistics, you are measuring or counting something. And sometimes the thing you want to measure is pretty straightforward. The number of sick days an employee takes. GPA. Parts per million of some thingy in the water. But sometimes statisticians, especially psychologists, have to get a little creative and indirect with the way we measure a thing. Like the MMPI. IQ tests are our best bet at encompassing someone's intelligence but are still not perfect. Sometimes, a statistician needs to find an approximation or proxy for the actual thing they are measuring. To explain this, show your students how the Federal Emergency Management Agency uses the Waffle House Index to determine how severely damaged a town is following a hurricane or tornado. http://www2.philly.com/philly/news/weather/hurricane-florence-waffle-house-index-20180912.html If you are one of the uninitiated, Waffle Houses are...

The Knot's Real Wedding Study 2017

The Knot, a wedding planning website, collected data on the amount of money that brides and grooms spend on items for their weddings. They shared this information, as well as the average cost of a wedding in 2017. See the infographic below: BUT WAIT! If you dig into this data and the methodology, you'll find out that they only collected price points from couples who ACTUALLY PAID FOR THOSE ITEMS. https://xogroupinc.com/press-releases/the-knot-2017-real-weddings-study-wedding-spend/ Problems with this data to discuss with your students: 1) No one who got stuff for free/traded for stuff would have their $0 counted towards the average. For example, one of my cousins is a tattoo artist and he traded tattoos for use of a drone for photos of their outdoor wedding. 2) AND...if you didn't USE a service, your $0 wasn't added to their ol' mean value. For example, we had our wedding and reception at the same location, so we spent $0 on a ceremony site. 3) As poi...
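
To show students why leaving out the $0s matters, here is a small sketch in Python. The spending figures are entirely hypothetical (they are not The Knot's data); the point is only that a mean computed among couples who paid will sit above a mean computed across all couples.

```python
from statistics import mean

# Hypothetical ceremony-site spending for ten couples (NOT The Knot's data).
# A 0 means the couple paid nothing: a free venue, a trade, or no separate site.
ceremony_spend = [0, 0, 0, 0, 1500, 2000, 2500, 0, 3000, 0]

# Keep only the couples who actually paid, as the study's method does
payers_only = [x for x in ceremony_spend if x > 0]

mean_payers = mean(payers_only)       # average among couples who paid
mean_everyone = mean(ceremony_spend)  # average across all ten couples

print(f"Mean among payers only: ${mean_payers:,.0f}")
print(f"Mean across all couples: ${mean_everyone:,.0f}")
```

With these made-up numbers, the payers-only mean is $2,250 while the mean across everyone is $900. Same couples, very different "average cost."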

Two great websites that generate data sets for teaching.

You could also use these websites to generate totally unethical data for publication. Don't do it, buddy. Sometimes, it is lovely to have data generated for you when you are teaching a stats class, so that you already know which statistical test the data fits and what the results will be. Here are two websites that do just that. One tried and trustworthy resource was created by I/O psychologist Richard Landers. I blogged about this one in 2013, and I've used his data generator for years. My new resource is from social psychologist Andrew Luttrell. Nice things about both: -Data! -Both are easy to use. -Specific data for everything you teach in Intro Stats, like t-tests, ANOVA, correlation, and regression. -They are both free and help you do your job. Thanks, Richard and Andrew! The nice thing about Richard's is that it gives you options of several different units (days, money, etc.) AND vignettes that explain why this data was collected. You can generate data ...
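
If you are curious what "generated data with known results" can look like under the hood, here is a rough sketch in Python. To be clear, this is not how either Richard's or Andrew's site works; the function names, the default means and standard deviation, and the seed are all made up for illustration.

```python
import random
from statistics import mean, stdev

def make_two_groups(n=30, mean_a=50.0, mean_b=60.0, sd=10.0, seed=2018):
    """Draw two independent normal samples with known population values.

    All defaults are hypothetical teaching values, not taken from any site.
    """
    rng = random.Random(seed)  # fixed seed so every student gets the same data
    group_a = [rng.gauss(mean_a, sd) for _ in range(n)]
    group_b = [rng.gauss(mean_b, sd) for _ in range(n)]
    return group_a, group_b

def welch_t(a, b):
    """Welch's t statistic for two independent samples (b minus a)."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    return (mean(b) - mean(a)) / se

group_a, group_b = make_two_groups()
t_stat = welch_t(group_a, group_b)
print(f"M_a = {mean(group_a):.2f}, M_b = {mean(group_b):.2f}, t = {t_stat:.2f}")
```

Because the seed is fixed, the same call always returns the same two groups, so you can work out the answer key once and hand the data to the whole class.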

Great Tweets about Statistics

I've shared these on my Twitter feed, and in a previous blog post dedicated to stats funnies. However, I decided it would be useful to have a dedicated, occasionally updated blog post devoted to Twitter Statistics Comedy Gold. How to use in class? If your students get the joke, they get a stats concept. *Aside: I know I could have embedded these Tweets, but I decided to make my life easier by using screenshots. How NOT to write a response option. Real life inter-rater reliability Scale Development Alright, technically not Twitter, but I am thrilled to make an exception for this clever, clever costume: This whole thread is awesome... https://twitter.com/EmpiricalDave/status/1067941351478710272 Randomness is tricky! And not random! ...