Posts

Ace's science fair project about Tom Brady: How to use as a class warm-up exercise

Stick with me here. I think this would be a great warm-up activity early in the semester. My boy Ace had a research hypothesis, operationalized his research, tried to collect data points using several test subjects, and measured his outcomes. Here is the original interview from Draft Diamonds and Newsweek's story. 1) How did he operationalize his hypothesis? What was his IV? DV? 2) Did he use proper APA headers? Should APA style require the publication of pictures of crying researchers if their findings don't replicate? 3) This data could be analyzed using a repeated-measures ANOVA. He had various members of his family throw a football at different PSIs, measured how far the ball traveled, and calculated the mean for three attempts at each PSI. 4) His only participants were his mom, dad, and sister. So, this study is probably underpowered. 5) In this video from NBC News, Ace's dad describes how they came up with the research idea. Ace i...
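If you want to show students what a repeated-measures ANOVA is doing with a design like this one, here is a minimal sketch. The distances below are invented for illustration and are NOT Ace's data; the three columns stand in for three unspecified PSI levels, and the function name `repeated_measures_f` is my own. It only computes the F statistic and degrees of freedom, so you would still look up the critical value (or use software) to get a p-value.

```python
# Hypothetical mean throw distances (yards): one row per thrower, one column per PSI level.
# These numbers are made up for the example and are NOT from Ace's project.
distances = {
    "mom":    [10.0, 12.0, 14.0],
    "dad":    [20.0, 22.0, 27.0],
    "sister": [6.0, 8.0, 7.0],
}

def repeated_measures_f(data):
    """One-way repeated-measures ANOVA: return (F, df_conditions, df_error)."""
    rows = list(data.values())
    n = len(rows)        # number of participants
    k = len(rows[0])     # number of conditions (PSI levels)
    grand_mean = sum(sum(r) for r in rows) / (n * k)
    condition_means = [sum(r[j] for r in rows) / n for j in range(k)]
    subject_means = [sum(r) / k for r in rows]

    ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    # Removing the person-to-person differences is what makes the design "repeated measures".
    ss_error = ss_total - ss_conditions - ss_subjects

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_conditions / df_conditions) / (ss_error / df_error)
    return f_stat, df_conditions, df_error

f_stat, df1, df2 = repeated_measures_f(distances)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

With only three throwers, the error term has just 4 degrees of freedom, which is a nice concrete way to talk about why the study is probably underpowered.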

Natural graph created by the sun, a magnifying glass, and a tree.

Someone on Reddit posted this cool picture of a...contraption? I'll go with contraption. Anyway, it automatically generates a chart of the amount of sunlight per day by burning a log. A Twitter follower recognized this as a Campbell-Stokes recorder . This is beautiful art and data visualization from Hood-Glen Park in San Francisco. How to use in class: 1) Make a bunch of really dumb log arithm jokes. 2) A nice introduction to data visualization. Maybe this could be paired with more traditional sources of weather data. 3) Also makes me think of other naturally occurring charts: Also, while less pretty, think about all the data that is automatically created every time Google Maps identifies your location (and then warns everyone using Google Maps to avoid traffic slowdowns) or Netflix provides you with recommendations based on viewing habits. The Campbell-Stokes recorder could serve as a metaphorical segue into a discussion about all the automated data collectio...

Daily Cycles in Twitter Content: Psychometric Indicators

Here is a YouTube video that summarizes some research findings. The researchers looked at Tweets in order to study how our focus and emotions change with our sleep/wake cycles. And the findings are interesting and not terribly surprising. Folks are mellow and rational in the morning and contemplate their mortality at 2 AM. Make money, get paid. And THIS is why I go to bed by 9 PM. I don't need to think about death at 2:20 AM. How to use in class: 1) Archival data (via Tweet) to explore human emotion. 2) What are the shortcomings of this sampling method? To be sure, their data set is ENORMOUS, but how are Twitter users different from other people? Do your students think these findings would hold for people who work the night shift? 3) Go back to the original paper and look more closely at the findings: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0197002 4) This data represents one of the ways that researchers collect real-time information ...

Aschwanden's "Why We Still Don’t Know How Many NFL Players Have CTE"

This story by Christine Aschwanden from 538.com describes the limitations of a JAMA article. That JAMA article describes a research project that found signs of Chronic Traumatic Encephalopathy (CTE) in 110 out of 111 brains of former football players. How to use in stats and research methods: 1) It is research, y'all. 2) One of the big limitations of this paper comes from sampling. 3) The 538 article includes a number of thought experiments that grapple with the sampling distribution for all possible football players. 4) Possible measurement errors in CTE detection. 5) Discussion of replication using a longitudinal design and a control group. The research: The JAMA article details a study of 111 brains donated by the families of deceased football players. They found evidence of CTE in 110 of the brains. Which sounds terrifying if you are a current football player, right? But does this actually mean that 110 out of 111 football players will develop CTE...
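To make the sampling problem concrete, here is a small thought-experiment sketch of my own (it is not taken from the 538 article or the JAMA paper). It assumes families are far more likely to donate a brain when the player showed symptoms, and every number in it is hypothetical. The function name `cte_share_in_donated_brains` is also just something I made up for the example.

```python
def cte_share_in_donated_brains(prevalence, donate_rate_cte, donate_rate_no_cte):
    """Expected share of donated brains with CTE, given a true prevalence
    and (hypothetical) donation rates for players with and without CTE."""
    with_cte = prevalence * donate_rate_cte
    without_cte = (1 - prevalence) * donate_rate_no_cte
    return with_cte / (with_cte + without_cte)

# Hypothetical donation rates: 10% of families donate when the player had CTE,
# 0.05% donate when he did not. These are invented numbers, not estimates.
for prevalence in (0.05, 0.20, 0.50):
    share = cte_share_in_donated_brains(prevalence, 0.10, 0.0005)
    print(f"true prevalence {prevalence:.0%} -> {share:.1%} of donated brains show CTE")
```

The point for students: under these made-up donation rates, very different true prevalences all produce a donated sample that is overwhelmingly CTE-positive, so the sample alone can't tell you the population rate.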

The Novice Professor's "Teaching statistical methods mostly formula free"

Nothing freaks out your students faster than a formula, right? Karly over at The Novice Professor shares some worksheets she created for her students to step them through a few of the most common Intro Stats formulas: standard deviation, z-scores, and correlation. http://www.thenoviceprofessor.com/blog/teaching-statistical-methods-mostly-formula-free Reasons to use in class: 1) Statistics has its own anxiety scale. I think a lot of that anxiety comes from the math part of a stats class. These handouts allow you to introduce the math and formulas without ever using the math and formulas. 2) I am a big fan of introducing statistics conceptually then getting into the nitty gritty of calculation, interpretation of output, etc. I like the formula-free approach here in order to introduce the idea of what frequently used stats, like SD, are really doing.
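In the same step-by-step spirit, here is a rough sketch of what the standard deviation and z-score calculations are doing, one small step at a time. This is my own illustration, not a reproduction of Karly's worksheets, and the scores are made up.

```python
import math
import statistics

# Hypothetical quiz scores, invented for this example.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: find the mean.
mean = sum(scores) / len(scores)
# Step 2: how far is each score from the mean?
deviations = [x - mean for x in scores]
# Step 3: square those distances so negatives don't cancel positives.
squared = [d ** 2 for d in deviations]
# Step 4: average the squared distances (dividing by n - 1 for a sample).
variance = sum(squared) / (len(scores) - 1)
# Step 5: take the square root to get back to the original units.
sd = math.sqrt(variance)

# A z-score is just "how many SDs from the mean is this score?"
z_scores = [d / sd for d in deviations]

print(f"mean = {mean}, SD = {sd:.3f}")
# Sanity check against Python's built-in sample standard deviation.
print(f"statistics.stdev agrees: {math.isclose(sd, statistics.stdev(scores))}")
```

Walking through steps 2 to 5 out loud is usually enough to show students that SD is roughly "the typical distance from the mean" before they ever see the formula.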

BBC News' "Who is your Olympic Body Match?"

This interactive website from the BBC will match your student, using their height, gender, and weight, to their Rio Olympic body match. You enter your height, weight, age, and select your gender. It matches you with the athlete who is the most like you. It also provides good examples for distribution, and where you fall on the distribution, for Olympic athletes. I think it also gets students thinking about regression models. After you enter your data, the page returns information about where you fall on the distribution histogram for Olympic athletes by height, weight, and age for your gender. Then, the website returns your top matches: How to use in class: 1) What other IVs could you collect to determine best sport match (DV)? Family income (I had access to soccer growing up, but not dressage horses)? Average temperature of hometown (My high school had a skiing club but not a beach volleyball club)? This gets your students thinking about multiple regression ...

A bunch of pediatricians swallowed Lego heads. You can use their research to teach the basics of research methods and stats.

As a research-parent-nerd joke before Christmas, six doctors swallowed Lego heads and recorded how long it took to pass the Lego heads. Why? To inform parents about the lack of danger associated with your kid swallowing a tiny toy. I encourage you to use it as a class example because it is short, it describes its research methodology very clearly, using a within-subject design, has a couple of means, standard deviations, and even a correlation. TL;DR: https://dontforgetthebubbles.com/dont-forget-the-lego/ In greater detail: Note the use of a within-subject design. They also operationalized their DV via the SHAT (Stool Hardness and Transit) scale. *Yeah. So here is the Bristol Stool Chart mentioned in the above excerpt. Please don't click on the link if you are eating or have a sensitive stomach. Research outcomes, including mean and standard deviations: An example of a non-significant correlation, with the SHAT score on the y-axi...

Naro's "Why can't anyone replicate the scientific studies from those eye-grabbing headlines?"

Maki Naro created a terrific comic strip detailing the replication crisis, where it came from, where we are, and possible solutions. You can use it in class to introduce the crisis and solutions. I particularly enjoy the overall tone: Hope is not lost. This is a time of change in statistics and methodology that will ultimately make science better. A few highlights: *History of science, including the very first research journal (and why the pressure to get published has led to bad science) *Illustration of some statsy ways to bend the truth in science *References big moments in the Replication Crisis *Discusses the crisis AND solutions (PLOS, SIPS, COS)

Coolness Graphed by RC Jones

They are bar graphs. And they are funny.

Yule Log(arithm) Alternative (Hypothesis) Presentations

My friends. For you, I have compiled all of my funny statistic-Christmas images into one Google Slide presentation. You are welcome:  https://docs.google.com/presentation/d/12nmMw-69Ez71VmzaZ_QbUx4NOXiP17LHnvDjWUrx094/edit?usp=sharing

NBC News' "This algorithm helps catch serial killers"

I don't find many examples of cluster analysis to share, but this example is REALLY engaging (using data to find serial killers), and is simple enough for a baby statistician BUT you can also make it a more advanced lesson as the data's owners freely share their data and code. Short Version: Journalist Thomas Hargrove (and his team) used cluster analysis to find clusters of similar killings within geographic areas. These might be a sign that a serial killer is active in that geographic region. It correctly identified a killer in Indiana. I found this interview from datainnovation.org which most succinctly describes the data analysis: https://www.datainnovation.org/2017/07/5-qs-for-thomas-hargrove-founder-of-the-murder-accountability-project/ Also statsy because the cluster analysis was validated using data from known serial killers. Hargrove's data and code can be accessed  here  and more information on his overall project to solve murders can be found...

Explaining chi-square is easier when your observed data equals 100 (here, the US Senate)

UPDATE: 2020 Data: https://www.catalyst.org/knowledge/women-government When I explain chi-square at a conceptual, no-software, no-formula level, I use the example of gender distribution within the US Senate. There are 100 Senators, so the raw observed data count is the same as the observed data expressed via proportions. I think it makes it easier for junior statisticians to wrap their brains around chi-square. I usually start with a Goodness-of-Fit (or, as I like to call them, "One-sies chi-squares"). For this example, I divide senators into two groups: men and women. And what do you get? For the 115th Congress, there are 23 women and 77 men. There is your observed data, both as a raw count or as a proportion. What is your expected data? A 50/50 breakdown...which would also be 50 men and 50 women. Without doing the actual analysis, it is pretty safe to assume that, due to the great difference between expected and observed values, your chi-square Goodness o...
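For anyone who does want to see the actual analysis after the conceptual version, here is a minimal sketch using the 115th Congress counts above (23 women, 77 men) against a 50/50 expectation. It uses only Python's standard library; the p-value shortcut with `math.erfc` works only because this test has 1 degree of freedom, so treat it as a classroom convenience rather than a general method.

```python
import math

observed = {"women": 23, "men": 77}   # 115th Congress counts from the post
expected = {"women": 50, "men": 50}   # the 50/50 expectation

# Chi-square: sum of (observed - expected)^2 / expected across the groups.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
df = len(observed) - 1

# With df = 1, chi-square is a squared z-score, so the p-value is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square({df}) = {chi_square:.2f}, p = {p_value:.2g}")
```

The statistic comes out to 29.16, far beyond the 3.84 critical value for df = 1 at alpha = .05, which matches the "pretty safe to assume" intuition.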

A lesson in lying with statistics, as taught by Chrissy Teigen.

We already knew that model/cookbook author Chrissy Teigen is really good at Twitter. We recently learned that, delightfully, she is also good at spotting misrepresented statistics. This came to light when she asked for help understanding the whole Jacob Wohl Debacle. She asked her Twitter followers for a clear, quick explanation of the whole deal. She didn't even @ Jacob, but Jacob got snippy and replied back with Google Trends data (how have I not blogged about Google Trends yet?) in an attempt to use data, beautiful data, in order to own Chrissy. And Chrissy was having none of it. Yes, her sweet burn is an inspiration to us all, but it is also a good demonstration of the fact that the exact same data can be interpreted in two different ways. And jerks lie with data, too, and can lie with actual, truthful data. And Chrissy knows her way around a chart.