Not awful and boring ideas for teaching statistics

Posts

Showing posts from September, 2015

Aschwanden's "Science is broken, it is just a hell of a lot harder than we give it credit for"

Aschwanden (for fivethirtyeight.com) wrote an extensive piece that summarizes the data/p-hacking/what's-wrong-with-statistical-significance crisis in statistics. There is a focus on the social sciences, including some quotes from Brian Nosek regarding his replication work. The report also draws attention to Retraction Watch and the Center for Open Science, as well as to retractions of findings (as an indicator of fraud and data misuse). The article also describes our funny bias of sticking to early, big research findings even after those research findings are disproved (the example used here is the breakfast eating:weight loss relationship). The whole article could be used for a statistics or research methods class. I do think that the p-hacking interactive tool found in this report could be an especially useful illustration of How to Lie with Statistics. The "Hack your way to scientific glory" interactive piece demonstrates that if you fool around enough with your operationalized...

Correlation example using research study about reusable shopping bags/shopping habits

A few weeks ago, I used an NPR story in order to create an ANOVA example for use in class. This week, I'm giving the same treatment to a different research study discussed on NPR and turning it into a correlation example. A recent research study found that individuals who use reusable grocery store bags tend to spend more on both organic food AND junk food. Here is NPR's treatment of the research. Here is a more detailed account of the research via an interview with one of the study's authors. Here is the working paper that the PIs have released for even more detail. The researchers frame their findings (folks who are "good" by using reusable bags and purchasing organic food then feel entitled to indulge in some chips and cookies) via "licensing", but I think this could also be explained by ego depletion (opening up a discussion about that topic). So, I created a little faux data set that replicates the main finding: Folks who use reusable ...
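For anyone who wants to build a similar classroom example in code, here is a minimal Python sketch of a Pearson correlation. Everything in it is an assumption for illustration: the variable names, the "trips with reusable bags" measure, and the numbers are all made up. They are not the data from the study or the faux data set mentioned above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL data for ten shoppers (invented for illustration only):
# trips (out of 10) on which the shopper brought reusable bags,
# and dollars spent on junk food and on organic food over those trips.
reusable_trips = [0, 1, 2, 4, 5, 6, 8, 9, 10, 10]
junk_food_dollars = [4, 6, 5, 9, 8, 12, 11, 15, 14, 16]
organic_dollars = [10, 9, 14, 15, 19, 18, 24, 23, 27, 30]

r_junk = pearson_r(reusable_trips, junk_food_dollars)
r_organic = pearson_r(reusable_trips, organic_dollars)
print(f"reusable bags vs. junk food spending: r = {r_junk:.2f}")
print(f"reusable bags vs. organic spending:   r = {r_organic:.2f}")
```

Both coefficients come out positive for these invented numbers, which is the pattern the study reports. Students can change a few values and watch r move.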

Mersereau's "Wunderground Uses Fox News Graphing Technique to Boast Forecast Skills"

Mersereau, writing for Gawker website The Vane, provides another example of How Not To Graph. Or How To Graph So As To Not Lie About Data But Make Your Data Look More Impressive Than Is Ethical. Weather Underground (AKA Wunderground, weather forecasting service/website) was bragging about its accuracy compared to the competition. At first glance (see below), this graph seems to reinforce the argument...until you take a look at the scale being used. The beginning point on the X axis is 70, while the high point is 80. So, really, the differences listed probably don't even approach statistical significance. This story, somewhat randomly, also includes some shady graphs created by Fox News. I don't understand the need for the extra Fox News graphs, but they also illustrate how one can create graphs that have accurate numbers but still manage to twist the truth.
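To put a number on how much a truncated axis exaggerates a difference, here is a small Python sketch. The two accuracy scores are hypothetical (they are not the values from the Wunderground graph); only the axis starting point of 70 comes from the story.

```python
def drawn_bar_ratio(larger, smaller, axis_start=0.0):
    """How many times longer the larger bar is DRAWN than the smaller one
    when the axis begins at axis_start instead of zero."""
    if smaller <= axis_start:
        raise ValueError("the smaller value must sit above the axis starting point")
    return (larger - axis_start) / (smaller - axis_start)


# HYPOTHETICAL accuracy scores (percent), invented for illustration
ours, theirs = 78.0, 73.0

honest = drawn_bar_ratio(ours, theirs, axis_start=0)      # axis starts at zero
truncated = drawn_bar_ratio(ours, theirs, axis_start=70)  # axis starts at 70

print(f"Axis from 0:  our bar looks {honest:.2f}x as long")
print(f"Axis from 70: our bar looks {truncated:.2f}x as long")
```

With these made-up scores, a difference of five points looks like a bar that is more than two and a half times as long once the axis starts at 70, even though every number on the graph is accurate.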

Dayna Evans' "Do You Live in a "B@%$#" or a "F*%&" State? American Curses, Mapped"

Warning: This research and story include every paint-peeling obscenity in the book. Caution should be used when opening up these links on your work computer and you should really think long and hard before providing these links to your students. However, the research I'm about to describe 1) illustrates z-scores and 2) investigated regional usage of safe-for-the-classroom words like darn, damn, and gosh. So, a linguist, Dr. Jack Grieve, decided to use Twitter data to map out the use of different obscenities by county of the United States. Gawker picked up on this research and created a story about it. How can this be used in a statistics class? In order to quantify greater or lesser use of different obscenities, he created z-scores by county and illustrated the difference via a color-coding system. The more orange, the higher the z-score for a region (thus, greater usage) while blue indicates lesser usage. And, there are three such maps (damn, darn, and gosh) that are safe for us...
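For a classroom-safe version of the same idea, here is a short Python sketch of turning per-county usage rates into z-scores. The county names and rates are hypothetical, and using the population standard deviation is my assumption; the story does not say exactly how the z-scores were computed.

```python
from statistics import mean, pstdev


def z_scores(rates):
    """Convert a {county: rate} dict into {county: z-score}."""
    values = list(rates.values())
    mu = mean(values)
    sigma = pstdev(values)  # population SD (an assumption, see above)
    if sigma == 0:
        raise ValueError("z-scores are undefined when every county has the same rate")
    return {county: (rate - mu) / sigma for county, rate in rates.items()}


# HYPOTHETICAL uses of "gosh" per 10,000 tweets, invented for illustration
gosh_rates = {
    "County A": 12.0,
    "County B": 7.5,
    "County C": 3.0,
    "County D": 9.0,
    "County E": 6.0,
}

for county, z in sorted(z_scores(gosh_rates).items(), key=lambda item: item[1]):
    # Mirror the map's color scheme: orange for above-average use, blue for below
    shade = "orange" if z > 0 else "blue"
    print(f"{county}: z = {z:+.2f} ({shade})")
```

Counties with positive z-scores use the word more than the average county (the orange end of the map), and counties with negative z-scores use it less (the blue end).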