

Showing posts with the label reliability

A rank ordering of the Taylor Swift songbook.

File under: end-of-the-semester stress blogging about a person who brings me joy, Taylor Swift (see: sampling error with Taylor). Here is a new, VERY accessible example of ordinal data. Rob Sheffield, writing for Rolling Stone, rank-ordered ALL of Dr. Swift's songs: https://www.rollingstone.com/music/music-lists/taylor-swift-songs-ranked-rob-sheffield-201800/bad-blood-2014-196114/ Also, introduce your students to his Methods section 😁. The rank order is based on the variable "Taylor genius." You could even use this as an anti-example of inter-rater reliability: the ranking comes from exactly one rater, so there is no second set of ratings to check it against. AND YOU'RE ON YOUR OWN KID DESERVED BETTER. Each entry includes the best lyric from the song as well as a brief description of the Taylor Genius on display. Is this also an example of qualitative data? https://www.rollingstone.com/music/music-lists/taylor-swift-songs-ranked-rob-sheffield-201800/the-great-war-2022-1234617639/
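
If you want to make the inter-rater reliability point concrete, here is a minimal sketch (in Python, using hypothetical rankings, not Sheffield's actual list) of how you could check agreement between two raters' ordinal rankings with a Spearman rank correlation:

# Spearman rank correlation as a quick inter-rater reliability check
# for ordinal data. The songs and both sets of ranks are invented.
from scipy.stats import spearmanr

songs = ["All Too Well", "You're On Your Own, Kid", "Bad Blood",
         "The Great War", "Cruel Summer"]
rater_1 = [1, 2, 5, 4, 3]  # ranks from rater 1 (1 = most Taylor genius)
rater_2 = [2, 1, 5, 3, 4]  # ranks from a second, hypothetical rater

rho, p = spearmanr(rater_1, rater_2)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
# A high rho means the raters largely agree. With only one rater,
# there is nothing to correlate, which is exactly Sheffield's situation.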

NYT American dialect quiz as an example of validity and reliability.

TL;DR: Ameri-centric teaching example ahead: Have your students take this quiz, and the internet will tell them which regions of the US speak the way they do. Use it to teach validity. Longer version: The NYT created a gorgeous version ( https://www.nytimes.com/interactive/2014/upshot/dialect-quiz-map.html ) of a previously available quiz ( http://www.tekstlab.uio.no/cambridge_survey/ ) that tells the user which version of American English they speak. The prediction is based upon loads and loads of survey data about how we talk. The quiz takes you through 25 questions that ask how you pronounce certain words and which regional words you use to describe certain things. Here are my results: Indeed, I spent elementary school in Northern Virginia, my adolescence in rural Central PA, college at PSU, and I now live in the far NW corner of PA. As this test indeed picked up on where I've lived and talked, I would say that this is a valid test based just on my u...
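
For students who want to peek under the hood, here is a minimal sketch of the general idea only (a toy nearest-region match over invented answer frequencies; the NYT's actual model is far more sophisticated):

# Toy dialect quiz: score each region by how often its speakers
# give the respondent's answers. All frequencies are invented.
regions = {
    "Northeast": {"soda": 0.7, "pop": 0.1, "coke": 0.2},
    "Midwest":   {"soda": 0.2, "pop": 0.7, "coke": 0.1},
    "South":     {"soda": 0.1, "pop": 0.1, "coke": 0.8},
}

def best_region(answers):
    """Return the region whose answer frequencies best match the respondent."""
    scores = {
        region: sum(freqs.get(a, 0.0) for a in answers)
        for region, freqs in regions.items()
    }
    return max(scores, key=scores.get)

print(best_region(["pop"]))  # -> "Midwest"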

Stromberg and Caswell's "Why the Myers-Briggs test is totally meaningless"

Oh, Myers-Briggs Type Indicator, you unkillable scamp. This video, from Vox, gives a concise historical perspective on the scale, describes how popular it still is, and summarizes several of the arguments against it. The video explains why the ol' MBTI is not particularly useful. It is good for debunking psychology myths and for explaining reliability (in particular, test-retest reliability) and validity. I like this link in particular because it presents its argument via both video and a smartly formatted website. The text on the website includes links to actual peer-reviewed research articles that refute the MBTI.
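
As a quick classroom companion, here is a minimal sketch of test-retest reliability as a correlation between two administrations of the same scale (the scores below are hypothetical):

# Test-retest reliability: correlate scores from the same people
# measured twice. Critics of the MBTI note that many retakers land
# in a different type on retest. All scores are invented.
from scipy.stats import pearsonr

time_1 = [12, 18, 9, 22, 15, 11, 20, 14]   # trait scores, first session
time_2 = [13, 17, 10, 21, 16, 19, 12, 15]  # same people, weeks later

r, p = pearsonr(time_1, time_2)
print(f"test-retest r = {r:.2f}")
# A common psychometric rule of thumb treats r around .70 or higher
# as acceptable stability for a trait measure.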

Chris Taylor's "No, there's nothing wrong with your Fitbit"

Taylor, writing for Mashable, describes what happens when carefully conducted public health research (published in the Journal of the American Medical Association) becomes attention-grabbing, poorly represented clickbait. Data published in JAMA (Case, Burwick, Volpp, & Patel, 2015) tested the step-counting reliability of various wearable fitness-tracking devices and smart phone apps (see the data below). In addition to checking the reliability of the various devices, the article argues that, from a public health perspective, lots of people have smart phones but not nearly as many people have fitness trackers. So, a way to encourage wellness may be to encourage people to use the fitness capacities within their smart phones (easier and cheaper than buying a fitness tracker). The authors never argue that fitness trackers are bad, just that 1) some are more reliable than others and 2) the easiest way to get people to engage in more mindful walking...
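
If you want students to compute device reliability the way the JAMA team framed it, here is a minimal sketch (the step counts below are invented for illustration, not Case et al.'s data) comparing each device's count against a directly observed step count:

# Relative error of step counters against a directly observed count.
# All numbers are hypothetical; see Case et al. (2015) for real data.
observed = 500  # steps counted by a human observer

device_counts = {
    "phone_app_a": 492,
    "phone_app_b": 510,
    "wearable_a": 455,
    "wearable_b": 535,
}

for device, count in device_counts.items():
    pct_error = 100 * (count - observed) / observed
    print(f"{device}: {pct_error:+.1f}% relative to observed")
# Devices that stay within a few percent of observed counts across
# trials are the more reliable ones; large, inconsistent deviations
# are the real story in the JAMA data.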

Research Wahlberg

" Mark Wahlberg as Research Scholar. Boom." Follow on Facebook or at twitter via  @ ResearchMark  

Five Labs' Big Five Personality Predictor

Five Labs created an app that predicts your scores on the Big Five by analyzing your Facebook status updates. [Screenshot: Five Labs' prediction via status updates.] It might be fun to have students use this app to measure their Big Five and then compare those findings to the youarewhatyoulike.com app (which I previously discussed on this blog), which predicts your scores on the Big Five based on what you "Like" on FB. [Screenshot: youarewhatyoulike.com's prediction via "Likes."] As you can see, my "Likes" indicate that I am calm and relaxed, but I am a neurotic status updater (crap...I'm that guy!). By contrasting the two, you could discuss reliability, validity, how such results are affected by social desirability, etc. Furthermore, you could also have your students take the original scale and see how it stacks up against the two FB measures. Note: If you ask your students to do this, they will have to give these apps access to a bunch of their personal informat...
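
Here is a minimal sketch of the comparison exercise (scores invented): correlate the two apps' Big Five profiles, trait by trait, as a rough check of convergent validity:

# Rough convergent validity check between two Big Five measures.
# Trait scores (0-100) are invented for illustration.
from scipy.stats import pearsonr

traits = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]
status_app = [72, 55, 60, 68, 80]  # prediction from status updates
likes_app = [70, 50, 64, 71, 35]   # prediction from "Likes"

r, _ = pearsonr(status_app, likes_app)
print(f"profile correlation r = {r:.2f}")
# Two valid measures of the same construct should roughly agree;
# a big mismatch on one trait (here, neuroticism) is the discussion hook.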

NPR's "Will Afghan polling data help alleviate election fraud?"

This story details the application of American election-polling techniques to Afghanistan's fledgling democracy. Essentially, international groups are attempting to poll Afghans prior to their April 2014 presidential elections in order to combat voter fraud and raise awareness about the election. However, how do researchers go about collecting data in a country where few people have telephones, many people are illiterate, and just about everyone is wary of strangers approaching them and asking sensitive questions about their political opinions? The story also touches on issues of social desirability, as well as the decisions a researcher makes regarding the kinds of response options to use in survey research. I think that this would be a good story to share with a cranky undergraduate research methods class that thinks that collecting data from the undergraduate convenience sample is really, really hard. Less snarkily, this may be useful when teaching multiculturalism or ...

Time's "Can Time predict your politics?" by Jonathan Haidt and Chris Wilson

This scale, created by Haidt and Wilson, predicts your political leanings based upon seemingly unrelated questions. [Screen grab from time.com.] You can use this in a classroom to 1) demonstrate interactive, Likert-type scales, 2) discuss face validity (or the lack thereof), and 3) prompt a psychometrics class discussion of scale building. Finally, 4) the update at the end of the article mentions both the n-size and the correlation coefficient for their reliability study, allowing you to discuss those concepts with students. For more about this research, try yourmorals.org
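
For the scale-building discussion, here is a minimal sketch of one standard psychometric check, Cronbach's alpha for a set of Likert items (the responses below are invented):

# Cronbach's alpha: internal-consistency reliability for a
# multi-item Likert scale. Responses (1-5) are invented.
import numpy as np

# rows = respondents, columns = items on the scale
responses = np.array([
    [4, 5, 4, 4],
    [2, 1, 2, 3],
    [5, 5, 4, 5],
    [3, 2, 3, 2],
    [4, 4, 5, 4],
])

k = responses.shape[1]                         # number of items
item_vars = responses.var(axis=0, ddof=1)      # variance of each item
total_var = responses.sum(axis=1).var(ddof=1)  # variance of total scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")  # ~0.93 for these invented data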

Washington Post's "GAO says there is no evidence that a TSA program to spot terrorists is effective" (Update: 3/25/15)

The Transportation Security Administration implemented SPOT training in order to teach airport security employees how to spot problematic and potentially dangerous individuals via behavioral cues. This intervention has cost the U.S. government over $1 billion. It doesn't seem to work. By discussing this with your class, you can discuss the importance of program evaluations as well as validity and reliability. The actual government-issued report goes into great detail about how the program evaluation data were collected to demonstrate that SPOT isn't working. The findings (especially the table and figure below) do a nice job of demonstrating the lack of reliability and the lack of validity. This whole story also implicitly demonstrates that the federal government is hiring statisticians with strong research methods backgrounds to conduct program evaluations (= jobs for students). Here is a summary of the report from the Washington Post. Here is a short summary and video about the report from ...
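
To make the validity problem vivid for students, here is a minimal sketch of why even a decent-sounding behavioral screen fails at a rare-event task (all rates below are hypothetical, not from the GAO report):

# Why rare-event screening is hard: positive predictive value
# collapses when the base rate is tiny. All numbers are hypothetical.
base_rate = 1 / 100_000   # fraction of travelers who are true threats
sensitivity = 0.90        # P(flagged | threat)
false_alarm = 0.05        # P(flagged | not a threat)

true_pos = base_rate * sensitivity
false_pos = (1 - base_rate) * false_alarm
ppv = true_pos / (true_pos + false_pos)
print(f"P(threat | flagged) = {ppv:.6f}")  # ~0.0002, about 1 in 5,600 flags
# Even a screen that catches 90% of threats produces almost nothing
# but false alarms when threats are this rare.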