Posts

Showing posts with the label content analysis

Smart's "The differences in how CNN MSNBC & FOX cover the news"

https://pudding.cool/2018/01/chyrons/ This example doesn't demonstrate a specific statistical test. Instead, it demonstrates how data can be used to answer a hotly contested question: Are certain media outlets biased? How can we answer this? Charlie Smart, working for The Pudding, addressed this question via content analysis. Here is how he did it: And here are some of their findings: Yes, Fox News was talking about the Clintons a lot. Meanwhile, over at MSNBC, they discussed the investigation into Russia and the 2016 elections more frequently. While kneeling during the anthem was featured on all networks, it was featured most frequently on Fox. And context matters. What words are associated with "dossier"? How do the different networks contextualize President Trump's tweets? Another reason I like this example: It points out the trends for the three big networks. So, we aren't a bunch of Marxist professors ragging on FOX, and we ar...
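The core of Smart's analysis is counting how often words appear in each network's chyrons and comparing the counts across networks. Here's a minimal sketch of that word-frequency comparison; the chyron text below is invented stand-in data, not The Pudding's actual corpus.

```python
from collections import Counter

# Hypothetical chyron text per network -- illustrative only, not real data.
chyrons = {
    "FOX": "clinton emails clinton foundation russia probe anthem protest",
    "MSNBC": "russia investigation russia probe trump tweets anthem",
    "CNN": "trump tweets russia probe anthem protest dossier",
}

def word_counts(text):
    """Count how often each word appears in a network's chyron text."""
    return Counter(text.split())

counts = {network: word_counts(text) for network, text in chyrons.items()}

# Which network mentioned "clinton" most often?
top = max(counts, key=lambda network: counts[network]["clinton"])
print(top)  # FOX -- "clinton" appears twice there and nowhere else
```

In a research methods class, students could debate what should count as a mention (exact word? stemmed forms? names of family members?), which is exactly the kind of coding decision content analysis forces you to make explicit.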

Hedonometer.org

The Hedonometer measures the overall happiness of Tweets on Twitter. It provides a simple, engaging example for Intro Stats since the data is graphed over time, color-coded for the day of the week, and interactive. I think it could also be a much deeper example for a Research Methods class, as the "About" section of the website reads like a journal article methods section, inasmuch as the Hedonometer creators describe their entire process for rating Tweets. This is what the basic table looks like. You can drill into the data by picking a year or a day of the week to highlight. You can also use the sliding scale along the bottom to specify a time period. The website is also kept very, very up to date, so it is a very topical resource. Data for white supremacy attack in VA In the page's "About" section, they address many methodological questions your students might raise about this tool. It is a good example for the process researchers go ...
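The Hedonometer's basic move is to score a day's tweets by averaging crowd-sourced happiness ratings of the individual words. Here's a minimal sketch of that idea; the ratings below are invented stand-ins (the real instrument uses crowd-rated scores for roughly 10,000 common words).

```python
# Invented per-word happiness ratings on a 1-9 scale -- stand-ins for
# the Hedonometer's crowd-sourced word list.
happiness = {"happy": 8.3, "love": 8.0, "laugh": 7.9, "sad": 2.4, "hate": 2.0}

def average_happiness(text):
    """Average the happiness ratings of rated words in the text.

    Unrated words are skipped, mirroring the Hedonometer's use of a
    fixed word list. Returns None if no words are rated.
    """
    scores = [happiness[w] for w in text.lower().split() if w in happiness]
    return sum(scores) / len(scores) if scores else None

print(average_happiness("I love to laugh"))  # (8.0 + 7.9) / 2
```

A nice methods discussion: what happens to sarcasm, negation ("not happy"), and words outside the rated list? The "About" section addresses several of these questions directly.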

Healey's "Study finds a disputed Shakespeare play bears the master's mark"

This story describes how psychologists used content analysis to provide evidence that Shakespeare indeed authored the play Double Falsehood. The play in question has been the subject of literary dispute for hundreds of years. It was originally published by Lewis Theobold in 1727. Theobold claimed it was based on unpublished works by Shakespeare, and literary scholars have been debating this claim ever since. Enter two psychology professors, Boyd and Pennebaker. They decided to tackle this debate via statistics. They conducted a content analysis of Double Falsehood as well as of confirmed works by Shakespeare. What they tested for: "Under the supervision of University of Texas psychology professors Ryan L. Boyd and James W. Pennebaker, machines churned through 54 plays -- 33 by Shakespeare, nine by Fletcher and 12 by Theobold -- and tirelessly computed each play's average sentence-length, quantified the complexity and psychological valence of its language, and sussed out the ...
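Features like average sentence length and function-word rates are the bread and butter of this kind of stylometric comparison. Here's a toy sketch of computing two such features; the function-word list and example sentence are illustrative, not Boyd and Pennebaker's actual feature set.

```python
import re

# A small illustrative function-word list -- real stylometry uses
# much longer lists of such "content-free" words.
FUNCTION_WORDS = {"the", "and", "of", "to", "a", "in", "that"}

def style_features(text):
    """Return (average sentence length in words, function-word rate)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    avg_len = len(words) / len(sentences)
    func_rate = sum(w in FUNCTION_WORDS for w in words) / len(words)
    return avg_len, func_rate

avg_len, func_rate = style_features(
    "The play is old. The author of the play is disputed."
)
print(avg_len, func_rate)  # 5.5 words/sentence; 4 of 11 words are function words
```

Because authors use function words largely unconsciously, these rates tend to be stable within an author, which is what makes them useful for attribution questions like the Double Falsehood dispute.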

Matt Daniel's "The Largest Vocabulary in Hip Hop"

a) The addition of this post means that I now have TWO Snoop Dogg blog labels for this blog. b) Daniels' graph allows students to see archival data (and the research decisions used when deciding how to analyze the archival data, as well as content analysis) in order to determine which rapper has the largest vocabulary. Here is Matthew Daniels' interactive chart detailing the vocabularies of numerous prominent rappers. Daniels sampled each musician's first 35,000 lyrics for the number of unique words present. He went with 35,000 in order to compare more established artists to more recent artists who have published fewer songs. (The appropriateness of this decision could be a source of debate in a research methods class.) Additionally, derivatives of the same word are counted uniquely (pimps, pimp, pimping, and pimpin count as four words). This decision was guided, from what I can gather, by the type of content analysis performed. Property of Matthew Daniels...note: The ori...
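The measurement itself is simple: truncate each artist's lyrics to the first 35,000 tokens, then count distinct tokens. Here's a minimal sketch, including Daniels' decision to count derivatives separately; the sample lyrics are obviously placeholders.

```python
def vocabulary_size(lyrics, limit=35_000):
    """Count unique tokens among the first `limit` words of the lyrics.

    Derivatives are NOT collapsed: "pimp" and "pimping" count as two
    distinct words, mirroring Daniels' coding decision.
    """
    tokens = lyrics.lower().split()[:limit]
    return len(set(tokens))

sample = "pimps pimp pimping pimpin pimp"
print(vocabulary_size(sample))  # 4 -- each derivative counts once
```

A methods-class debate could run both choices: stemming the tokens first (so derivatives collapse) would shrink everyone's count, and whether that changes the rankings is an empirical question students could test.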

The Atlantic's "Congratulations, Ohio! You Are the Sweariest State in the Union"

While this isn't hypothesis-driven research, the data was collected to see which states are the sweariest. The data collection itself is interesting and a good, teachable example. First, the article describes previous research that looked at swearing by state (typically using publicly available data via Twitter or Facebook). Then, they describe the data collection used for the current research: " A new map, though, takes a more complicated approach. Instead of using text, it uses data gathered from ... phone calls. You know how, when you call a customer service rep for your ISP or your bank or what have you, you're informed that your call will be recorded?  Marchex Institute , the data and research arm of the ad firm Marchex,  got ahold of the data that resulted from some recordings , examining more than 600,000 phone calls from the past 12 months—calls placed by consumers to businesses across 30 different industries. It then used call mining technology to isola...
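Once the calls are transcribed, ranking states reduces to computing a per-state swear rate. Here's a minimal sketch of that aggregation; the transcripts and swear list are invented (Marchex's call-mining pipeline is proprietary), and a real analysis would need to normalize for call volume per state.

```python
from collections import defaultdict

# Invented mini swear list and call transcripts -- illustrative only.
SWEARS = {"damn", "hell"}

calls = [
    ("OH", "damn this damn bill"),
    ("OH", "what the hell"),
    ("WA", "thank you so much"),
]

def swear_rates(calls):
    """Fraction of transcript words that are swears, per state."""
    words = defaultdict(int)
    swears = defaultdict(int)
    for state, transcript in calls:
        for w in transcript.lower().split():
            words[state] += 1
            swears[state] += w in SWEARS
    return {state: swears[state] / words[state] for state in words}

rates = swear_rates(calls)
print(max(rates, key=rates.get))  # OH
```

Students could be asked what a rate per word hides compared to, say, the share of *calls* containing at least one swear, which is closer to how such findings are usually reported.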

Burr Settles's "On “Geek” Versus “Nerd”"

Settles decided to investigate the difference between being a nerd and being a geek via a pointwise mutual information (PMI) analysis (using archival data from Twitter). Specifically, he measured the association/closeness between various hashtag descriptors (see below) and the words nerd and geek. Settles provides a nice description of his data collection and analysis on his blog. A good example of archival data use as well as PMI.
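PMI measures how much more often two words co-occur than chance would predict: PMI(x, y) = log( p(x, y) / (p(x) p(y)) ). A positive value means the pair shows up together more than independence would suggest. Here's a minimal sketch over an invented tweet corpus (not Settles's data):

```python
import math

# Invented mini "tweet" corpus: each tweet is a set of tokens.
tweets = [
    {"#geek", "#computers"},
    {"#geek", "#computers"},
    {"#nerd", "#books"},
    {"#nerd", "#computers"},
]

def pmi(x, y, docs):
    """Pointwise mutual information: log(p(x, y) / (p(x) * p(y)))."""
    n = len(docs)
    px = sum(x in d for d in docs) / n
    py = sum(y in d for d in docs) / n
    pxy = sum(x in d and y in d for d in docs) / n
    return math.log(pxy / (px * py)) if pxy else float("-inf")

# In this toy corpus, "#computers" leans geek rather than nerd.
print(pmi("#geek", "#computers", tweets) > pmi("#nerd", "#computers", tweets))
```

Settles plotted exactly this kind of quantity for many descriptor words at once, with geek-PMI on one axis and nerd-PMI on the other, so each word's position shows which identity it leans toward.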

Lord of the Rings Project's Statistics

Hey, nerds. Some big, big nerds generated a bunch of statistical graphs and analyses using content analysis data gleaned from Tolkien's novels. Teach your students about nerdy, nerdy correlations: Content analysis for positive and negative affect:
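Correlating two content-analysis measures, such as counts of positive- and negative-affect words per chapter, is a natural classroom exercise to pair with these graphs. Here's a minimal Pearson correlation sketch over invented per-chapter counts (not the Lord of the Rings Project's actual data):

```python
import math

# Invented positive/negative affect word counts for five "chapters".
positive = [12, 8, 15, 6, 10]
negative = [3, 9, 2, 11, 5]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(positive, negative)
print(round(r, 3))  # strongly negative in this made-up data
```

Students can then argue about what a negative correlation between affect counts would (and wouldn't) tell you about the emotional arc of a novel.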