Monday, September 18, 2017

Yau's "Divorce and Occupation"

Nathan Yau, writing for Flowing Data, provides a good example of correlation, median, and correlation not equaling causation in his story, "Divorce and Occupation".

Yau looked at the relationship between occupation and divorce in a few ways.

He used a variation on the violin plot to illustrate how each occupation's divorce rate falls around the median divorce rate. Who has the lowest rate? Actuaries. They really do know how to mitigate risk. You could also discuss why median divorce rate is provided instead of mean divorce rate. Again, the actuaries deserve attention, as they probably would throw off the mean.
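If you want to show students why an outlier argues for the median, here is a minimal Python sketch. The divorce rates are made up for illustration (they are NOT Yau's data); the one very low value stands in for an occupation like the actuaries.

```python
from statistics import mean, median

# Made-up divorce rates (percent) for six hypothetical occupations.
# These are NOT Yau's numbers; the last value is a deliberately extreme outlier.
rates = [38.0, 36.0, 35.0, 34.0, 33.0, 4.0]
rates_without_outlier = rates[:-1]

mean_all = mean(rates)
median_all = median(rates)
mean_trimmed = mean(rates_without_outlier)
median_trimmed = median(rates_without_outlier)

# How far does each summary move when the outlier is dropped?
mean_shift = mean_trimmed - mean_all
median_shift = median_trimmed - median_all

print(f"Mean with outlier:   {mean_all:.1f}, without: {mean_trimmed:.1f}")
print(f"Median with outlier: {median_all:.1f}, without: {median_trimmed:.1f}")
```

In this toy data set, dropping the outlier moves the mean far more than it moves the median, which is the point to make in class.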

He also looked at how salary was related to divorce, and this can be used as a good example of a linear relationship: The more money you make, the lower your chances for divorce. And an intuitive exception to that trend? Clergy members.

Both scatter plots, when viewed at the website, are interactive. By cursoring over any dot, you can see the actual x- and y-axis data for that point.

Also, if you are teaching more advanced students, Yau shares some information on how he created these scatter plots at the end of the article.

Finally, talk to your students about the Third Variable Problem and how correlation does not equal causation. What is causing the relationship between income and divorce? Is it just money? Is it the sort of hours that people work? How does IQ figure into divorce? Maybe it has something to do with the fact that people who seek advanced degrees tend to get married later in life.

Monday, September 11, 2017

Teach t-tests via "Waiting to pick your baby's name raises the risk for medical mistakes"

So, I am very pro-science, but I have a soft spot in my heart for medical research that improves medical outcomes without actually requiring medicine, expensive interventions, etc. And after spending a week in the NICU with my youngest, I'm doubly fond of ways of helping the littlest and most vulnerable among us. One example of such an intervention was published in the journal Pediatrics and written up by NPR. In this case, they found that fewer mistakes are made when not-yet-named NICU babies are given more distinct rather than less distinct temporary names. Unnamed babies are an issue in the NICU, as babies can be born very early or under challenging circumstances, and the babies' parents aren't ready to name their kids yet. Traditionally, hospitals would use the naming convention "BabyBoy Hartnett" but several started using "JessicasBoy Hartnett" as part of this intervention. So, distinct first and last names instead of just last names. They measured patient mistakes by counting the number of Retract-and-Reorders, or how often a treatment was listed in a patient’s record, then deleted and assigned to a different patient (due to a mistake being corrected). They found that the number of retract-and-reorders decreased following the naming convention change.

The researchers DID NOT use paired t-tests in their analyses. However, this research presents a good conceptual example of within-subject t-tests. As I often do around this blog, I created fake t-test data that mimicked the findings, with fewer R-and-R's for doubly named babies. The data was created via Richard Lander’s data generator website:

          Before intervention   After intervention
NICU 1    47                    36
NICU 2    45                    26
NICU 3    52                    38
NICU 4    50                    32
NICU 5    46                    42
NICU 6    38                    20
NICU 7    63                    41
NICU 8    40                    27
NICU 9    37                    26
NICU 10   40                    29
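For anyone who wants to check the fake data by hand, here is a minimal Python sketch of a paired (within-subject) t-test using only the standard library. The variable names are my own, and again, this is my made-up data, not the analysis from the Pediatrics article.

```python
from math import sqrt
from statistics import mean, stdev

# Fake Retract-and-Reorder counts for ten NICUs (from the table above).
before = [47, 45, 52, 50, 46, 38, 63, 40, 37, 40]
after = [36, 26, 38, 32, 42, 20, 41, 27, 26, 29]

# A paired t-test works on the difference score for each NICU.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
df = n - 1
mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(n)  # stdev() uses the n - 1 (sample) formula
t_stat = mean_diff / se_diff

print(f"Mean difference = {mean_diff:.1f}, t({df}) = {t_stat:.2f}")
```

With df = 9, the two-tailed critical value at alpha = .05 is 2.262, so students can compare their obtained t against that cutoff.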

Monday, August 28, 2017

The Hedonometer measures the overall happiness of Tweets on Twitter.

It provides a simple, engaging example for Intro Stats since the data is graphed over time, color coded for day of the week, and interactive. I think it could also be a much deeper example for a Research Methods class, as the "About" section of the website reads like a journal article's methods section, in that the Hedonometer creators describe their entire process for rating Tweets.

This is what the basic table looks like. You can drill into the data by picking a year or a day of the week to highlight. You can also use the sliding scale along the bottom to specify a time period.

The website is also kept very, very up to date, so it is also a very topical resource.

Data for white supremacy attack in VA

In the page's "About" section, they address many methodological questions your students might raise about this tool. It is a good example of the process researchers go through when making judgment calls regarding the operationalization of their variables.

In order to determine the happiness of any given word, they had to score the words. Here are their scores, which they provide:

Photo of word happiness ratings

They describe how they rated the words, which gives your students an example of how to use mTurk in research:

Description of how they rated the individual words

They also describe a shortcoming of the lexical ratings: good events that are associated with very, very bad events.

Why bin Laden's death received low happiness ratings

They also describe their exact sample:

Hedonometer sampling

Monday, August 21, 2017

Sonnad and Collins's "10,000 words ranked according to their Trumpiness"

I finally have an example of Spearman's rank correlation to share.

This is a political example, looking at how Twitter language usage differs in US counties based upon the proportion of votes that Trump received.

This example was created by Jack Grieve, a linguist who uses archival Twitter data to study how we speak. Previously, I blogged about his work that analyzed what kind of obscenities are used in different zip codes in the US. He created maps of his findings, and the maps are color coded by the z-score for frequency of each word. So, z-score example.

Southerners really like to say "damn". On Twitter, at least.

But on to the Spearman's example. More recently, he conducted a similar analysis, this time looking for trends in word usage based on the proportion of votes Trump received in each county in the US. NOTE: The screenshots below don't do justice to the interactive graph. You can cursor over any dot to view the word as well as the correlation coefficient. Grieve performed a Spearman's correlation. He ran the correlation by rank ordering 1) the 10,000 most commonly tweeted words and 2) the "level of Trump support in US counties," which was measured as the percentage of the vote for Trump (thanks for replying to my email, Jack!). Positive correlations indicate a positive relationship between Trump support and word usage. See below:

Trump supporting counties are going for the soft swears.

And Clinton leaning counties don't give a f*ck, which may be because they've had one too many beers.

So, there is a lovely, interactive piece that lists words and the correlation coefficient for the relationship between that word and support for Trump. Grieve speculates that this data points to an urban/rural divide in Trump support.
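If you want students to see what Spearman's is doing under the hood, here is a minimal Python sketch: rank both variables, then run a Pearson correlation on the ranks. The county vote shares and word rates are made up for illustration (they are NOT Grieve's data), and the function names are my own.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data for six counties: Trump vote share and how often
# one word is tweeted (per 10,000 words). Made up for illustration.
trump_share = [0.25, 0.40, 0.55, 0.62, 0.71, 0.80]
word_rate = [1.0, 3.0, 2.0, 5.0, 4.0, 9.0]

rho = spearman(trump_share, word_rate)
print(f"Spearman's rho = {rho:.3f}")
```

Because only the ranks matter, the one very large word rate (9.0) counts no more than any other "highest" value would, which is a nice talking point when comparing Spearman's to Pearson's.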

Also of note, the data was collected two years before the election, so no "Bad Hombres", "Snowflakes", "She Persisted", "Winners", etc. showed up in this data, so it might be a snapshot of the differences that led up to the current, rather divided electorate.

Monday, August 14, 2017

The Economist's "Ride-hailing apps may help to curb drunk driving"

I think this is a good first day of class example.

It shows how data can make a powerful argument, that argument can be persuasively illustrated via data visualization, AND, maybe, it is a soft sell of a way to keep your students from drunk driving. It also touches on issues of public health, criminal justice, and health psychology.

This article from The Economist succinctly illustrates the decrease in drunk driving incidents over time using graphs.

This article is based on a working paper by PhD student Jessica Lynn (name twin!) Peck.

Graphs of drunk driving accidents x time
Also, maybe your students could brainstorm third variables that could explain the change. Also, New Yorkers: What's the deal with Staten Island? Did they outlaw Uber? Love drunk driving? 

Monday, August 7, 2017

Kim Kardashian-West, Buzzfeed, and Validity

So, I recently shared a post detailing how to use the Cha-Cha Slide in your Intro Stats class.

Today? Today, I will provide you with an example of how to use Kim Kardashian to explain test validity.

So. Kim Kardashian-West stumbled upon a Buzzfeed quiz that will determine if you are more of a Kim Kardashian-West or more of a Chrissy Teigen. She Tweeted about it, see below.

And she went and took the test, BUT SHE DIDN'T SCORE AS A KIM!! SHE SCORED AS A CHRISSY! See below.

So, this test purports to assess one's Kim Kardashian-West-ness or one's Chrissy Teigen-ness. And it failed to measure what it claimed to measure as Kim didn't score as a Kim. So, not a valid measure. No word on how Chrissy scored.

And if you teach people in their 30s, you could always use this example of the time Garbage's Shirley Manson did not score as Shirley Manson on an online quiz.

Monday, July 31, 2017

Hickey's "The Ultimate Playlist Of Banned Wedding Songs"

I think this blog just peaked. Why? I'm giving you a way to use the Cha-Cha-Slide ("Everybody clap your hands!") as a tool to teach basic descriptive statistics.

Most Intro Stats teachers could use this within the first week of class to describe rank order data, interval data, qualitative data, quantitative data, and the author's choice of percentage frequency data instead of straight frequency.
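To make the frequency vs. percentage frequency distinction concrete, here is a minimal Python sketch. The ban lists are hypothetical (they are NOT Hickey's data), and the variable names are my own.

```python
from collections import Counter

# Hypothetical ban lists from five weddings (NOT Hickey's data).
ban_lists = [
    ["Chicken Dance", "Cha-Cha Slide"],
    ["Chicken Dance"],
    ["Macarena", "Chicken Dance"],
    ["Cha-Cha Slide"],
    [],  # this couple banned nothing
]

n_weddings = len(ban_lists)

# Straight frequency: how many weddings banned each song.
counts = Counter(song for bans in ban_lists for song in bans)

# Percentage frequency: the share of weddings that banned each song.
percentages = {song: 100 * count / n_weddings for song, count in counts.items()}

# Rank order data: songs from most banned to least banned.
ranking = [song for song, _ in counts.most_common()]

for song in ranking:
    print(f"{song}: banned at {counts[song]} weddings ({percentages[song]:.0f}%)")
```

Percentages let readers compare songs without needing to know how many weddings were surveyed, which is one reason an author might prefer them to raw counts.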

Additionally, Hickey, writing for FiveThirtyEight, surveyed two dozen wedding DJs about banned songs at 200 weddings. So, you can chat about research methodology as well.

Finally, as a Pennsylvanian, it makes me so sad that people ban the Chicken Dance! How can you possibly dislike the Chicken Dance enough to ban it? Is this a class thing? 

Monday, July 24, 2017

de Vrieze's "‘Replication grants’ will allow researchers to repeat nine influential studies that still raise questions"

In my stats classes, we talk about the replication crisis. When introducing the topic, I use this reading from NOBA. I think it is also important for my students to think about how science could create an environment where replication is more valued. And the Dutch Organization for Scientific Research has come up with a solution: It is providing grants to nine groups to either 1) replicate famous findings or 2) reanalyze famous findings. This piece from Science details their efforts.

The Dutch Organization for Scientific Research provides more details on the grant recipients, which include several researchers replicating psychology findings:

How to use in class: Again, talk about the replication crisis. Ask your students to generate ways to make replication more valued. Then, give them a bit of faith in psychology/science by sharing this information on how science is on it. From a broader view, this could introduce the idea of grants to your undergraduates or get your graduate students thinking about new avenues for getting their replications funded.

Monday, July 17, 2017

Harris's "Scientists Are Not So Hot At Predicting Which Cancer Studies Will Succeed"

This NPR story is about reproducibility in science that ISN'T psychology, the limitations of expert intuition, and the story is a summary of a recent research article from PLOS Biology (so open science that isn't psychology, too!).

Thrust of the story: Cancer researchers may be having a similar problem to psychologists in terms of replication. I've blogged about this issue before, in particular about concerns with replication in cancer research, possibly due to the variability with which lab rats are housed and fed.

So, this story is about a study in which 200 cancer researchers, post-docs, and graduate students took a look at six pre-registered cancer study replications and guessed which studies would successfully replicate. And the participants systematically overestimated the likelihood of replication. However, researchers with high h-indices were more accurate than the general sample. I wonder if the high h-indices uncover super-experts or super-researchers who have been around the block and are a bit more cynical about the ability of any research finding to replicate.

How to use in a stats class: False positives: The original research didn't replicate (this time, maybe) AND the experts judging replicability were overly optimistic. Also, one might wonder if there are potential cancer treatments that we don't know about because of false negatives.

How to use in a research class: The lack of reproduction may signal evidence of publication bias. Replication is necessary for good science. Experts aren't perfect.

Monday, July 10, 2017

Domonoske's "50 Years Ago, Sugar Industry Quietly Paid Scientists To Point Blame At Fat"

This NPR story discusses research detective work published in JAMA. The JAMA article looked at a very influential NEJM review article that investigated the link between diet and Coronary Heart Disease. Specifically, whether sugar or fat contributes more to CHD. The article, written by Harvard researchers decades ago, pinned CHD on fatty diets. But the researchers took money from Big Sugar (which sounds like...a drag queen or CB handle) and communicated with Big Sugar while writing the review article.

This piece discusses how conflict of interest shaped food research and our beliefs about the causes of CHD for decades. And how conflict of interest and institutional/journal prestige shaped this narrative. It also touches on how industry, namely sugar interests, discounted research that finds a sugar:CHD link while promoting and funding research that finds a fat:CHD link.

How to use in a Research Methods class:
-Conflict of interest. The funding received by the researchers from the sugar lobby was never fully disclosed. The sugar lobby communicated with the authors of the original research while they were writing the review article.
-Article of ill repute was a literature review. Opens up the conversation on how influential review papers are. Especially when the authors are from well-reputed institutions and they are printed in well-reputed journals.
-A good example of cherry picking data. Articles critical of sugar were held to a different standard.
-I am a psychologist. I discuss the replication crisis in psychology, but other fields (here, nutrition and heart disease research) are susceptible to zeitgeist as well.

Monday, July 3, 2017

Chris Wilson's "The Ultimate Harry Potter Quiz: Find Out Which House You Truly Belong In"

Full disclosure: I have no chill when it comes to Harry Potter.

Despite my great bias, I still think this psychometrically created (with help from psychologists and Time Magazine's Chris Wilson!) Hogwarts House Sorter is a great example for scale building, validity, descriptive statistics, electronic consent, etc. for stats and research methods.

How to use in a Research Methods class:

1) The article details how the test drew upon the Big Five inventory. And it talks smack about the Myers-Briggs.

2) The article also uses simple language to give a rough sketch of how they used statistics to pair you with your house. The "standard statistical model" is a regression line, the "affinity for each House is measured independently", etc.

While you are taking the quiz itself, there are some RM/statsy lessons:

3) At the end of the quiz, you are asked to contribute some more information. It is a great example of leading response options as well as implied, electronic consent.

4) The quiz provides descriptive statistics of how well you fit into each House:

5) There is a debriefing:

This isn't the first time I've posted about Chris Wilson's statsy interactive pieces for Time magazine.

Teach Least Squared Error, trends over time, archival data sets via this feature that finds the British equivalent of your first name based on the popularity of your name when you were born versus the same ranked name in England. Bonus: Your students can find out their British name. Mine is Shannon.

Teach percentiles, medians, and I/O's Holland Inventory with this data investigating the relationship between job salary AND Holland personality match for the job. Spoiler alert: This data also provides an example of a non-significant correlation. Bonus: Your students can find out their own Holland Inventory type.

Monday, June 26, 2017

APA's "How to Be A Wise Consumer of Psychological Research"

This is a nice, concise hand out from APA that touches on the main points for evaluating research. In particular, research that has been distilled by science reporters.

It may be a bit light for a traditional research methods class, but I think it would be good for the research methods section of most psychology electives, especially if your students are working through source materials.

The article mostly focuses on evaluating for proper sampling techniques. They also have a good list of questions to ask yourself when evaluating research:

This also has an implicit lesson of introducing psychology undergraduates to the APA website and the type of information shared there (including, but not limited to, this glossary of psychology terms).

Monday, June 19, 2017

Winograd's Personality May Change When You Drink, But Less Than You Think

How much do our personalities change when we're drunk? Not as much as we think. We know this due to the self-sacrificing research participants who went to a lab, filled out some scales, and got drunk with their friends. For science!

Here is the research, as summarized by the first author. Here is the original study.

This example admittedly panders to undergraduates. But I also think it is an example that will stick in their heads. It provides good examples of:

1) Self-report vs. other-report personality data in research.
-Two weeks prior to the drinking portion, participants completed a Big Five personality scale as if they were drunk. So, there is the self-report of Drunk!Participant. And during the drinking session, participants had their Big Five judged by research assistants coding their interactions with friends, allowing a more objective judgment of the Drunk!Participant.

The findings:

-Why do we need self and other reports? What sort of traits are people most likely to lie about? This could also open up a conversation about Lie scales, especially their use in situations when there is pressure to present well, like during job interviews.

-What other sort of other-reports have your students seen used in research? I've seen research that asks teachers to evaluate students, parents to evaluate children, etc. When might an acquaintance be a better source of data than a stranger?

2) Conceptual examples of repeated measure/within subject t-test and paired-participant/between subjects t-test.

-At Time 1, Ps reported their personality under normal circumstances, and what they think of their personalities when drunk. Within-subject t-test. Results: Ps believe that their personalities change substantially when drunk.

-At Time 2, while the participants were drunk, they were observed by research assistants. The research assistants made their best guesses at the Ps' Big Five. Between-subject, matched t-test. Results: P extroversion seems to increase, but raters didn't find any other increases.

3) Example of using the Big Five in research.

Monday, June 5, 2017

Brenner's "These Hilariously Bad Graphs Are More Confusing Than Helpful"

Brenner, writing for Distractify, has compiled a very healthy list of terrible, terrible graphs and charts. How to use in class:
1) Once you know how NOT to do something, you know how to do it.
2) Bonus points for pointing out the flaws in these charts...double bonus points for creating new charts that correct the incorrect charts.

A few of my favorites:

Monday, May 29, 2017

Daniels' "Where Slang Comes From"

I think that language is fascinating. Back when I taught developmental, I always liked to teach how babies learn to talk in sort of the same way all across the world. I like regional differences in American English (for example, swearing and regional colloquialisms). So, I really like this research that investigates the rise and fall of slang in America. And I think it could be used in a statistics class.

How to use in class?

1. Funny list of descriptive statistics.

2. Research methodology for using Google searches to answer a question. A good opening for discussion of archival data, data mining, and creating inclusion criteria for research methodology.

3. Using graphs to illustrate trends across time. This feature is interactive.

4. Further interactive features demonstrating how heat maps can be used to demonstrate state-by-state popularity over time. Here, "dank memes" peaked in April 2016 in Montana.

5. The author eyeballed the data and came up with common origins of slang: Hip-hop music, politics, "the internets" (technology). This reminds me, conceptually, of cluster analysis. Note: NO CLUSTER ANALYSIS was conducted to come up with the three slang origin categories.

Monday, May 22, 2017

Trendacosta's Mathematician Boldly Claims That Redshirts Don't Actually Die the Most on Star Trek

io9 recaps a talk given by mathematician James Grime. He addressed the long running Star Trek joke that the first people to die are the Red Shirts. Using resources that detail the ins and outs of Star Trek, he determined that:

This makes for a good example of absolute vs. relative risk. Sure, more red shirts may die, absolutely, but proportionally? They only make up 10% of the deaths. Also, I think this is a funny example of using archival data in order to understand an actual on-going Star Trek joke.
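Here is a minimal Python sketch of the counts-versus-proportions idea. The crew sizes and death counts are hypothetical (they are NOT Grime's figures), chosen only so that the shirt color with the most deaths is not the one with the highest risk.

```python
# Hypothetical crew sizes and deaths by uniform color (NOT Grime's figures).
crew = {"red": 200, "gold": 50, "blue": 100}
deaths = {"red": 20, "gold": 10, "blue": 5}

# Risk of dying within each group: deaths divided by group size.
death_rate = {color: deaths[color] / crew[color] for color in crew}

# Which color has the most deaths in raw numbers?
most_deaths = max(deaths, key=deaths.get)

# Which color has the highest proportion of its members dying?
highest_risk = max(death_rate, key=death_rate.get)

# Relative risk: how many times riskier gold is than red in this toy data.
relative_risk = death_rate["gold"] / death_rate["red"]

print(f"Most deaths: {most_deaths}; highest risk: {highest_risk}")
print(f"Gold vs. red relative risk: {relative_risk:.1f}")
```

In this made-up crew, red shirts rack up the most deaths simply because there are so many of them, while gold shirts are the riskier uniform to wear.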

For more math/Star Trek links, go to's treatment of the speech.

Monday, May 15, 2017

Pew Research Center's Methods 101 Video Series

Pew Research Center is an excellent source for data to use in statistics and research methods classes. I have blogged about them before (look under the Label pew-pew!) and I'm excited to share that Pew is starting up a series of videos dedicated to research methods. The new series will be called Methods 101.

The first describes sampling techniques in which weighting is used to adjust imperfect samples so as to better mimic the underlying population. I like that this is a short video that focuses on one specific aspect of polling. I hope that they continue this trend of creating very specific videos covering specific topics.
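For a class demo, here is a minimal Python sketch of one simple version of this idea: weighting a sample on a single variable (age group) so it matches known population shares. All of the numbers are made up, and this is not necessarily the procedure Pew uses.

```python
# Hypothetical poll. All numbers are made up for illustration.
# Share of each age group in the population:
population_share = {"18-29": 0.20, "30-64": 0.60, "65+": 0.20}
# Number of respondents from each age group in our (imperfect) sample:
sample_counts = {"18-29": 50, "30-64": 300, "65+": 150}
# Proportion in each group answering "yes":
yes_rate = {"18-29": 0.70, "30-64": 0.50, "65+": 0.30}

n = sum(sample_counts.values())
sample_share = {g: c / n for g, c in sample_counts.items()}

# Weight = population share / sample share.
# Under-represented groups get weights above 1, over-represented below 1.
weights = {g: population_share[g] / sample_share[g] for g in sample_counts}

unweighted = sum(sample_counts[g] * yes_rate[g] for g in sample_counts) / n

weighted_yes = sum(sample_counts[g] * weights[g] * yes_rate[g] for g in sample_counts)
weighted_n = sum(sample_counts[g] * weights[g] for g in sample_counts)
weighted = weighted_yes / weighted_n

print(f"Unweighted yes: {unweighted:.2f}, weighted yes: {weighted:.2f}")
```

Here the sample has too few young respondents and too many older ones, so the unweighted estimate understates "yes" until the weights pull the age mix back in line with the population.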

Looking for more videos? Check out Pew's YouTube Channel. Also, I have a video tag for this blog.

Monday, May 8, 2017

Daniels' "Most timeless songs of all time"

This article, written by Matt Daniels for The Pudding, allows you to play around with a whole bunch of Spotify user data in order to generate visualizations of song popularity over time. You can generate custom visualizations using the very interactive sections on this website. For instance, there is a special visualization that allows you to finally quantify the Biggie/Tupac Rivalry.

So, data and pop culture are my two favorite things. I could play with these different interactive pieces all day long. But there are also some specific ways you could use this in class.

1) Generate unique descriptive data for different musicians and then ask your students to create visualizations using the software of your choosing. Below, I've queried Dixie Chicks play data. Students could enter their own favorite artist. Note: The data only runs through 2005.

2) Sampling errors: Here is a description of the methodology used for this data:

Is this representative of all data? What does he mean by "normalize the data" as a way to correct the data? Where could we collect data so as to have a more representative sample? Would Sirius skew older? What about iTunes?

3) Using data mining/archival data to generate insights into research questions.

Here, the question explored in this article is, "What is the difference between a flash in the pan song versus a song for the ages?".

Here, data from 2013 hits has been tracked. And it found that the post-hit plateau is a good indicator of music that will have longer staying power. Here, even though Daft Punk's Get Lucky peaked much higher than OneRepublic's Counting Stars, Counting Stars has a higher plateau. Also, note that with this interactive piece, students could select any number of songs to compare.

Monday, May 1, 2017

"Student life summarized using graphs" video

I found this video at the Student Problems Page on Facebook. I don't know who to attribute it to, but it was probably a smart, sarcastic Intro Stats student.

Monday, April 24, 2017

NYT's "You Draw It" series

As I've discussed in this space before, I think that it is just as important to show our students how to use statistics in real life as it is to show our students how to conduct an ANOVA.

The "You Draw It" series from the New York Times provides an interactive, personalized example of using data to prove a point and challenge assumptions. Essentially, this series asks you to predict data trends for various social issues. Then it shows you how the data actually looks. So far, there are three of these features: 1) one that challenges assumptions about Obama's performance as president, 2) one that illustrates the impact of SES on college attendance, and 3) one that illustrates just how bad the opioid crisis has become in our country.

Obama Legacy Data

This "You Draw It" asks you to predict Obama's performance on a number of measures of success. Below, the dotted yellow line represents my estimate of national debt under Obama. The blue line shows true national debt under Obama. Note: With this tool, you trace your trend line on the graph, press a button, and then the actual data pops up, as well as discussion about the actual data.

We can use this data to see how political affiliation influences assumptions about the Obama presidency. This one can be used both ways: Right-leaning users may assume the worst while left-leaning users assume the best.

How Family Income Affects Children's College Chances

This example uses data to touch on a social justice issue: Whether or not a college education is really accessible to everyone. After you enter your estimate and see the real data, the website returns normative data about performance on the task and how you compare to other users. Below, the dotted line represents the actual data, and my guess was the solid line.

I think this would be useful in a class on poverty and as an example of a linear relationship.

Drug Overdose Epidemic

This example would be good for a clinical psychology, addiction, criminal justice, or public health class. It asks the user to guess the number of deaths in the US due to car accidents, guns, and HIV. Finally, it asks you to estimate deaths due to drug overdoses, which have skyrocketed in the last 20 years (see below).

Then it contrasts drug overdose deaths with car accidents, guns, and HIV. This example may also be useful for social psychology, as it hints at the availability heuristic.

How to use in class:
1) Non-statisticians using statistics to tell a story.
2) Using clever visualization to tell a story.
3) The interactive piece here really forces you to connect to the data and be proven right or wrong.

Monday, April 17, 2017

Sense about Science USA: Statistics training for journalists

In my Honors Statistics class, we have days devoted to discussing thorny issues surrounding statistics. One of these days is dedicated to the disconnect between science and science reporting in popular media.

I have blogged about this issue before and use many of these blog posts to guide this discussion. This video by John Oliver is hilarious and touches on p-hacking in addition to more obvious problems in science reporting. This story from NPR demonstrates what happens when a university's PR department does a poor job of interpreting research results. The Chronicle covered this issue, using the example of mis-shared research claiming that smelling farts can cure cancer (a student favorite). And this piece describes a hoax that one "researcher" pulled in order to demonstrate how quickly the media will pick up and disseminate bad-but-pleasing research to the masses.

When my students and I discuss this, we usually try to brainstorm ways to fix this problem. Proposed solutions: Public shaming of bad journalists, better editing of news stories before they are published, a prestigious award system for accurate science writing. And another idea my students usually arrive upon? Better training for journalists.

So, you can imagine how pleased I was to discover that such classes already exist via Sense about Science USA.

Their mission:

They support this mission in a few different ways. They advocate for registering all medical trials conducted on humans. They are training scientists to more effectively communicate their findings to the public. And, apropos of this blog, they are also training journalists to better understand statistics AND offer one-on-one consulting to journalists trying to understand data.

Here is their description of why it is important to better train journalists.

How to use in class:

1) Instead of just showing students the problems associated with poor science writing, let's show them a possible solution as well.
2) Statistics isn't just for statisticians. Statistics is for anyone who wants to better understand policy issues, emerging research, and evidence-based practices in their field.
3) Show your students some examples of poor science writing. Have them develop a brief presentation that would address the most common statistical mistakes made by science writers.

Monday, April 10, 2017

Reddit's data_irl subreddit

You guys, there is a new subreddit just for sharing silly stats memes. It is called r/data_irl/.

The origin story is pretty amusing.

I have blogged about the subreddit r/dataisbeautiful previously. The point of this sub is to share useful and interesting data visualizations. The sub has a hard and fast rule about only posting original content or well-cited, serious content. It is a great sub.

But it leaves something to be desired. That something is my deep desire to see stats jokes and memes.

On April Fool's Day this year, they got rid of their strict posting rules for a day and the dataisbeautiful crowd provided lots of hilarious stats jokes, like these two I posted on Twitter:

The response was so strong, because there are so many people who love stats memes, that a new sub was started, data_irl, JUST TO SHARE SILLY STATS GRAPHICS. It feels like coming home to my people.

Monday, April 3, 2017

Day's Edge Production's "The Snow Guardian"

A pretty video featuring Billy Barr, a gentleman who has been recording weather data in his corner of Gothic, Colorado for the last 40 years.

Billy Barr
This brief video highlights his work. And his data provides evidence of climate change. I like this video because it shows how ANYONE can be a statistician, as long as...

They use consistent data collection tools...

They are fastidious in their data entry techniques...

They are passionate about their research. Who wouldn't be passionate about Colorado?