
Posts

Priceonomics' Hipster Music Index

This tongue-in-cheek regression analysis found a way to predict the "Hipster Music Index" of a given artist by plotting the number of Facebook shares of that artist's Pitchfork magazine review on the y-axis and the Pitchfork review score on the x-axis. If an artist falls above the linear regression line, they aren't "hipster"; if they fall below the line, they are. For example, Kanye West is a Pitchfork darling but also widely shared on FB, thus demonstrating too much popular appeal to be a hipster darling (as opposed to Sun Kil Moon, who is beloved by Pitchfork but not overly shared on FB). As instructors, we typically talk about the regression line as an equation for prediction, but Priceonomics uses the line in a slightly different way in order to make predictions. Also, if you go to the source article, there are tables displaying the difference between the predicted Y-value (FB likes) for a given artist versus the actual Y-value, which coul...
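The residual logic above can be sketched in a few lines. The review scores, share counts, and labels below are invented stand-ins, not Priceonomics' actual data; the point is just how "above/below the line" falls out of the sign of the residual (actual minus predicted).

```python
# Sketch of the Priceonomics approach with made-up numbers: fit a line
# predicting Facebook shares from Pitchfork review score, then classify
# each artist by the sign of the residual (actual minus predicted shares).
from statistics import mean

scores = [6.0, 7.0, 8.0, 8.5, 9.0]    # hypothetical Pitchfork review scores (x)
shares = [200, 400, 600, 1500, 700]   # hypothetical Facebook shares (y)

# Ordinary least-squares slope and intercept, computed by hand
x_bar, y_bar = mean(scores), mean(shares)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(scores, shares))
         / sum((x - x_bar) ** 2 for x in scores))
intercept = y_bar - slope * x_bar

for x, y in zip(scores, shares):
    predicted = intercept + slope * x
    residual = y - predicted
    # Above the line (positive residual) = broad appeal; below = "hipster"
    label = "not hipster" if residual > 0 else "hipster"
    print(f"score={x}, shares={y}, predicted={predicted:.0f}, {label}")
```

The tables in the source article are essentially the `residual` column from this loop.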

Hey, girl...(updated 6/25/14)

Updated 6/25/14: Giving credit where credit is due: http://biostatisticsryangoslingreturns.tumblr.com/ Silly, yes. But if your students can explain why the memes are funny, it does demonstrate statistical knowledge.

Jess Hartnett's presentation at the 2014 APS Teaching Institute

Hi! Here is my presentation from APS. I am posting it so that attendees and everyone else can have access to the links and examples I used. If you weren't there for the presentation, a warning: it is text-light, so there isn't much of a narrative to follow, but there are plenty of links and ideas and some soon-to-be-published research ideas to explore. Shoot me an email (hartnett004@gannon.edu) if you have any questions. ALSO: In the talk I reference the U.S. Supreme Court case Hall v. Florida (I also did a blog entry about this case). Update: The court decided in favor of Hall/seemed to understand standard error/made it a bit harder to carry out the death penalty, as discussed here by Slate. Woot woot!

Marketing towards children: Ethics and research

Slate's The Littlest Tasters. More research methods than statistics, this article describes the difficulty of determining taste preferences in wee humans who don't speak well, if at all. The goods for teaching: the article mentions the FACE scale. The research methods described go beyond marketing research, and this could be useful in a Developmental class to describe approaches used in data collection for children (like asking parents to rate their children's reactions to foods). I've used this as a discussion board prompt when discussing research ethics, both for simply conducting research with children as well as the ethics of marketing (not-so-healthy) foods towards children. Aside: The article also describes why kids like Lunchables, which has always been a mystery to me. Apparently, kids are picky about texture and flavor, but they haven't developed a preference for certain foods to be hot or cold. The Huffington Post's "You'll Never Look at ...

Tyler Vigen's Spurious Correlations

Tyler Vigen has created a long list of easy-to-paste-into-a-PowerPoint graphs that illustrate that correlation does not equal causation. For instance, while per capita consumption of cheese and the number of people who die by becoming tangled in their bed sheets may have a strong relationship (r = 0.947091), no one is saying that cheese consumption leads to bed sheet-related death. (Although you could pose the Third Variable question to your students for some of these relationships.) Property of Tyler Vigen, http://i.imgur.com/OfQYQW8.png Vigen has also provided a menu of frequently used variables (deaths by tripping, sunlight by state) to help you look for specific examples. This portion is interactive, as you and your students can generate your own graphs. Below, I generated a graph of marriage rates in Pennsylvania and consumption of high fructose corn syrup. Generated at http://www.tylervigen.com/
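If you want students to see where an r like 0.947 comes from, Pearson's correlation is easy to compute by hand. The two series below are invented stand-ins, not Vigen's actual cheese and bed-sheet figures; the takeaway is that a large r, by itself, says nothing about causation.

```python
# A minimal Pearson-r calculation in the spirit of Vigen's site,
# using hypothetical (made-up) trending series.
from statistics import mean, pstdev

cheese = [29.8, 30.1, 30.5, 30.6, 31.3, 31.7, 32.6, 33.1]  # hypothetical lbs per capita
deaths = [327, 456, 509, 497, 596, 573, 661, 741]          # hypothetical counts

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of SDs."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = mean((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

r = pearson_r(cheese, deaths)
print(f"r = {r:.3f}")  # strong, and still not causal
```

Two series that both trend upward over time will often correlate strongly, which is exactly the Third Variable (here: the passage of time) discussion prompt.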

Matt Daniel's "The Largest Vocabulary in Hip Hop"

a) The addition of this post means that I now have TWO Snoop Dogg blog labels for this blog. b) Daniels' graph allows students to see archival data (and the research decisions made when deciding how to analyze the archival data, as well as content analysis) used to determine which rapper has the largest vocabulary. Here is Matthew Daniels' interactive chart detailing the vocabularies of numerous prominent rappers. Daniels sampled each musician's first 35,000 lyrics for the number of unique words present. He went with 35,000 in order to compare more established artists to more recent artists who have published fewer songs. (The appropriateness of this decision could be a source of debate in a research methods class.) Additionally, derivatives of the same word are counted uniquely (pimps, pimp, pimping, and pimpin count as four words). This decision was guided, from what I can gather, by the type of content analysis performed. Property of Matthew Daniels...note: The ori...
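Daniels' counting rule (take the first N tokens, count every distinct string separately) can be mimicked in a few lines. The tokenizer and sample lyric below are illustrative stand-ins, not his actual corpus or code.

```python
# Toy version of the "unique words in the first N tokens" rule:
# derivatives like "pimp" and "pimpin" count as separate words.
import re

def unique_word_count(lyrics: str, n_tokens: int) -> int:
    """Count distinct token strings among the first n_tokens words."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())[:n_tokens]
    return len(set(tokens))

sample = "pimps pimp pimping pimpin pimp"
print(unique_word_count(sample, 35_000))  # 4 distinct forms in this snippet
```

Swapping in a stemmer here (so all four forms collapse to one word) is a concrete way to debate how the content-analysis decision changes the rankings.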

Shameless self-promotion 3

If you are going to the Association for Psychological Science annual convention in San Francisco later this month AND you are attending the Teaching Institute, I will be giving a presentation on Teaching Undergraduates to See Statistics. The talk will feature tips for engaging students via humor and current events AND share some unpublished data about using discussion boards in statistics classes, as well as an activity that introduces students to the growing trend of Big Data. Hope to see some of you there!

Chew and Dillon's "Statistics Anxiety Update: Refining the Construct and Recommendations for a New Research Agenda"

Here are two articles, one from The Observer and one from Perspectives on Psychological Science. The PPS article, by Chew and Dillon, is a call for more research studying statistics anxiety in the classroom. Chew and Dillon provide a thorough review of statistics anxiety research, with a focus on antecedents of anxiety as well as interventions (The Observer article is a quick summary of those interventions) and directions for further research. I think Chew and Dillon make a good case for why we should care about statistics anxiety as statistics instructors. As a psychologist who teaches statistics, I find that many of my students are not in math-related majors but can still learn to think like a statistician, in order to improve their critical thinking skills and prepare them for a data/analytics-driven world after graduation. However, their free-standing anxiety related to simply being in a statistics class is a big barrier to this, and I welcome their suggestions regarding the re...

io9's "The Controversial Doctor Who Pioneered the Idea of 'Informed Consent'"

This story describes a 1966 journal article that argues that signing an informed consent form isn't the same as truly giving informed consent. I think this is a good example for the ethics section of a research methods class, as it demonstrates some deeply unethical situations in which participants weren't able to give informed consent (prisoners, non-English speakers, etc.). Indeed, the context within which the informed consent is provided is very important. It also provides a historical context regarding the creation of Institutional Review Boards. The original 1966 article is here.

SPSS Teaching Memes

When I look at the analytics data for my blog, I notice a lot of people come here after Googling "stats memes" or "math memes" or "statistics humor". Being a data-driven sort of human, I have posted my collection of memes inspired by teaching Introduction to Statistics using SPSS. They do reflect common mistakes/stumbling blocks that I see semester after semester. I think they draw student attention towards commonly-made mistakes in a way that is not threatening. And it puts me one step closer to my ultimate goal of teaching statistics using nothing but memes and animated .GIFs. Make your own via http://memegenerator.net/. If they are hilarious and statsy, please consider sharing them with me. UPDATE: 11/25/16

Jon Mueller's Correlation or Causation website

If you teach social psychology, you are probably familiar with Dr. Jon Mueller's Resources for the Teaching of Social Psychology website .  You may not be as familiar with Mueller's Correlation or Causation website, which keeps a running list of news stories that summarize research findings and either treat correlation appropriately or suggest/imply/state a causal relationship between correlational variables. The news stories run the gamut from research about human development to political psychology to research on cognitive ability. When I've used this website in the past, I have allowed my students to pick a story of interest and discuss whether or not the journalist in question implied correlation or causation. Mueller also provides several ideas (both from him and from other professors) on how to use his list of news stories in the classroom.

Kevin Wu's Graph TV

UPDATE! This website is not currently available. Kevin Wu's Graph TV uses individual episode ratings (archival data via IMDB) of TV shows, graphs each episode over the course of a series via scatter plot, and generates a regression line. This demonstrates fun with archival data as well as regression lines and scatter plots. You could also discuss sampling, in that these ratings were provided by IMDB users and, presumably, big fans of the shows (and whether or not this constitutes representative sampling). The saddest little purple dot is the episode Black Market. Truth!

mathisfun.com's Standard Normal Distribution Table

Now, I am immediately suspicious of a website entitled "MathIsFun" (I prefer the soft sell...like promising teaching aids for statistics that are, say, not awful and boring). That being said, this app from mathisfun.com may be an alternative to going cross-eyed while reading z-tables in order to better understand the normal distribution. With this little Flash app, you can select z-scores and immediately view the corresponding portion of the normal curve (either from z = 0 to your z, up to a selected z, or to the right of that z). Above, I've selected z = 1.96, and the outlying 2.5% of the curve is highlighted. Now, this wouldn't work for a paper-and-pencil exam (so you would probably still need to teach students to read the paper table), but I think this is useful in that it allows students to IMMEDIATELY see how z-scores and portions of the curve co-vary.
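For instructors who want the same lookup without a browser, the three areas the app shades can be computed from the standard normal CDF. This is a generic sketch using only Python's math module, not anything from the mathisfun.com site itself.

```python
# Standard normal curve areas without a z-table:
# Phi(z), the cumulative area up to z, via the error function.
from math import erf, sqrt

def phi(z: float) -> float:
    """Cumulative area under the standard normal curve up to z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z = 1.96
print(f"area up to z:      {phi(z):.4f}")        # ~0.9750
print(f"area right of z:   {1 - phi(z):.4f}")    # ~0.0250
print(f"area from 0 to z:  {phi(z) - 0.5:.4f}")  # ~0.4750
```

The second line reproduces the highlighted 2.5% tail for z = 1.96 described above.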

Washington Post's "What your beer says about your politics"

Robinson & Feltus, 2014 There appears to be a connection between political affiliation, likelihood to vote, and preferred adult beverage. If you lean right and drink Cabernet Sauvignon, you are more likely to vote than someone who enjoys "any malt liquor" and leans left. This Washington Post story summarizes data analysis performed by the National Media Research Planning and Placement. NMRPP got their data from the market research firm Scarborough. There is also a video embedded in the Washington Post story that summarizes the main findings. I think this is a good example of illustrating data as well as data mining pre-existing data sets for interesting trends. And beer.

Hall v. Florida: IQ, the death penalty, and margin of error (edited 5/27/14)

Here is Think Progress' story about a U.S. Supreme Court case that hinges on statistics. The case centers on death row inmate Freddie Lee Hall. He was sentenced to death in Florida for the murder of Karol Hurst in 1978. This isn't in dispute. What is in dispute is whether or not Hall qualifies as mentally retarded and, thus, should be exempt from the death penalty per Atkins v. Virginia. So, this is an example relevant to any number of psychology classes (developmental, ethics, psychology and the law, etc.). It is relevant to a statistics class because the main thrust of the argument has to do with the margin of error associated with the IQ test that designated Hall as having an IQ of 71. In order to qualify as mentally retarded in Florida, an individual has to have an IQ of 70 or lower. So, at first blush, Hall is out of luck. Until his lawyers bring up the fact that the margin of error on this test is +/- 5 points. This is a good example of confidence intervals/marg...
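The arithmetic at the heart of the appeal fits in a few lines. This uses only the +/- 5 figure quoted in the story, not the test publisher's actual standard-error calculation:

```python
# The lawyers' point: an observed score of 71 with a +/- 5 margin of
# error gives an interval that contains Florida's cutoff of 70, so the
# test cannot rule out a true score at or below the cutoff.
observed_iq = 71
margin_of_error = 5

low, high = observed_iq - margin_of_error, observed_iq + margin_of_error
print(f"plausible range: {low} to {high}")           # 66 to 76
print("cutoff of 70 inside range:", low <= 70 <= high)  # True
```

This is the same "a point estimate is not the true score" lesson that interval estimation teaches in class, with a life on the line.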

UPDATE: The Knot's Infographic: The National Average Cost of a Wedding is $28,427

UPDATE: The average cost of a wedding is now $33,391, as of 2017. Here is the most up-to-date infographic: Otherwise, my main points from the original version of this survey are still the same: 1) To-be-weds surveyed for this data were users of a website used to plan/discuss/squee about pending nuptials. So, this isn't a random survey. 2) If you look at the fine print for the survey, the average cost points quoted come from people who paid for a given service. So, if you didn't have a reception band ($0 spent), your data wasn't used to create the average. Which probably leads to inflation of all of these numbers. _________________________________________ Original Post: This infographic describes the costs associated with an "average" wedding. It is a good example of non-representative sampling and bending the truth via lies of omission. For the social psychologists in the crowd, this may also provide a good example of persuasion by establishing ...
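The inflation described in point 2 is easy to demonstrate with invented numbers: averaging only over the couples who spent something pushes the "average cost" well above the average over everyone.

```python
# Why "average spent among those who paid" runs high: couples who spent
# $0 on a category are dropped before averaging. Numbers are made up.
from statistics import mean

band_costs = [1200, 0, 800, 0, 1500]  # hypothetical per-couple spending on a band

mean_all = mean(band_costs)                         # includes the $0 couples
mean_payers = mean(c for c in band_costs if c > 0)  # the fine-print style of average

print(f"mean over everyone:    ${mean_all:,.2f}")    # $700.00
print(f"mean over payers only: ${mean_payers:,.2f}")  # $1,166.67
```

Same data, two defensible "averages," and a tidy lies-of-omission discussion prompt.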

Nature's "Policy: Twenty tips for interpreting scientific claims" by William J. Sutherland, David Spiegelhalter, & Mark Burgman

This very accessible summary lists the ways people fib with, misrepresent, and overextend data findings. It was written as an attempt to give non-research folk (in particular, lawmakers) a cheat sheet of things to consider before embracing/rejecting research-driven policy and laws. A sound list, covering plenty of statsy topics (p-values, the importance of replication), but what I really like is that the article doesn't criticize the researchers as the source of the problem. It places the onus on each person to properly interpret research findings. This list also emphasizes the importance of data-driven change.

Baby Name Wizard's NameVoyager

UPDATE (12/8/23): YOOOOOOOOO if you got to this post, I suggest that you check out this update for an up-to-date link to this tool. Here is the Baby Name Wizard's NameVoyager, which provides illustrations of trends in baby names, using data from the 1880s to the present. It is a good tool for demonstrating why graphs can be more engaging than tables when presenting data. When I use this in class, I compare the NameVoyager data display to more traditionally presented data from the Social Security Administration. Additionally, I teach in a computer lab, so my students were able to search for their own names, which makes the example more self-relevant. Yup. I am one of many, many Jessicas that are around my age.

NPR's "Will Afghan polling data help alleviate election fraud?"

This story details the application of American election polling techniques to Afghanistan's fledgling democracy. Essentially, international groups are attempting to poll Afghans prior to their April 2014 presidential election so as to combat voter fraud and raise awareness about the election. However, how do researchers go about collecting data in a country where few people have telephones, many people are illiterate, and just about everyone is wary of strangers approaching them and asking sensitive questions about their political opinions? The story also touches on issues of social desirability as well as the decisions a researcher makes regarding the kinds of response options to use in survey research. I think that this would be a good story to share with a cranky undergraduate research methods class that thinks that collecting data from the undergraduate convenience sample is really, really hard. Less snarkily, this may be useful when teaching multiculturalism or ...

A.V. Club's "Shirley Manson takes BuzzFeed's "Which Alt-Rock Grrrl Are You?" quiz, discovers she's not herself"

Lately, there have been a lot of quizzes popping up on my Facebook feed ("What breed of dog are you?", "What character from Harry Potter are you?"). As a psychologist who tinkers in statistics, I have pondered the psychometric properties of such quizzes and concluded that these quizzes were probably not properly vetted in peer-reviewed journals. Now I have a tiny bit of evidence to support that conclusion. What better way to check that a scale is valid than by using the standard of concurrent validity (popular in I/O psychology)? This actually happened when renowned Shirley Manson Subject Matter Expert, Shirley Manson, lead singer of the band Garbage, took the "Which Alt-rock Grrrl are you?" quiz and didn't score as herself (as she posted on Facebook and as reported by A.V. Club). From Facebook, via A.V. Club An excellent example of an invalid test (or a failure of concurrent validity, for you I/O types).

Anecdote is not the plural of data: Using humor and climate change to make a statistical point

Variations upon a theme...good for spicing up a powerpoint...inspired by living in the #1 snowiest city (population > 100K, 2014) in the United States. property of xkcd.com https://thenib.com/can-t-stand-the-heat-4d5650fd671b

Time's "Can Time predict your politics?" by Jonathan Haidt and Chris Wilson

This scale, created by Haidt and Wilson, predicts your political leanings based upon seemingly unrelated questions. Screen grab from time.com You can use this in a classroom to 1) demonstrate interactive, Likert-type scales and 2) discuss face validity (or the lack thereof). I think this would also be 3) useful for a psychometrics class to discuss scale building. Finally, the update at the end of the article mentions 4) both the sample size and the correlation coefficient for their reliability study, allowing you to discuss those concepts with students. For more about this research, try yourmorals.org

NPR's "In Pregnancy, What's Worse? Cigarettes Or The Nicotine Patch?"

This story discusses the many levels of analysis required to get to the bottom of the hypothesis stated in the title of this story. For instance, are cigarettes or the patch better for mom? The baby? If the patch isn't great for either but is still better than smoking, what sort of advice should a health care provider give to a patient who is struggling to quit smoking? What about animal model data? I think this story also opens up the conversation about how few medical interventions are tested on pregnant women (understandably so), and, as such, researchers have to opt for more observational research studies when investigating medical interventions for protected populations.

Shameless self-promotion 2

Here is a link to a recent co-authored publication that used Second Life to teach students about virtual data collection as well as the broader trend in psychology of studying how virtual environments influence interpersonal interactions. Specifically, students replicated evolutionary psychology findings using Second Life avatars. We also discuss best practices for using Second Life in the classroom as well as our partial replication of previously established evolutionary psychology findings (Clark & Hatfield, 1989; Buss, Larsen, Westen, & Semmelroth, 1992).

Changes in standards for data reporting in psychology journals

Two prominent psychology journals are changing their standards for publication in order to address several long-standing debates in statistics (p-values v. effect sizes and point estimates of the mean v. confidence intervals). Here are the details for the changes that the Association for Psychological Science is making to their gold-standard publication, Psychological Science, in order to improve transparency in data reporting. Some of the big changes include mandatory reporting of effect sizes and confidence intervals, and inclusion of any scales or measures that were non-significant. This might be useful in class when describing why p-values and means are imperfect, the old p-value v. effect size debate, and how one can bend the truth with statistics via research methodology (and glossing over/completely neglecting N.S. findings). These examples are also useful in demonstrating to your students that these issues we discuss in class have real world ramifications and aren't be...
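One of the newly mandatory statistics can be shown by hand. This is a generic Cohen's d sketch for two independent groups with invented data, not an example from the APS guidelines themselves.

```python
# Cohen's d for two independent groups, using the pooled standard
# deviation: an effect size that p-values alone do not convey.
# The group data are invented for illustration.
from statistics import mean, stdev
from math import sqrt

group_a = [5.1, 6.2, 5.8, 6.5, 5.9, 6.1]
group_b = [4.8, 5.2, 5.0, 5.6, 4.9, 5.3]

def cohens_d(a, b):
    """Standardized mean difference with a pooled SD."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                     / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd

print(f"d = {cohens_d(group_a, group_b):.2f}")
```

Pairing this with the group means makes the class discussion concrete: a p-value says "probably not chance," while d says "this much difference, in SD units."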