Posts

Showing posts from 2015

Hickey's "The 20 Most Extreme Cases Of ‘The Book Was Better Than The Movie’"

Data has been used to learn a bit more about the age-old observation that books are always better than the movies they inspire. FiveThirtyEight writer Walt Hickey gets down to the brass tacks of this relationship by exploring linear relationships between book ratings and movie ratings. The biggest discrepancies between movie and book ratings were for "meh" books made into beloved movies (see "Apocalypse Now"). How to use in class: -Hickey goes into detail about his methodology and use of archival data. The movie ratings came from Metacritic, the book ratings came from Goodreads. -He cites previous research that cautions against putting too much weight into Metacritic and Goodreads. Have your students discuss the fact that Metacritic data is coming from professional movie reviewers while Goodreads ratings can be created by anyone. How might this affect ratings? -He transforms his data into z-scores. -The films that have the biggest movie:book rati...
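Here is a minimal sketch of the z-score step. The titles and ratings are made up for illustration (they are not Hickey's data), and the helper name `z_scores` is hypothetical. Because Goodreads and Metacritic use different scales, standardizing each set of ratings first is what makes a book-versus-movie comparison meaningful.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw scores to z-scores: (x - mean) / standard deviation."""
    m = mean(values)
    sd = pstdev(values)  # population SD; swap in stdev() for a sample estimate
    return [(x - m) / sd for x in values]

# Hypothetical ratings for illustration only (not Hickey's data).
titles = ["Title A", "Title B", "Title C", "Title D"]
book_ratings = [3.5, 4.2, 3.9, 4.4]   # Goodreads-style, 1-5 scale
movie_ratings = [94, 60, 75, 51]      # Metacritic-style, 0-100 scale

book_z = z_scores(book_ratings)
movie_z = z_scores(movie_ratings)

# A positive gap means the movie did better than the book in standardized terms.
gaps = {t: mz - bz for t, bz, mz in zip(titles, book_z, movie_z)}
for title, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{title}: movie z - book z = {gap:+.2f}")
```

Students could sort the gaps themselves and discuss which hypothetical title is the "meh book, beloved movie" case.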

Esther Inglis-Arkell's "I Had My Brain Monitored While Looking at Gory Pictures. For Science!"

The writer helped out a PhD candidate by participating in his research, and then described the research process for io9.com readers. I like this because it describes the research process purely from the perspective of the research participant who doesn't know what the exact hypothesis is. This could be useful for explaining what research participation is like for introductory students. You could use it in a methods class by asking the students to figure out why they used the procedures that they did, and what procedures and scales she describes in her narrative. She describes the informed consent, a personality scale (what do you think the personality scale was trying to assess?), and rating stimuli in two ways (brain scan as well as paper and pencil assessment...why do you think they needed both?) Details to Like: -She is participating in psychology research (neuro work that may benefit those with PTSD someday) -She describes what is entailed when wearing an elect...

"Guess the Correlation" game

Found this gem, "Guess the Correlation", via the subreddit r/statistics. The redditor who posted this resource (ow241) appears to be the creator of the website. Essentially, you view different scatter plots and try to guess r. Points are awarded or taken away based on how close you are to the true r. The game tallies your average amount of error as well. It is way more addictive than it sounds. I think that accuracy increases with time and experience. True r for this one was .49. I guessed .43, which isn't so bad. I think this is a good way for statistics instructors to procrastinate. I think it is also a good way to help your students build a more intuitive ability to read scatter plots and predict the strength of linear relationships.
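For students who want to check a guess by hand, here is a small sketch that computes Pearson's r from raw (x, y) pairs and the absolute error of a guess. The data points and the guess are invented, and the game's actual scoring rule is not documented here, so the "error" below is just the absolute difference.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scatter plot points (not taken from the game).
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]

true_r = pearson_r(xs, ys)
guess = 0.70                    # a student's eyeballed guess
error = abs(guess - true_r)     # assumed error measure, not the game's scoring
print(f"true r = {true_r:.2f}, guess = {guess:.2f}, error = {error:.2f}")
```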

Free, statsy resources available from the Society for the Teaching of Psychology

If you haven't already, please consider joining the Society for the Teaching of Psychology (Division 2 of APA). Your membership fees help fund plenty of great initiatives, including: Teaching Statistics and Research Methods: Tips from TOP by Jackson & Griggs This free e-book is a compilation of scholarship of teaching publications. Office of Teaching Resources in Psychology's (OTRP) Teaching Resources This page is divided by topical area in psychology (including Statistics) and includes instructional resources for every topic. Most of the material was created as part of OTRP's Instructional Resource Award. Among the useful resources are a free booklet containing statistics exercises in both SPSS and R as well as an intense primer on factorial research design. UPDATE (2/24/16): This new resource provides a number of hands-on activities to demonstrate/generate data for all of the concepts typically taught in intro statistics. Project Syllabus Project Syllabus is a colle...

Explaining the replication crisis to undergraduates

If you are unaware, Noba Project is a collaboration of many, many psychology instructors who create and make freely available textbooks as well as stand-alone chapters (modules) that cover a wide variety of psychology topics. You can build a personalized textbook AND access test banks/PowerPoints for the materials offered. Well, one of the new modules covers the replication crisis in psychology. I think it is a thorough treatment of the issue and appropriate for undergraduates.

NFL.com's Football Freakonomics

EDIT: All of this content appears to have been removed from NFL.com. If anyone has any luck finding it, please email me at hartnett004@gannon.edu. The NFL and the statistics folks over at Freakonomics got together and made some...learning modules? Let's call them learning modules. They are interactive websites that teach users about very specific questions related to football (like home field advantage, instances when football player statistics don't tell the whole story about a player/team, whether or not firing a head coach improves a failing team, the effects of player injury on team success, etc.) and then answer these questions via statistics. Most of the modules include interactive tables, data, and videos (featuring the authors of Freakonomics) in order to delve into the issue at hand. For example: The Home Field Advantage: This module features a video, as well as an interesting interactive map that illustrates data about the exact sleep loss experienced by ...

Neighmond's "Why is mammogram advice still such a tangle? Ask your doctor."

This news story discusses medical advice regarding dates for recommended annual mammograms for women. Of particular interest for readers of this blog: Recommendations for regular mammograms are moving later and later in life because of the very high false positive rate associated with mammograms and subsequent breast tissue biopsies. However, women who have a higher probability of developing breast cancer (think genetics) are still being advised to have their mammograms earlier in life. Part of the reason that these changes are being made is that previous recommendations (start mammograms at 40) were based on data that was 30-40 years old (efficacy studies/replication are good things!). Also, I generally love counter-intuitive research findings: I think they make a strong argument for why research and data analysis are so very important. I have blogged about this topic before. This piece by Christie Aschwanden contains some nice graphs and charts that demonstrate that enthusiastic preventative care ...

Come work with me.

Hi, I wanted to post a blog about a job opportunity that is available in my department here at Gannon University. Currently, we are seeking a tenure-track assistant professor who specializes in clinical or counseling psychology and would be interested in teaching theories of personality, psychological assessment, and other specialty undergraduate courses. Gannon is a true undergraduate institution. We teach a 4/4 course load, typically with two and sometimes three unique teaching preps. I started at Gannon in 2009. In that time, I've received thousands of dollars in internal grant funding to pursue my work in the scholarship of teaching. In addition to supporting the scholarship of teaching, Gannon provides internal support so that faculty can create global education opportunities as well as service learning opportunities for our students. For instance, one of my colleagues is currently writing a proposal for a History of Psychology class that would include an educational trip to E...

Smith's "Rutgers survey underscores challenges collecting sexual assault data."

Tovia Smith filed a report with NPR that detailed the psychometric delicacies of trying to measure the sexual assault rates on a college campus. I think this story is highly relevant to college students. I also think it provides an example of the challenge of operationalizing variables as well as self-selection bias. This story describes sexual assault data collected at two different universities, Rutgers and U. Kentucky. The universities used different surveys, had very different participation rates, and had very different findings (20% of Rutgers students met the criteria for sexual assault, while only 5% of Kentucky students did). Why the big differences? 1) At Rutgers, students were paid for their participation and 30% of all students completed the survey. At U. Kentucky, student participation was mandatory and no compensation was given. Sampling techniques were very different, which opens the floor to student discussion about what this might mean for the results. Who m...

Barry-Jester, Casselman, & Goldstein's "Should prison sentences be based on crimes that haven't been committed yet?"

This article describes how the Pennsylvania Department of Corrections is using risk assessment data in order to predict recidivism, with the hope of using such data in order to guide parole decisions in the future. So, using data to predict the future is very statsy, demonstrates multivariate modeling, and is a good example for class, full stop. However, this article also contains a cool interactive tool, entitled "Who Should Get Parole?", that you could use in class. It demonstrates how increasing/decreasing alpha and beta changes the likelihood of committing Type I and Type II errors. The tool allows users to manipulate the amount of risk they are willing to accept when making parole decisions. As you change the working definition of a "low" or "high" risk prisoner, a visualization will start up, and it shows you whether your parolees stay out of prison or come back. From a statistical perspective, users can adjust the definition of a low, medium, and h...
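The trade-off between the two kinds of error can be sketched with a tiny made-up data set. The risk scores, outcomes, and the function name `error_counts` are all hypothetical; this is not the model behind the FiveThirtyEight tool. If "will not reoffend" is treated as the null hypothesis, holding a prisoner who would have stayed out is analogous to a Type I error, and paroling one who comes back is analogous to a Type II error.

```python
# Hypothetical (risk_score, reoffended) pairs; not data from the article.
prisoners = [(1, False), (2, False), (3, True), (4, False), (5, False),
             (6, True), (7, False), (8, True), (9, True), (10, True)]

def error_counts(cutoff):
    """Parole everyone whose risk score is at or below the cutoff."""
    # Paroled but reoffended: the decision missed a real risk.
    paroled_reoffended = sum(1 for score, bad in prisoners
                             if score <= cutoff and bad)
    # Held but would not have reoffended: the decision flagged a non-risk.
    held_safe = sum(1 for score, bad in prisoners
                    if score > cutoff and not bad)
    return paroled_reoffended, held_safe

for cutoff in (2, 5, 8):
    missed, over_held = error_counts(cutoff)
    print(f"cutoff {cutoff}: paroled and reoffended = {missed}, "
          f"held but safe = {over_held}")
```

Raising the cutoff lowers one error count while raising the other, which is the point students should take from moving the sliders.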

r/faux_pseudo's "Distribution of particles by size from a Cracker Jack box"

I love my fellow Reddit data geeks over at r/dataisbeautiful. Redditor faux_pseudo created a frequency chart of the deliciousness found in a box of Cracker Jacks. I think it would be funny to ask students to discuss why this graph is misleading (since the units are of different sizes and the popcorn is divided into three columns). You could also discuss why a relative frequency chart might provide a better description. Finally, you could also replicate this in class with Cracker Jacks (one box is an insufficient n-size, after all) or try it using individual servings of Trail Mix or Chex Mix in order to recreate this with a smaller, more manageable sample size. Also, as always, Reddit delivers in the Comments section:
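A quick sketch of the relative frequency idea, using invented counts (not faux_pseudo's data). The three popcorn size columns are collapsed into one category before each count is converted to a proportion of the whole box.

```python
# Hypothetical counts for one box; the category names are placeholders.
raw_counts = {
    "popcorn (small)": 40,
    "popcorn (medium)": 50,
    "popcorn (large)": 30,
    "peanuts": 29,
    "prize": 1,
}

# Collapse the three popcorn columns into a single category.
counts = {"popcorn": 0}
for item, n in raw_counts.items():
    key = "popcorn" if item.startswith("popcorn") else item
    counts[key] = counts.get(key, 0) + n

# Relative frequency = category count / total count.
total = sum(counts.values())
relative = {item: n / total for item, n in counts.items()}

for item, share in relative.items():
    print(f"{item}: {share:.1%}")
```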

Orlin's "What does probability mean in your profession?"

Math with Bad Drawings is a very accurately entitled blog. Math teacher Ben Orlin illustrates math principles, which means that he occasionally illustrates statistical principles. He dedicated one blog posting to probability, and what probability means in different contexts. He starts out with a fairly standard and reasonable interpretation of p :  Then he has some fun. The example below illustrates the gap that can exist between reality and reporting. And then how philosophers handle probability (with high- p statements being "true"). And in honor of the current Star Wars frenzy: And finally...one of Orlin's Twitter followers, JP de Ruiter , came up with this gem about p -values:

Barry-Jester's "What A Bar Graph Can Tell Us About The Legionnaires’ Outbreak In New York" + CDC learning module

Statistics aficionados over at FiveThirtyEight applied statistics (specifically, tools used by epidemiologists) to the Summer of 2015 outbreak of Legionnaires' Disease in New York. This story can be specifically used in class as a way of discussing how simple bar graphs can be modified so as to display important information about the spread of disease. This news story also includes a link to a learning module from the CDC. It takes the user through the process of creating an Epi curve. Slides 1-8 describe the creation of the curve, and slides 9-14 ask questions and provide interactive feedback that reinforces the lesson about creating Epi curves. Graphs are useful for conveying data, but even one of our staples, the bar graph, can be specialized to share information about the way that disease spreads. 1) Demonstrates statistics being used in a field that isn't explicitly statistics-y. 2) A little course online via the CDC for your students to learn to...

U.S. Holocaust Museum's "Deadly medicine, creating the master race" traveling exhibit

Alright. This teaching idea is pretty involved. It is bigger than any one instructor and requires interdepartmental effort as well as support from The Powers that Be at your university. The U.S. Holocaust Museum hosts a number of traveling exhibits. One in particular, "Deadly Medicine: Creating the Master Race", provides a great opportunity for discussions of research ethics, the protection and treatment of human research subjects, and how science can be used to justify really horrible things. I am extraordinarily fortunate that Gannon University's Department of History (with assistance from our Honors program as well as College of the Humanities, Education, and Social Sciences) has worked hard to get this exhibit to our institution during the Fall 2015 semester. It is housed in our library through the end of October. How I used it in my class: My Honors Psychological Statistics class visited the exhibit prior to a discussion day about research ethics. In...

An example of when the median is more useful than the mean. Also, Bill Gates.

From Reddit's Instagram...the comments section demonstrates some heart-warming statistical literacy.
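The image itself is not reproduced here, so the following is only a sketch of the general point with made-up incomes: adding one extreme value drags the mean far away from the typical case while barely moving the median.

```python
from statistics import mean, median

# Hypothetical annual incomes for five bar patrons.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000]

# The same room after one billionaire walks in (figure is illustrative only).
with_billionaire = incomes + [1_000_000_000]

print(f"before: mean = {mean(incomes):,.0f}, median = {median(incomes):,.0f}")
print(f"after:  mean = {mean(with_billionaire):,.0f}, "
      f"median = {median(with_billionaire):,.0f}")
```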

How NOT to interpret confidence intervals/margins of error: Feel the Bern edition

This headline is a good example of a) journalists misrepresenting statistics as well as b) confidence intervals/margin of error more broadly. See the headline below: In actuality, Bernie didn't exactly take the lead over Hillary Clinton. Instead, a Quinnipiac poll showed that 41% of likely Democratic primary voters in Iowa indicated that they would vote for Sanders, while 40% reported that they would vote for Clinton. If you go to the original Quinnipiac poll, you can read that the actual data has a margin of error of +/- 3.4%, which means that the candidates are running neck and neck. Which, I think, would have still been a compelling headline. I used this as an example just last week to explain applied confidence intervals. I also used this as a round-about way of explaining how confidence intervals are now being used as an alternative/complement to p-values.
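The arithmetic behind "neck and neck" can be sketched directly from the numbers quoted above (41%, 40%, and a margin of error of 3.4 points). The helper name `interval` is hypothetical, and checking for overlap is a rough classroom heuristic rather than a formal test of the difference between the two candidates.

```python
def interval(pct, moe):
    """Return (lower, upper) bounds: the point estimate plus or minus the margin of error."""
    return pct - moe, pct + moe

sanders, clinton, moe = 41.0, 40.0, 3.4   # figures quoted from the poll

s_lo, s_hi = interval(sanders, moe)
c_lo, c_hi = interval(clinton, moe)

# Two intervals overlap when each one starts before the other ends.
overlap = s_lo <= c_hi and c_lo <= s_hi

print(f"Sanders: {s_lo:.1f}% to {s_hi:.1f}%")
print(f"Clinton: {c_lo:.1f}% to {c_hi:.1f}%")
print("Intervals overlap:", overlap)
```

Because the 1-point gap is much smaller than the margin of error, the poll cannot distinguish a Sanders lead from a Clinton lead.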

Aschwanden's "Science isn't broken, it is just a hell of a lot harder than we give it credit for"

Aschwanden (for fivethirtyeight.com) did an extensive piece that summarizes the data/p-hacking/what's-wrong-with-statistical-significance crisis in statistics. There is a focus on the social sciences, including some quotes from Brian Nosek regarding his replication work. The report also draws attention to Retraction Watch and Center for Open Science as well as retractions of findings (as an indicator of fraud and data misuse). The article also describes our funny bias of sticking to early, big research findings even after those research findings are disproved (example used here is the breakfast eating:weight loss relationship). The whole article could be used for a statistics or research methods class. I do think that the p-hacking interactive tool found in this report could be an especially useful illustration of How to Lie with Statistics. The "Hack your way to scientific glory" interactive piece demonstrates that if you fool around enough with your operationalized...

Correlation example using research study about reusable shopping bags/shopping habits

A few weeks ago, I used an NPR story in order to create an ANOVA example for use in class. This week, I'm giving the same treatment to a different research study discussed on NPR and turning it into a correlation example. A recent research study found that individuals who use reusable grocery store bags tend to spend more on both organic food AND junk food. Here is NPR's treatment of the research. Here is a more detailed account of the research via an interview with one of the study's authors. Here is the working paper that the PIs have released for even more detail. The researchers frame their findings (folks who are "good" by using reusable bags and purchasing organic food then feel entitled to indulge in some chips and cookies) via "licensing", but I think this could also be explained by ego depletion (opening up a discussion about that topic). So, I created a little faux data set that replicates the main finding: Folks who use reusable ...

Mersereau's "Wunderground Uses Fox News Graphing Technique to Boast Forecast Skills"

Mersereau, writing for Gawker website The Vane, provides another example of How Not To Graph. Or How To Graph So As To Not Lie About Data But Make Your Data Look More Impressive Than Is Ethical. Weather Underground (AKA Wunderground, weather forecasting service/website) was bragging about its accuracy compared to the competition. At first glance (see below), this graph seems to reinforce the argument...until you take a look at the scale being used. The beginning point on the X axis is 70, while the high point is 80. So, really, the differences listed probably don't even approach statistical significance. This story, somewhat randomly, also includes some shady graphs created by Fox News. I don't understand the need for the extra Fox News graphs, but they also illustrate how one can create graphs that have accurate numbers but still manage to twist the truth.

Dayna Evans's "Do You Live in a "B@%$#" or a "F*%&" State? American Curses, Mapped"

Warning: This research and story include every paint-peeling obscenity in the book. Caution should be used when opening up these links on your work computer and you should really think long and hard before providing these links to your students. However, the research I'm about to describe 1) illustrates z-scores and 2) investigated regional usage of safe-for-the-classroom words like darn, damn, and gosh. So, a linguist, Dr. Jack Grieve, decided to use Twitter data to map out the use of different obscenities by county of the United States. Gawker picked up on this research and created a story about it. How can this be used in a statistics class? In order to quantify greater or lesser use of different obscenities, he created z-scores by county and illustrated the difference via a color-coding system. The more orange, the higher the z-score for a region (thus, greater usage) while blue indicates lesser usage. And, there are three such maps (damn, darn, and gosh) that are safe for us...

Caitlin Dickerson's "Secret World War II Chemical Experiments Tested Troops By Race"

NPR did a series of stories exposing research that the U.S. government conducted during WWII. This research exposed American soldiers to mustard gas for research purposes. In some instances, the government targeted soldiers of color, believing that they had tougher/different skin that would make them more resistant to this form of chemical warfare. Here is the whole series of stories (from the original research, exposed via Freedom of Information Act, to NPR working to find the affected veterans). None of the soldiers ever received any special dispensation or medical care due to their involvement. Participants were not given the choice to discontinue participation without prejudice, as recalled below by one of the surviving veterans: "We weren't told what it was," says Charlie Cavell, who was 19 when he volunteered for the program in exchange for two weeks' vacation. "Until we actually got into the process of being in that room and realized, wait a m...

McFadden's "Frances Oldham Kelsey, F.D.A. Stickler Who Saved U.S. Babies From Thalidomide, Dies at 101"

This obituary for Dr. Frances Oldham Kelsey tells an important story about research ethics, pharmaceutical industries, and the importance of government oversight in the drug creation process (.pdf here). [Image: Dr. Kelsey receiving the President's Award for Distinguished Federal Civilian Service, the highest honor given to federal employees] Dr. Kelsey was one of the first officials in the United States to notice (via data!) and raise concerns about thalidomide, the now infamous anti-nausea drug that causes terrible birth defects when administered to pregnant women. The drug was already being widely used throughout Europe, Canada, and the Middle East to treat morning sickness, but Dr. Kelsey refused to approve the drug for widespread use in the US (despite persistent efforts of Big Pharma to push the drug into the US market). Time proved Dr. Oldham Kelsey correct (clinical trials in the US went very poorly), and her persistence, data analysis, and ethics helped to limit the ...

ANOVA example using Patti Neighmond's "To ease pain, reach for your play list."

I often share news stories that illustrate easy-to-follow, engaging research that appeals to undergraduates. For the first time, I'm also providing a mini data set that 1) mimics the original findings and 2) provides an example of ANOVA. This story by Patti Neighmond, reporting for NPR, describes a study investigating the role of music in pain reduction. The study used three groups of kids, all recovering from surgery. The kids either 1) listened to music, 2) listened to an audio book, or 3) sat with noise-cancelling ear phones for 30 minutes. The researchers found that kids in both the music and audio book groups experienced pain reduction levels comparable to over-the-counter pain medication while the control group enjoyed no such benefits. And the research used the 10-point FACES scale, allowing for a side discussion about how we collect data from humans who don't have the best vocabularies or who have limited communication skills. This study can also be used as a way t...
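For readers who want to see the ANOVA arithmetic itself, here is a minimal one-way ANOVA sketch computed from scratch. The pain reduction scores below are invented for illustration; they are not the mini data set from the post or the original study's data, and the function name `one_way_anova` is hypothetical.

```python
from statistics import mean

# Hypothetical pain reduction scores (points dropped on a 10-point scale).
groups = {
    "music": [3, 4, 2, 3, 3],
    "audio book": [2, 3, 3, 4, 3],
    "silence": [0, 1, 1, 0, 1],
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)

    # Between-groups SS: how far each group mean sits from the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2
                     for s in groups.values())
    # Within-groups SS: how far each score sits from its own group mean.
    ss_within = sum((x - mean(s)) ** 2
                    for s in groups.values() for x in s)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

f_ratio, df_b, df_w = one_way_anova(groups)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

Students can compare the hand-computed F against the critical value in their textbook's F table, then follow up with post hoc comparisons.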

Memes pertaining to the teaching of statistics, research methods, and undergraduate advising.

For those who teach statistics or research methods, and for psychology major advisers. Some of these have been posted before. Some of these have not. I created all of them except for the first one. Additionally, I created a bunch of Psychology Advising memes as I am currently editing the "Advising" portion of my rank and tenure application.

Kristopher Magnusson's "Understanding the t-distribution and its normal approximation"

Once again, Kristopher Magnusson has combined his computer programming and statistical knowledge to help illustrate statistical concepts. His latest interactive tool allows students to view the t-curve for different degrees of freedom. Additionally, students can view error rates associated with different degrees of freedom. Note that the critical region is one-tailed with alpha set at .05. If you cursor around the critical region, you can set the alpha to .025 to better illustrate a two-tailed test (in terms of the critical region at which we declare significance). [Figure: Error rates when n < 30] [Figure: Error rates when n > 30] This isn't the first time Kristopher's interactive tools have been featured on this blog! He has also created websites dedicated to explaining effect size, correlation, and other statistical concepts.
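The idea behind those error rates can be approximated in a few lines. This is a rough sketch, not Magnusson's code: it uses the normal critical value for a one-tailed alpha of .05 and asks how much area of a t-distribution actually falls beyond it. The function names are hypothetical, and the tail area is found by simple numerical integration (Simpson's rule) of the standard t density.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0: one half minus the area from 0 to x (Simpson's rule)."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Critical value if we (wrongly) assume normality: one-tailed, alpha = .05.
z_crit = NormalDist().inv_cdf(0.95)

for df in (5, 10, 30, 100):
    actual = t_upper_tail(z_crit, df)
    print(f"df = {df:>3}: area beyond z = {z_crit:.3f} is {actual:.4f}")
```

The printed areas shrink toward .05 as the degrees of freedom grow, which is the "normal approximation" the tool illustrates.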

Aarti Shahani's "How will the next president protect our digital lives?"

I think that it is so, so important to introduce statistics students to the big picture of how data is used in their everyday lives. Even with all of the material that we are charged with covering in introduction to statistics, I think it is still important to touch on topics like Big Data and Data Mining in order to emphasize to our students how ubiquitous statistics are in our lives. In my honors section, I assign multiple readings (news stories, TED talks, NPR stories) prior to a day of discussion devoted to this topic. In my non-honors sections of statistics and my online sections, I've used electronic discussion boards to introduce the topic via news stories. I also have a manuscript in press that describes a way to introduce very basic data mining techniques in the Introduction to Statistics classroom. That's why I think this NPR news story is worth sharing. Shahani describes and provides data (from Pew) to argue that Americans are worried about the security of...

One article (Kramer, Guillory, & Hancock, 2014), three stats/research methodology lessons

The original idea for using this article this way comes from Dr. Susan Nolan's presentation at NITOP 2015, entitled "Thinking Like a Scientist: Critical Thinking in Introductory Psychology." I think that Dr. Nolan's idea is worth sharing, and I'll reflect a bit on how I've used this resource in the classroom. (For more good ideas from Dr. Nolan, check out her books, Psychology, Statistics for the Behavioral Sciences, and The Horse that Won't Go Away (about critical thinking)). Last summer, the Proceedings of the National Academy of Sciences published an article entitled "Experimental evidence of massive-scale emotional contagion through social networks." The gist: Facebook manipulated participants' Newsfeeds to increase the number of positive or negative status updates that each participant viewed. The researchers subsequently measured the number of positive and negative words that the participants used in their own status updates. They fou...

"Correlation is not causation", Parts 1 and 2

Jethro Waters, Dan Peterson, Ph.D., Laurie McCollough, and Luke Norton made a pair of animated videos (1, 2) that explain why correlation does not equal causation and how we can perform lab research in order to determine if causal relationships exist. I like them a bunch. Specific points worth liking: -Illustrations of scatter plots for significant and non-significant relationships. Data does not support the old wives' tale that everyone goes a little crazy during full moons. -Explains the Third Variable problem. Simple, pretty illustration of the perennial correlation example of ice cream sales (X):death by drowning (Y) relationship, and the third variable, hot weather (Z) that drives the relationship. -In addition to discussing correlation =/= causation, the video makes suggestions for studying a correlational relationship via more rigorous research methods (here violent video games:violent behavior). Video games (X) influence aggression (Y) via the moderato...