Monday, February 23, 2015

Amanda Aronczyk's "Cancer Patients And Doctors Struggle To Predict Survival"

Warning: This isn't an easy story to listen to, as it is about life expectancy and terminal cancer (and how doctors can best convey such information to their patients). Most of this news story is dedicated to training doctors on the best way to deliver this awful news.

But Aronczyk, reporting for NPR, does tell a story that provides a good example of high-stakes applied statistics. Specifically, when explaining life expectancy to patients with terminal cancer, which measure of central tendency should be used? The quote from the story below shows how measures of central tendency can lead to confusion and misunderstanding.

"The data are typically given as a median, which is different from an average. A median is the middle of a range. So if a patient is told she has a year median survival, it means that half of similar patients will be alive at the end of a year and half will have died. It's possible that the person's cancer will advance quickly and she will live less than the median. Or, if she is in good health and has access to the latest in treatments, she might outlive the median, sometimes by many years.
Doctors think of the number as a median, but patients usually understand it as an absolute number, according to Dr. Tomer Levin, a psychiatrist who works with cancer patients and doctors at Memorial Sloan Kettering Cancer Center in New York. He thinks there is a breakdown in communication between the doctor and patient when it comes to the prognostic discussion."

A couple of ways this could be used as a discussion starter:

1) How could a doctor best describe life expectancies? What may be more useful? Interquartile range? A mean and standard deviation? Range? What is the simplest way to explain these measures to a person receiving horrible news?

2) This could also be useful in a cognitive/memory class, as the story refers to research that has found that cancer patients retain little of the information they receive when they get their diagnosis. How can statistical information be conveyed in an understandable manner to individuals who are experiencing enormous stress?
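For the first discussion point, it can help to let students compute the competing summaries themselves. Here is a minimal sketch in Python, using only the standard library. The survival times are made up for illustration (they are not from the NPR story); I gave them a long right tail because that is the situation in which a median and a mean disagree.

```python
import statistics

# Hypothetical survival times in months for 11 similar patients.
# These numbers are invented for illustration, not taken from the story.
survival_months = [3, 5, 6, 8, 10, 12, 14, 18, 24, 40, 96]

# The median is the middle value: half of the patients fall on either side.
median = statistics.median(survival_months)

# The mean is pulled upward by the few patients who live much longer.
mean = statistics.mean(survival_months)

# Quartiles (inclusive method) give the range covering the middle half.
q1, q2, q3 = statistics.quantiles(survival_months, n=4, method="inclusive")

print(f"Median survival: {median} months")
print(f"Mean survival: {mean:.1f} months")
print(f"Middle half of patients: {q1} to {q3} months")
print(f"Full range: {min(survival_months)} to {max(survival_months)} months")
```

With these invented numbers the mean is noticeably larger than the median, which gives students something concrete to argue about: which single number, or which range, would they want to hear as a patient?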

Monday, February 16, 2015

Philip Bump's "How closely do members of congress align with the politics of their district? Pretty darn close."
Philip Bump (writing for The Washington Post) illustrates the linear relationship between a U.S. House Representative's politics and their home district's politics. Yes, this is entirely intuitive. However, it is still a nice example of correlations/linear relationships for the reasons described below.

Points for class discussion:

1) How do they go about calculating this correlation? What are the two quantitative variables that have been selected? Legislative rankings (from the National Journal) are on the y-axis, and voting patterns from the House member's home district are on the x-axis.

2) The careers of several outliers (perhaps not mathematical outliers, but instances of Representative vs. District mismatch) are highlighted within the news story in order to explain why they don't align as closely with their districts.

3) Illustrates a linear relationship. Illustrates outliers. Illustrates political data. Accessible example for your students.
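If you want students to see how a correlation like this is calculated, here is a small sketch of the Pearson correlation coefficient in Python. The two lists are hypothetical stand-ins for the variables in the article (a district voting measure and a legislative ranking); the values and the variable names are my own invention, not Bump's data.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for seven House members (invented for illustration)
district_lean = [20, 35, 45, 50, 55, 65, 80]  # x-axis: district voting pattern
member_score = [15, 30, 50, 45, 60, 70, 85]   # y-axis: legislative ranking

r = pearson_r(district_lean, member_score)
print(f"r = {r:.2f}")
```

Students can then change one member's score to create a Representative vs. District mismatch and watch what happens to r.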

Wednesday, February 11, 2015

Pew Research Center's "Major Gaps Between the Public, Scientists on Key Issues"

This report from Pew highlights the differences in opinion between average Americans and members of the American Association for the Advancement of Science (AAAS). For various topics, this graph reports the percentage of average Americans or AAAS members that endorse each science-related issue, as well as the gap between the two groups. Below, the yellow dots indicate the percentage of scientists that have a positive view of the issue and the blue dots indicate the same data for average Americans.

If you click on any given issue, you see more detailed information on the data.

In addition to the interactive data, this report by Funk and Rainie summarizes the main findings. You can also access the original report of this data (which contains additional information about public perception of the sciences and scientists).

This could be a good tool for a research methods/statistics class in order to convince students that learning about the rigors of the scientific method/hypothesis testing does change the way people evaluate information. It is also a good example of simple descriptive data that students can play with via the interactive interface.

Monday, February 9, 2015

Anya Kamenetz's "The Past, Present, And Future of High-Stakes Testing"

Kamenetz (reporting for NPR) talks about her book, Test, which is about the extensive use of standardized testing in our schools. Largely, this is a story about the impact these tests have had on K-12 instruction in the US. However, a portion of the story discusses alternatives to annual testing of every student. Alternatives include using sampling to assess a school as well as numerous alternate testing methods (stealth testing, assessing child emotional well-being, portfolios, etc.). Additionally, this story touches on some of the implications of living in a Big Data society and what it is doing to our schools.

I think this would be a great conversation starter for a research methods or psychometric course (especially if you are teaching such a class for a School of Education). What are we trying to assess: Individual students or teachers or schools? What are the benefits and shortcomings of these different kinds of assessments? Can your students come up with additional alternatives to annual, all-student testing?
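To make the sampling alternative concrete, here is a rough simulation sketch in Python. Everything in it is an assumption for illustration: the school of 600 students, the normally distributed scores, and the 10% sample are all invented, and the standard error ignores the finite population correction. The point is only to show how close a sample mean can land to the all-student mean.

```python
import random
import statistics

random.seed(2015)  # fixed seed so the sketch gives the same result each run

# Hypothetical school: 600 students with simulated test scores
population = [random.gauss(70, 10) for _ in range(600)]

# Instead of testing everyone, test a simple random sample of 60 students
sample = random.sample(population, 60)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Rough standard error of the sample mean (no finite population correction)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"Mean if every student is tested: {population_mean:.1f}")
print(f"Mean from the sample of 60: {sample_mean:.1f}")
print(f"Approximate standard error: {standard_error:.1f}")
```

This works for assessing a school, but students should notice what it cannot do: a sample says nothing about any individual child who was not tested.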

Wednesday, February 4, 2015

Beyond SPSS (revised 2/13/2015)

I'm an SPSS girl. I sit in my Psychology Department ivory tower and teach Introduction to Statistics via SPSS.

SPSS isn't the only way to do statistics. In fact, it has been losing favor among "real" statisticians. I recently had a chat with a friend who has a Ph.D. in psychology and works as a statistician. She told me that statsy job postings rarely ask for SPSS skills. Instead, they are seeking people who know R and/or Python.

In order to better help our data-inclined students find work, I've gathered some information on learning R and Python. This probably isn't for every student. This probably isn't for 90% of our students. However, it may be helpful for an outstanding undergraduate or graduate student who is making noise like they want a data/research oriented career. Alternately, I think that an R class could be a really cool upper-level undergraduate elective for a select group of students.

Also, if anyone is brave enough to teach their undergraduate statistics students R, email me, I would love to pick your brain (

Note: I have not tried out all of the resources I am listing (ain't nobody got time for that) but they ARE all interactive. Some require registration, some don't. All are free. Some are brief, some require several hours to complete.



UPDATE: I received good feedback and suggestions from my blog readers (see below).

-Via Twitter, Michael Philipps suggested JASP Statistics, free data analysis software that acts a lot like SPSS.

-Via the Comment section, Juanjo Medina suggested this blog posting by Jeromy Anglim for more information on switching to R.

Another resource to get us, as well as our students, up to speed on different software and statistical techniques is Coursera, home of many a free MOOC. Here is a link to statistics and data analysis classes that will be taught in English. The classes cover an array of topics, including R as well as specialization topics in statistics.

Monday, February 2, 2015

Khan Academy's #youcanlearnanything

Khan has been providing high-quality videos explaining...indeed...everything for a while now. Among everything are Probability and Statistics.

Recently, they reorganized their content and added assessment tools as part of their #youcanlearnanything campaign in order to create self-paced lessons that are personalized to the user and include plenty of videos (of course) and personalized quizzes and feedback.

1) It requires the creation of a free account and selection of a learning topic (the screen shots below are from the Statistics and Probability course).

2) When you start a topic, you take a pre-test to assess your current level. This assessment covers the simple chart reading, division, and multiplication required for more advanced topics. If you struggle with this, Khan provides you with more material to improve your understanding of these topics.

3) After you complete the assessment, you receive your lesson plan. It includes the topic you selected plus any additional introductory math material that you need in order to understand statistics and probability.

Screen shot of the main lesson plan page. Note that this page allows you to track your progress and receive feedback on your strengths and weaknesses.
4) For each skill lesson, you can watch a traditional Khan video. As you answer questions, there are plenty of hints available to take you through the problem and there is also a "scratch pad" feature that would probably work best with a tablet.

What the individual lessons look like (here, Mean, median, mode). Helpful features include explanatory video and the "hint" option to be guided through solving the problem.
While I don't think that this particular product would be useful in class, I think it could be a useful supplement for a student who is struggling with statistics and looking for extra material to master skills. Another option would be to list it on a syllabus or LMS page as supplemental material that students can use for their own studying.