Tuesday, June 7, 2022

Data collection via wearable technology

This article from The Economist, "Data from wearable devices are changing disease surveillance and medical research," has a home in your stats or RM class. It describes how FitBits and Apple Watches can be used to collect baseline medical data for health research. I like it because it is very accessible but still goes into detail about specific research issues related to this kind of data:

-How does one operationalize their outcome variable? Pulse, temperature, etc., as proxies for underlying problems. Changes in heart rates have predicted the onset of COVID and the flu. 

When it comes to disease surveillance, the most useful biomarker is fever, a direct sign of infection. But most wearables do not measure temperature, because accurate readings are hard to do. So a proxy had to be created using the standard things they do measure, such as heart rate, sleep and activity level. Resting heart rate, measured when people are sitting still, varies a lot from person to person—anything between 50 and 100 beats per minute counts as normal—but each person’s rate is generally stable. When the body fights an infection, however, the rate goes up, often dramatically. With covid-19, data from wearable devices showed that this uptick happened four days before people felt any symptoms. By one estimate 63% of covid cases could be detected from changes in resting heart rate before the onset of symptoms.
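
If you want students to see how a proxy like this could be operationalized, here is a minimal Python sketch. It is not the algorithm used in any of the studies the article describes. The function name `flag_elevated_days`, the 14-day baseline, the z-score cutoff of 2, and the heart-rate values are all made up for illustration. The idea is just to compare each day's resting heart rate with that person's own earlier baseline, since the article notes that rates vary between people but are stable within a person.

```python
from statistics import mean, stdev


def flag_elevated_days(resting_hr, baseline_days=14, z_threshold=2.0):
    """Return indices of days whose resting heart rate is unusually high
    compared with this person's own baseline (the first `baseline_days` days).

    Hypothetical illustration only: the baseline length and cutoff are
    assumptions, not values taken from the article.
    """
    baseline = resting_hr[:baseline_days]
    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    if base_sd == 0:
        raise ValueError("Baseline has no variability; cannot compute z-scores.")

    flagged = []
    for day, hr in enumerate(resting_hr[baseline_days:], start=baseline_days):
        z = (hr - base_mean) / base_sd  # how far above the personal baseline?
        if z > z_threshold:
            flagged.append(day)
    return flagged


# Invented data: 14 stable days, one ordinary day, then a rise.
hr = [62, 61, 63, 62, 60, 61, 62, 63, 61, 62, 60, 62, 61, 63, 62, 66, 69, 71]
print(flag_elevated_days(hr))  # prints [15, 16, 17]
```

This could double as a quick z-score review: the same 66 beats per minute that gets flagged here would be unremarkable for someone whose baseline sits in the 70s.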

-Big samples be good! One reason this data works as well as it does is that it is harvested from a massive number of people using these devices. 

-The article gives examples of well-designed experiments that use wearable technology. However, often with massive data collection via tech, the data drives the hypothesis, not the other way around. In our psychology classes, we discuss NHST and the proper way to create and test a hypothesis. As we should. But sometimes, the data establish the hypothesis. It is essential to discuss this because it happens a lot in non-social science research. For instance:

Another mysterious finding is that Germans in all parts of the country are sleeping less in 2022 than in 2020 and the resting heart rate of the nation has gone up. One guess is that this may have to do with the extra weight that people put on during lockdowns, but nobody really knows for sure. The data from wearables has been “a question generator”, says Mr Brockmann, raising queries about health that would not have been asked otherwise.

-Ecological validity: Wearable technology allows researchers to monitor participants as they move through their daily lives. 

Most important, devices that unobtrusively monitor patients as they go about their lives have allowed medical researchers to see, for the first time, how patients experience a given disease and treatment in their natural habitat. Nobody sleeps well in a pharmaceutical company’s sleep lab. The most widely used test of cardiovascular and physical fitness is the “six-minute-walk test”, which is the distance that someone can walk in the span of six minutes. It involves a patient pacing up and down a hospital corridor while a nurse with a clipboard records the result.

-Teach your students the pain of recruiting participants and how this method helps.

Dr Ashley’s group was among the first to run, in 2019, a fully digital trial in which participants never met a researcher face-to-face. Not long ago, he says, recruiting trial participants involved putting up posters with tear-off bits of paper listing a number for them to call. They would then need to go to the hospital and sit down with a nurse to go over 17 pages of consent forms to sign up. “If you could get 200 people in a few months, you’d be pretty happy,” he says.   Now, people can download the app for a study and sign up while waiting in line for their coffee. The first time Dr Ashley’s team used this method for a study on physical activity 40,000 people enrolled in just two weeks and results were ready in a matter of months. That was not an unalloyed benefit. Though the study was very easy to join, it was also very easy to leave and about 80% of participants had dropped out before the end, which was just two weeks in. Even so, the final group was about ten times the usual size for this line of research.

-Bias in data collection: Who can afford wearable technology? Perhaps some of the patterns detected by wearable tech are only applicable to the privileged people who use them:

One thing researchers now need to work out is whether the disease-surveillance algorithms based on wearable devices might systematically miss what is happening with some types of people, says Leo Wolansky from the Rockefeller Foundation’s Pandemic Prevention Institute. For example, algorithms might unwittingly be optimised for spotting outbreaks in wealthy areas where people are more likely to have been using high-end wearables for longer. In poorer areas, where people may have different underlying health conditions (which often affect digital-biomarker measurements), the algorithm for wearables might be a lot more likely to miss an outbreak. “As they often say in this field, ‘Garbage in, garbage out’, and we still have to better understand whether the data we’ve captured has some garbage in it,” says Mr Wolansky.

If you've already worked your way through your three free articles per month, here is a PDF.

Wednesday, June 1, 2022


 I already knew that Morton Ann Gernsbacher was a genius (see her excellent, open stats classes that use spreadsheets).

So you can imagine how pleased I was to meet her at APS2022. While her talk and message were great, I am here to share one of her presentation resources: Photofunia.

This website creates images that contain your own text, and I'm pretty amused. I think this could be a low-key way to draw attention to commonly made mistakes and big take-home messages. They work in a PowerPoint but are more attention-grabbing than just using a larger font size or bolding your text.

"Don't forget your post hocs" on a billboard

"A small effect size can be a big deal"

"Jacob Cohen" on a necklace

Tuesday, May 3, 2022

An interactive description of scientific replication


This cool, interactive website asks you to participate in a replication. It also explains how a researcher's decision about how to define "randomness" may have driven the main effect of the whole study. There is also a scatter plot and a regression line, talk of probability, and a replication of a cognitive example.

Long Version: 

This example is equal parts stats and RM. I imagine that it can be used in several different ways:

-Introduce the replication crisis by participating in a wee replication

-Introduce a respectful replication based on interpretation of the outcome variable 

-Data visualization and scatterplots


-Aging research

Okay, so this interactive story from The Pudding is a deep dive into how one researcher decision may be responsible for the study's main effect. Gauvrit et al. (2017) argue that younger people generate more random responses to several probability tasks. From this, the authors conclude that human behavioral complexity peaks at 25.

The Pudding authors argue that depending on how you define "randomness", the main effect goes away.

It demonstrates a replication, and one in which your students can participate. It also shows what can happen when you modify the cut-off criteria for your experimental stimuli.

Text in image: The study received coverage across dozens of outlets. The headline: A person’s ability to be random peaks around 25 years old and declines after 60.  The researchers made their data and methods public so we explored the idea of making an age guessing game. This unearthed some questions for us, so the story became about the replication crisis; the ongoing concern that it is hard to reproduce many scientific studies. We think the findings of the study are at the mercy of a single decision the researchers made to filter out questionable responses. To us, this meant the participant either misunderstood the instructions, or intentionally subverted the experiment.

I think this has a place in any RM course to introduce The Replication Crisis. Before you get to the screen grab featured above, you have the option to participate in a replication of Gauvrit et al. See below for a screen grab of the instructions for one of the replication tasks: 


Using the blue and black toggle button, you can look at the regression line under the two conditions. The relationship goes away when the different exclusion criteria are applied. This is a bonus lesson on interpreting scatterplots/regression.
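
To make the exclusion-criterion point concrete for a stats class, here is a small Python sketch with invented data. It does not use the Gauvrit et al. data or The Pudding's actual filter. The ages, scores, the `flagged` marker, and the helper `ols_slope` are all hypothetical. It only illustrates how a regression slope can change depending on whether flagged responses are kept or dropped.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Invented (age, randomness score, flagged) rows. "flagged" stands in for
# whatever rule a researcher uses to mark a response as questionable.
responses = [
    (20, 0.52, False), (30, 0.48, False), (40, 0.51, False),
    (50, 0.49, False), (60, 0.52, False), (70, 0.48, False),
    (60, 0.15, True), (65, 0.10, True), (70, 0.05, True),
]

all_points = [(age, score) for age, score, _ in responses]
kept_points = [(age, score) for age, score, flagged in responses if not flagged]

slope_all = ols_slope(all_points)    # every response included
slope_kept = ols_slope(kept_points)  # flagged responses excluded

print(f"Slope with all responses:     {slope_all:.5f}")
print(f"Slope with flagged excluded:  {slope_kept:.5f}")
```

In this toy data set the flagged responses happen to cluster among older participants, so keeping them produces a clear negative slope, while dropping them leaves a nearly flat line. Students can move the flagged rows around to see how much the "effect" depends on that one decision.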

Anyway. I'm going to use this both in our Professional Development course, to introduce the replication crisis, and in my honors statistics class for a "Discussion Day" about replication in science.