Posts

Showing posts with the label variability

Why measures of variability matter: Average age of death in The Olden Days

Alright, this is a 30-second-long example for a) bimodal distributions and b) why measures of variability matter when we are trying to understand a mean. And that mean is...AGE OF DEATH.

My inspiration for this post is this tweet:

"I’m just a girl, standing in front of the internet, asking it to understand that historical life expectancies doesn’t mean most people died at 45 but rather that infant mortality was super high and pulled down the average." Angelle Haney Gullett (@CityofAngelle), January 12, 2022

Gullett refers here to the commonly held belief that if the mean life span Back In The Day was 45, or thereabout, everyone was dying around 45. NOT SO. Why?

"The short answer is no. Broadly speaking, there were two choke point of human mortality. Younger than 5, and again around 50. If you made it through those, barring accidents, you likely had what was a normal lifespan of ~65-70 years. And this is why I’m no fun at parties 😂" Angelle Haney Gullett (@CityofAngelle), January 12, 2022

OK. An...
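To see why a mean near 45 does not imply that people died around 45, here is a minimal Python sketch. The ages are toy numbers invented for illustration, not historical records: 35 of 100 people die in early childhood and the other 65 live to 68. The variable names (`ages_at_death`, `near_mean`) are mine as well.

```python
from statistics import mean, median, pstdev

# Hypothetical toy data, invented for illustration (not historical records):
# 35 of 100 people die at age 2, the other 65 die at age 68.
ages_at_death = [2] * 35 + [68] * 65

avg = mean(ages_at_death)
spread = pstdev(ages_at_death)  # population standard deviation

# Count how many people actually died within 5 years of the mean.
near_mean = sum(1 for age in ages_at_death if abs(age - avg) <= 5)

print(f"mean age at death: {avg:.1f}")
print(f"median age at death: {median(ages_at_death)}")
print(f"standard deviation: {spread:.1f}")
print(f"people who died within 5 years of the mean: {near_mean}")
```

In this toy sample the mean lands in the mid-40s, yet nobody dies anywhere near that age. The large standard deviation is the hint that the mean alone is hiding two separate clusters.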

The Pudding's Colorism

Malaika Handa, Amber Thomas, and Jan Diehn created a beautiful, interactive website, Colorism in High Fashion. It uses machine learning to investigate "colorism" at Vogue magazine. Specifically, it delves into the differences, over time, in cover model skin color, but also how lighting and photoshopping can change the color of the same woman's skin, depending on the photo. There are soooo many ways to use this in class, ranging from machine learning, to how machine learning can refine old psychology methodology, to variability and within/between-group differences. Read on: 1. I'm a social psychologist. Most of us who teach social psychology have encountered research that uses magazine cover models as a proxy for what our culture emphasizes and values (1, 2, 3). Here, Malaika Handa, Amber Thomas, and Jan Diehn apply this methodology to Vogue magazine covers. And they take this methodology into the age of machine learning by using k-means clustering and pixels to deter...

Interactive NYC commuting data illustrates distribution of the sampling mean, median

Josh Katz and Kevin Quealy put together a cool interactive website to help users better understand their NYC commute. With the creation of this website, they are also helping statistics instructors illustrate a number of basic statistics lessons. To use the website, select two stations... The website returns a bee swarm plot, where each dot represents one day's commuting time over a 16-month sample. So, handy for NYC commuters, but also statistics instructors. How to use in class: 1. Conceptual demonstration of the sampling distribution of the sample mean. To be clear, each dot doesn't represent the mean of a sample. However, I think this still does a good job of showing how much variability exists for commute time on a given day. The commute can vary wildly depending on the day when the sample was collected, but every data point is accurate. 2. Variability. Here, students can see the variability in commuting time. I think this example is e...
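The distinction between individual days (the dots) and a true sampling distribution of the sample mean can be sketched in a few lines of Python. Everything here is an assumption for illustration: the commute times are simulated from a normal distribution with made-up parameters (35 minutes, SD of 8), and the sample size of 30 days is arbitrary. None of it comes from the Katz and Quealy data.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical commute times in minutes, one per day (made-up numbers).
# Each value plays the role of one dot in a bee swarm plot.
daily_times = [rng.gauss(35, 8) for _ in range(480)]

# Sampling distribution of the sample mean: repeatedly draw 30 days
# without replacement and record the mean of each draw.
sample_means = [mean(rng.sample(daily_times, 30)) for _ in range(1000)]

print(f"SD of individual days: {stdev(daily_times):.2f}")
print(f"SD of sample means:    {stdev(sample_means):.2f}")
```

The sample means cluster much more tightly than the individual days do, which is the point students often miss: a single day's commute varies a lot, while the average of many days varies far less.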

Crash Course: Statistics

The Crash Course website produces brief, informative videos. They are a mix of animation and live action, and cover an array of topics, including statistics. This one is all about measures of central tendency: Here is the listing under their #statistics tag, which includes videos about correlation/causation, data visualization, and variability. And, you know what? This is just a super cool web site, full stop. Here are all of their psychology videos.

Why range is a lousy measure of variability

Nate Silver and Allison McCann's "How to Tell Someone’s Age When All You Know Is Her Name"

Nate Silver and Allison McCann (reporting for FiveThirtyEight) created graphs displaying baby name popularity over time. The data and graphs can be used to illustrate bimodality, variability, medians, interquartile range, and percentiles. For example, the pattern of popularity for the name Violet illustrates bimodality and shows why measures of central tendency are incomplete descriptors of data sets: "Other names have unusual distributions. What if you know a woman — or a girl — named Violet? The median living Violet is 47 years old. However, you’d be mistaken in assuming that a given Violet is middle-aged. Instead, a quarter of Violets are older than 78, while another quarter are younger than 4. Only about 4 percent of Violets are within five years of 47." Relatedly, bimodality (resulting from the current trend of giving classic, old-lady names to baby girls) can result in massive variability for some names... ...versus trendy baby names th...
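The Violet figures quoted above (median 47, quartiles at 4 and 78, about 4 percent within five years of the median) can be turned into a quick quartile exercise. The sketch below uses a toy sample of 100 ages that I constructed so its summary statistics match the quoted numbers. It is not the FiveThirtyEight data, and the individual ages are invented.

```python
from statistics import median, quantiles

# Hypothetical toy sample of 100 "Violets", built to reproduce the quoted
# summary figures (not the real data): two big clusters and a nearly empty middle.
ages = [2] * 25 + [10] * 23 + [47] * 4 + [72] * 23 + [80] * 25

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = quantiles(ages, n=4)
iqr = q3 - q1

# Share of the sample within five years of the median.
mid = median(ages)
near_median = sum(1 for age in ages if abs(age - mid) <= 5) / len(ages)

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
print(f"interquartile range = {iqr}")
print(f"share within 5 years of the median = {near_median:.0%}")
```

An interquartile range of 74 years around a median of 47 tells students right away that the "typical" Violet is rare, which the median on its own never would.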