Posts

Showing posts with the label scatter plots

Our World in Data's deep dive into human height. Examples abound.

Stats nerds: I'm warning you right now. This website is a rabbit hole for us, what with the interactive, customizable data visualizations. Please don't click on the links below if you need to grade or be with your kids or drive. At a recent conference presentation, I was asked where non-Americans can find examples like the ones I share on my blog. I had a few ideas (data analytics firms located in other countries, data collected by the government), but wanted more from my answer. BUT...I recently discovered this interactive from Our World in Data. It visualizes international data on human height, y'all, with so many different examples throughout. I know height data isn't the sexiest data, but your students can follow these examples, they can be used in a variety of different lessons, and you can download all of the data from the beautiful interactive charts. 1. Regressions can't predict forever. Trends plateau. I'm using this graph as an example of how a r...

chartr's "Speed or Accuracy? It's hard to do both in fast food drive-thrus"

Sometimes, you just need a new, simple example for a homework question or a class warm-up. I eyeballed and entered the data here (r = -.55). Enjoy. I use this little example to explain how to use the regression formula to make a prediction. Here are my slides.
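If you want a quick way to check students' hand calculations, the prediction step can be sketched in Python. This is only a sketch: the r = -.55 comes from the post, but the means and standard deviations below are made-up placeholders (not the drive-thru data), and the function name `predict_from_summary` is hypothetical. Swap in the summary statistics from the actual data.

```python
def predict_from_summary(r, mean_x, sd_x, mean_y, sd_y, x):
    """Predict y for a given x from the simple regression line.

    The line is built from summary statistics only:
        slope     = r * (sd_y / sd_x)
        intercept = mean_y - slope * mean_x
    """
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept + slope * x


R = -0.55  # correlation reported in the post

# ASSUMPTION: placeholder summary statistics, invented for illustration.
MEAN_X, SD_X = 300.0, 50.0  # e.g., speed of service in seconds
MEAN_Y, SD_Y = 85.0, 5.0    # e.g., order accuracy in percent

# Predict y for a case that sits one standard deviation above the mean on x.
predicted = predict_from_summary(R, MEAN_X, SD_X, MEAN_Y, SD_Y, x=350.0)
print(f"Predicted y: {predicted:.2f}")
```

With a negative r, a case one standard deviation above the mean on x is predicted to fall 0.55 standard deviations below the mean on y, which is a nice way to connect the formula back to the meaning of r.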

Correlation example: Taco Bell and mortality by state...don't run for the border!

Many thanks to my colleague, Andrew Caswell, for sharing this Reddit post with me: https://www.reddit.com/r/dataisbeautiful/comments/s75sm7/oc_us_life_expectancy_vs_of_taco_bell_locations/ So, this alone is an excellent example of correlation and the third variable problem. But...more delightfully, the Redditor who created this graph also shared where he found this data (https://www.nicerx.com/fast-food-capitals/, https://worldpopulationreview.com/state-rankings/life-expectancy-by-state). BETTER STILL: I downloaded and organized all of the fast-food data and mortality data and put it in one spreadsheet for you all. Do All The Correlations! Teach your students about Bonferroni corrections! Figure out the fast-food restaurant that correlates the most strongly with mortality! PS: Did you know that there is an option to download data from a website in Excel? The fast-food data was presented in an embedded, scrolly table, and that Excel option made it easy-peasy to do...
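If your students do run All The Correlations, the Bonferroni step itself is short enough to show in a few lines of Python. This is a minimal sketch, not an analysis of the spreadsheet: the chain labels and p-values are invented placeholders, and `bonferroni_significant` is a hypothetical helper name.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which tests stay significant after a Bonferroni correction.

    Each p-value is compared with alpha divided by the number of tests,
    which keeps the family-wise error rate at or below alpha.
    """
    n_tests = len(p_values)
    threshold = alpha / n_tests
    return {name: p <= threshold for name, p in p_values.items()}


# ASSUMPTION: invented p-values for three correlations, for illustration only.
example_p_values = {
    "chain_a": 0.004,
    "chain_b": 0.030,
    "chain_c": 0.200,
}

results = bonferroni_significant(example_p_values, alpha=0.05)
for name, is_significant in results.items():
    print(name, "significant" if is_significant else "not significant")
```

Here the corrected threshold is .05 / 3, so a p-value of .03 would pass the usual .05 cutoff but not the corrected one. That contrast is the whole lesson.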

Flowingdata's Car Costs vs. Emissions story

FlowingData shared some interesting info on how much cars cost versus their environmental footprint. TL;DR: Low emission cars tend to be cheaper in the long run. Hooray for the free market! The data is also available via the New York Times, along with a much more in-depth conversation about the actual cost of high/low emission cars, but it is behind a paywall. The original data, presented in a fun (nerd fun) interactive website, is available here. How to use it in class: 1) It's a correlation! Each car model is a dot with two related data points: Average cost per month and average carbon dioxide emissions per mile. 2) It's Simpson's Paradox! Note how electric cars (yellow cloud) all have similar emissions, but the average cost per month varies. Same for diesel cars. Overall, you still see the positive correlation in the data, but if you break it down by class of car, the correlation isn't present for every level.
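To let students see that overall-versus-within-group pattern for themselves, here is a small Python sketch. The numbers are toy values invented so the arithmetic is easy to follow. They are not the car data, and the group labels and the `pearson_r` helper are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# ASSUMPTION: toy data, invented for illustration. Each group is a list of
# (emissions, cost) pairs for a hypothetical class of car.
groups = {
    "class_1": [(1, 3), (2, 1), (3, 2), (4, 1), (5, 3)],
    "class_2": [(11, 13), (12, 11), (13, 12), (14, 11), (15, 13)],
}

# Correlation within each class of car.
within = {
    name: pearson_r([x for x, _ in pairs], [y for _, y in pairs])
    for name, pairs in groups.items()
}

# Correlation when every car is pooled into one scatter plot.
pooled = [pair for pairs in groups.values() for pair in pairs]
overall = pearson_r([x for x, _ in pooled], [y for _, y in pooled])

print("Within groups:", within)
print("Overall:", round(overall, 3))
```

In this toy set, the pooled correlation is strongly positive even though neither class shows any linear relationship on its own. The overall trend comes entirely from the gap between the two clusters.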

NYT's "Is It Safer to Visit a Coffee Shop or a Gym?"

Katherine Baicker, Oeindrila Dube, Sendhil Mullainathan, Devin Pope, and Gus Wezerek created an interactive, data-driven piece for NYT. It provides a new perspective on how we should proceed with re-opening businesses during the COVID-19 pandemic. They argue that we must consider 1) how long people linger in different types of stores, 2) how often they visit these stores, 3) the square footage of the stores, and 4) the amount of human interaction/surface contact associated with how we shop at different stores. How to use this in class: 1) Show your students how data can inform real-life problems. Or crises, like how to safely re-open stores during COVID-19. 2) Show your students how data can be used in creative ways to solve problems. The present argument uses cellphone location data. 3) Show your students data viz in real life: Here, scatterplots that really improve the #scicomm potential of this piece. 4) Show your students the rese...

Great Tweets about Statistics

I've shared these on my Twitter feed, and in a previous blog post dedicated to stats funnies. However, I decided it would be useful to have a dedicated, occasionally updated blog post devoted to Twitter Statistics Comedy Gold. How to use in class? If your students get the joke, they get a stats concept. *Aside: I know I could have embedded these Tweets, but I decided to make my life easier by using screenshots. How NOT to write a response option. Real life inter-rater reliability. Scale Development. Alright, technically not Twitter, but I am thrilled to make an exception for this clever, clever costume: This whole thread is awesome... https://twitter.com/EmpiricalDave/status/1067941351478710272 Randomness is tricky! And not random! ...