Interactive NYC commuting data illustrates the sampling distribution of the mean and median

Josh Katz and Kevin Quealy put together a cool interactive website to help users better understand their NYC commute. With the creation of this website, they also are helping statistics instructors illustrate a number of basic statistics lessons.

To use the website, select two stations...


The website returns a bee swarm plot, where each dot represents one day's commuting time over a 16-month sample. 


So, it's handy for NYC commuters, but also for statistics instructors. Here is how to use it in class:

1. Conceptual demonstration of the sampling distribution of the sample mean. To be clear, each dot doesn't represent the mean of a sample. However, I think this still does a good job of showing how much variability exists for commute time on a given day. The commute can vary wildly depending on the day when the sample was collected, but every data point is accurate. 

2. Variability. Here, students can see the variability in commuting time. I think this example is especially useful because everyone can relate to having an unexpectedly short or long commute, and the bee swarm plot does a great job of visualizing it.

3. Distribution shapes. You can ask your students to search through different commutes and look for bee swarm plots that are normally distributed, right-skewed, left-skewed, or uniform.

4. Introduce your students to the bee swarm plot. Seriously, guys, stop teaching your students about stem and leaf plots and start teaching your students to interpret increasingly popular charts and graphs that do a great job of representing ALL data points.

5. Percentiles. For each route, you receive "Bad Day" and "Good Day" times. The authors defined the "Bad Days" as: 

1/20, or 5 percent of the time. Sounds like the 95th percentile to me. It is also worth noting that the writers went with the easier-to-understand 1/20 instead of listing a percentile, probably because they wrote this piece for a popular source.


6. Confidence intervals in action. Note how the commute time grows as you decrease your tolerance for being late, demonstrating the CI tension between accuracy and precision.
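
If you want a take-home version of lessons 1 and 5, here is a minimal Python sketch. To be clear, it does not use the website's data: the commute times are simulated, and the 30-minute baseline, the right-skewed delay, the 480 days, and the sample size of 20 are all assumptions I made up for illustration. It finds a "Bad Day" cutoff as the 95th percentile, then repeatedly draws small samples of days to show that sample means and medians vary much less than individual days do.

```python
import random
import statistics

random.seed(0)  # fixed seed so students get the same numbers each run

# Hypothetical commute times in minutes (NOT the website's data):
# a 30-minute baseline plus a right-skewed random delay, one value per day.
days = [30 + random.expovariate(1 / 5) for _ in range(480)]

# Lesson 5: a "Bad Day" as the 95th percentile.
# quantiles(n=20) returns 19 cut points (5%, 10%, ..., 95%); the last is the 95th.
bad_day = statistics.quantiles(days, n=20)[-1]
share_slower = sum(d > bad_day for d in days) / len(days)
print(f"Bad day cutoff: {bad_day:.1f} min; share of slower days: {share_slower:.3f}")

# Lesson 1: sampling distributions of the mean and the median.
# Each repetition draws 20 days without replacement and summarizes them.
sample_means = []
sample_medians = []
for _ in range(1000):
    sample = random.sample(days, 20)
    sample_means.append(statistics.mean(sample))
    sample_medians.append(statistics.median(sample))

# Individual days vary a lot; means and medians of samples vary much less.
print(f"SD of single days:     {statistics.stdev(days):.2f}")
print(f"SD of sample means:    {statistics.stdev(sample_means):.2f}")
print(f"SD of sample medians:  {statistics.stdev(sample_medians):.2f}")
```

Students can change the sample size from 20 to 5 or 100 and watch the spread of the sample means widen or shrink, which ties the single-day dots on the bee swarm plot back to the idea of a sampling distribution.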


