Showing posts with the label interactive

Interactive NYC commuting data illustrates distribution of the sampling mean, median

Josh Katz and Kevin Quealy put together a cool interactive website to help users better understand their NYC commute. With the creation of this website, they are also helping statistics instructors illustrate a number of basic statistics lessons. To use the website, select two stations... The website returns a bee swarm plot, where each dot represents one day's commuting time over a 16-month sample. So, handy for NYC commuters, but also statistics instructors. How to use in class: 1. Conceptual demonstration of the sampling distribution of the sample mean. To be clear, each dot doesn't represent the mean of a sample. However, I think this still does a good job of showing how much variability exists for commute time on a given day. The commute can vary wildly depending on the day when the sample was collected, but every data point is accurate. 2. Variability. Here, students can see the variability in commuting time. I think this example is e...

Snake Oil Superfoods by InformationIsBeautiful

In my stats classes, we discuss popular claims that have been proven/disproven by research. So, learning styles. Vitamins. One topic we dig into is the wide array of claims made about the health benefits of different foods and folk beliefs about nutrition. But how to get into it? That is such a big field, looking at different foods used for different conditions. Send your students to InformationIsBeautiful's Snake Oil Superfoods, which sorted through all of the good studies and created an interactive data viz to summarize. For instance, these are three foods, backed by science, for very specific issues: BUT GET THIS: If you scroll over any of them, you get a quick summary of the findings AND a link to the research article. See below for Oats. NOICE. The information isn't limited to slam dunks, either; it fleshes out promising foods and weak links as well. AND...this is great...below the visualization there is all sorts of information on their methodo...

McBee's "Sampling distribution under H0 and critical values"

I think that interactive visualizations are better than lengthy, wordy textbooks when it comes to illustrating statistical principles. One little GIF or interactive website can do a far better job than text or words. For example: Everything that Kristoffer Magnusson has given us (effect sizes, correlations, etc.). Here is a new tool for explaining critical regions in Intro Stats. Matthew McBee created an interactive in shinyapps that shows how critical regions change depending on a) the test, b) the sample size, and c) the shape of the distribution. With the ol' t-test, you can show how the critical values move around with degrees of freedom. What your t-test critical values look like at df = 3... ...versus how those critical values look at df = 80. Also, you can do the same thing but with F curves. Andplusalso: Matt has also created shiny apps to adjust p-values for multiple comparisons, AND another one for calculating p-values based on a test statistics ...
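If you want numbers to go with the df = 3 versus df = 80 comparison, here is a minimal Python sketch. It is not McBee's code; it is just one way to get two-tailed t critical values using only the standard library. The function names (t_pdf, t_area_0_to, t_critical), the alpha of .05, and the choice of Simpson's rule plus bisection are all my own assumptions for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_area_0_to(x, df, steps=2000):
    """Area under the t density between 0 and x (Simpson's rule, even steps)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3

def t_critical(df, alpha=0.05):
    """Two-tailed critical value: the t that leaves alpha/2 in each tail."""
    target = 0.5 - alpha / 2  # area between 0 and the critical value
    lo, hi = 0.0, 200.0       # assumed search bounds, wide enough for small df
    for _ in range(50):       # bisection
        mid = (lo + hi) / 2
        if t_area_0_to(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

crit_df3 = t_critical(3)
crit_df80 = t_critical(80)
print("df = 3:  +/- %.3f" % crit_df3)
print("df = 80: +/- %.3f" % crit_df80)
```

The cutoff at df = 3 sits much farther from zero than the cutoff at df = 80, which is the same shift students see when they move the degrees of freedom in the app.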

Chase's "How does rent compare to income in each US metropolitan area?"

Positive, interactive linear relationships, y'all. Chase, of Overflow Data, created a scatter plot that finds that as income goes up, so does rent. Pretty intuitive, right? I think intuitive examples are good for students. Cursor over the dots to see what metro area each dot represents, or use the search function to find your locale and personalize the lesson a wee bit for your students.

Mathsisfun.com's Least Squared Error calculator

Mathsisfun.com bills this as a Least Squared Error calculator, but I don't think it is a calculator. I think it is a nice visual aid that demonstrates how the regression line/equation changes as your data changes. The static photo below doesn't do this interactive website justice. You can drag and drop any of the dots on the scatter plot and watch as the regression line and regression line equation are recalculated to best predict Y based on X. It doesn't explicitly show the math going on behind the scenes, but it is a nice complement to your LSE lecture. https://www.mathsisfun.com/data/least-squares-calculator.html
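Since the website doesn't show the math, here is a minimal Python sketch of what "recalculating" the line could look like. This is the textbook least squares formula for one predictor, not the website's own code, and the data points and the least_squares function name are made up for illustration.

```python
def least_squares(points):
    """Return (slope, intercept) of the line minimizing squared vertical error."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up dots, not taken from the website.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
before = least_squares(points)

# "Drag" the last dot straight down and refit, like on the website.
points[-1] = (5, 1)
after = least_squares(points)

print("before: slope=%.2f intercept=%.2f" % before)
print("after:  slope=%.2f intercept=%.2f" % after)
```

Moving a single dot at the edge of the X range is enough to flip the slope from positive to negative in this toy data, which is a handy talking point about influential points.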

Dozens of interactive stats demos from @artofstat

This website is associated with Agresti, Franklin, and Klingenberg's text Statistics, The Art and Science of Learning from Data (@artofstat), and there are dozens of great interactives to share with your statistics students. Similar and useful interactives exist elsewhere, but it is nice to have such a thorough, one-stop-shop of great visuals. Below, I have included screengrabs of two of their interactive tools. They also explain chi-square distributions, central limit theorem, exploratory data analysis, multivariate relationships, etc. This interactive about linear regression lets you put in your own dots in the scatter plot, and returns descriptive data and the regression line, https://istats.shinyapps.io/ExploreLinReg/. Show the difference between two populations (of your own creation), https://istats.shinyapps.io/2sample_mean/

"Draw My Data" and a bunch of other stuff for teaching correlation.

Robert Grant's website Draw My Data provides you with a blank scatter plot graph. You add your dots, and the website generates M and SD for your X and Y, as well as r for the relationship between X and Y. It even generates a data set for download. My Twitter handle, @notawful, has an r of -.485. Via http://robertgrantstats.co.uk/drawmydata.html Great for illustrating a specific kind of relationship (positive, negative, etc.) to your students. Also allows for much goofiness, like Alberto Cairo, who plotted a T-rex and went viral. And then the T-rex plot, and a bunch of other plots, were used to create an animated, updated version of Anscombe's Quartet. And that was presented at a conference by Matejka & Fitzmaurice. https://www.autodeskresearch.com/publications/samestats So, lots of stats goodness here. You can let your students play with Draw My Data or use that website to generate data sets for use in class. You can also use the dino data to illustr...
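If students download a data set from a tool like this, they can check the M, SD, and r by hand. Below is a minimal Python sketch of those calculations. The xs and ys values are invented for illustration (they are not a Draw My Data export), and pearson_r is my own helper name.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Invented dots that drift downward from left to right.
xs = [1, 2, 3, 4, 5, 6]
ys = [9, 7, 8, 4, 5, 2]

m_x, sd_x = statistics.mean(xs), statistics.stdev(xs)
m_y, sd_y = statistics.mean(ys), statistics.stdev(ys)
r = pearson_r(xs, ys)

print("X: M=%.2f SD=%.2f" % (m_x, sd_x))
print("Y: M=%.2f SD=%.2f" % (m_y, sd_y))
print("r=%.3f" % r)
```

Note that statistics.stdev is the sample SD (n - 1 in the denominator); I am assuming that is what you would want for class data, and it may or may not match what the website reports.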

Climate Central's The First Frost is Coming Later

So, this checks off a couple of my favorite requisites for a good teaching example: You can personalize it, it is contemporary and applicable, and it illustrates a few different sorts of statistics. Climate Central wrote this article about first frost dates, and how those dates, and an increasing number of frost-free days, create longer growing seasons. The overall article is about how much less frosty the US is becoming as the Earth warms. They provide data about the first frost in a number of US cities. It even lists my childhood hometown of Altoona, PA, so I think there is a pretty large selection of cities to choose from. Below, I've included the screen grab for my current home, and the home of Gannon University, Erie, PA. The first frost date is illustrated with a line chart, but the chart also includes the regression line. Data for frosty, chilly Erie, PA. The article also presents a chart that shows how frost is related to the length of the growing season in t...

Yau's "Divorce and Occupation"

Nathan Yau, writing for Flowing Data, provides a good example of correlation, median, and correlation not equaling causation in his story, "Divorce and Occupation". Yau looked at the relationship between occupation and divorce in a few ways. He used a variation upon the violin plot to illustrate how each occupation's divorce rate falls around the median divorce rate. Who has the lowest rate? Actuaries. They really do know how to mitigate risk. You could also discuss why median divorce rate is provided instead of mean divorce rate. Again, the actuaries deserve attention as they probably would throw off the mean. https://flowingdata.com/2017/07/25/divorce-and-occupation/ He also looked at how salary was related to divorce, and this can be used as a good example of a linear relationship: The more money you make, the lower your chances for divorce. And an intuitive exception to that trend? Clergy members. https://flowingdata.com/2017/07/25/divorce...

Sonnad and Collin's "10,000 words ranked according to their Trumpiness"

I finally have an example of Spearman's rank correlation to share. This is a political example, looking at how Twitter language usage differs in US counties based upon the proportion of votes that Trump received. This example was created by Jack Grieve, a linguist who uses archival Twitter data to study how we speak. Previously, I blogged about his work that analyzed what kind of obscenities are used in different zip codes in the US. And he created maps of his findings, and the maps are color coded by the z-score for frequency of each word. So, z-score example. Southerners really like to say "damn". On Twitter, at least. But on to the Spearman's example. More recently, he conducted a similar analysis, this time looking for trends in word usage based on the proportion of votes Trump received in each county in the US. NOTE: The screen shots below don't do justice to the interactive graph. You can cursor over any dot to view the word as well as the cor...
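For students who have only seen Pearson's r, a quick way to explain Spearman's is "Pearson's r computed on the ranks". Here is a minimal Python sketch of that idea. The vote shares and word frequencies are invented numbers, not data from the article, and the helper names are my own.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r on the ranked data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Invented example: county vote share and how often a word gets tweeted.
vote_share = [0.20, 0.35, 0.50, 0.65, 0.80]
word_freq = [1, 2, 4, 8, 100]  # always rising, but not in a straight line

rho = spearman_rho(vote_share, word_freq)
r = pearson_r(vote_share, word_freq)
print("Spearman rho=%.3f, Pearson r=%.3f" % (rho, r))
```

Because the made-up word frequency always rises with vote share, the ranks line up perfectly even though the raw relationship is far from a straight line, so Spearman's comes out stronger than Pearson's here.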

Chris Wilson's "The Ultimate Harry Potter Quiz: Find Out Which House You Truly Belong In"

Full disclosure: I have no chill when it comes to Harry Potter. Despite my great bias, I still think this psychometrically-created (with help from psychologists and Time Magazine's Chris Wilson!) Hogwarts House Sorter is a great example for scale building, validity, descriptive statistics, electronic consent, etc. for stats and research methods. How to use in a Research Methods class: 1) The article details how the test drew upon the Big Five inventory. And it talks smack about the Myers-Briggs. 2) The article also uses simple language to give a rough sketch of how they used statistics to pair you with your house. The "standard statistical model" is a regression line, the "affinity for each House is measured independently", etc. While you are taking the quiz itself, there are some RM/statsy lessons: 3) At the end of the quiz, you are asked to contribute some more information. It is a great example of a leading response options ...

Daniels' "Most timeless songs of all time"

This article, written by Matt Daniels for The Pudding, allows you to play around with a whole bunch of Spotify user data in order to generate visualizations of song popularity over time. You can generate custom visualizations using the very interactive sections on this website. For instance, there is a special visualization that allows you to finally quantify the Biggie/Tupac Rivalry. So, data and pop culture are my two favorite things. I could play with these different interactive pieces all day long. But there are also some specific ways you could use this in class. 1) Generate unique descriptive data for different musicians and then ask your students to create visualizations using the software of your choosing. Below, I've queried Dixie Chicks play data. Students could enter their own favorite artist. Note: The data only runs through 2005. 2) Sampling errors: Here is a description of the methodology used for this data: Is this representative of all data...

Our World in Data website

Our World in Data is an impressive, Creative Commons-licensed site managed by Max Roser. And it lives up to its name. The website provides all kinds of international data, divided by country, topic (population, health, food, growth & inequality, work, and life, etc.), and, when available, year. It contains its own proprietary data visualizations, which typically feature international data for a topic. You can customize these visualizations by nation. You can also DOWNLOAD THE DATA that has been visualized for use in the classroom. Much of the data can be visualized as a map and progress, year by year, through the data, like this data on international human rights. https://ourworldindata.org/human-rights/ There are also plenty of topics of interest to psychologists who aren't teaching statistics. For example, international data on suicide: Data for psychology courses...https://ourworldindata.org/suicide/ Work...

Wilson's "Find Out What Your British Name Would Be"

Students love personalized, interactive stuff. This website from Chris Wilson over at Time allows your American students to enter their name and they receive their British statistical doppelganger name in return. Or vice versa. And by statistical doppelganger, I mean that the author sorted through name popularity databases in the UK and America. He then used a Least Squared Error model in order to find strong linear relationships for popularity over time between names. How to use in class: linear relationships, LSE, trends over time.

Johnson & Wilson's The 13 High-Paying Jobs You Don’t Want to Have

This is a lot of I/O and personality, and a little bit of stats. But it does demonstrate correlation and percentiles, and it is interactive. For this article from Time, Johnson and Wilson used participant scores on a very popular vocational selection tool, the Holland Inventory (sometimes called the RIASEC), and participant salary information to see if there is a strong relationship between salary and personality-job fit. There is not. How to use in class: -Show your students what a weak correlation looks like when expressed via scatter plot. Seriously. I spend a lot of time looking for examples for teaching statistics. And there are all sorts of significant positive and negative correlation examples out there. But good examples of non-relationships are a lot rarer. -If you teach I/O, this fits nicely into personality-job fit lecture. If you don't teach I/O but are a psychologist, this still applies to your field and may introduce your students to the field of I/O. ...

Pew Research's "The art and science of the scatterplot"

Sometimes, we need to convince our students that taking a statistics class changes the way they think for the better. This example demonstrates that one seemingly simple skill, interpreting a scatter plot, is tougher than it seems. Pew Research conducted a survey on scientific thinking in America (here is a link to that survey) and they found that only 63% of American adults can correctly interpret the linear relationship illustrated in the scatter plot below. And that 63% came out of a survey with multiple-choice responses! How to use in class: -Show your students that a major data collection/survey firm decided that interpreting statistics was an appropriate question on their ten-item quiz of scientific literacy. -Show your students that many randomly selected Americans can't interpret a scatter plot correctly. And for us instructors: -Maybe a seemingly simple task like the one in this survey isn't as intuitive as we think it is!

Pew Research's "Growing Ideological Consistency"

This interactive tool from Pew Research illustrates left and right skew as well as median and longitudinal data. The x-axis indicates how politically consistent (as determined by a survey of political issues) self-identified Republicans and Democrats are across time. Press the button and you can animate data, or cut up the data so you only see one party or only the most politically active Americans. http://www.people-press.org/2014/06/12/section-1-growing-ideological-consistency/#interactive The data for both political parties go from being normally distributed in 1994 to skewed by 2014. And you can watch what happens to the median as the political winds change (and perhaps remind your students as to why mean would be the less desirable measure of central tendency for this example). I think it is interesting to see the relative unity in political thought (as demonstrated by more Republicans and Democrats indicating mixed political opinions) in the wake of 9/11 but more politicall...

Kristoffer Magnusson's "Interpreting Confidence Intervals"

I have shared Kristoffer Magnusson's fantastic visualizations of statistical concepts here previously (correlation, Cohen's d). Here is another one that helps to explain confidence intervals, and how the likelihood of an interval containing true mu varies based on interval size as well as the size of the underlying sample. The site is interactive in two ways. 1) The sliding bar at the top of the page allows you to adjust the size of the confidence interval, which you can read in the portion of the page labeled "CI coverage %" or directly above the CI ticker. See below. 2) You can also change the n-size for the samples the simulation is pulling. The site also reports back the number of samples that include mu and the number of samples that miss mu (wee little example for Type I/Type II error). How to use it in class: Students will see how intervals increase and decrease in size as you reset the CI percentage. As the sample size increases, the range ...
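If you want students to run a bare-bones version of this kind of simulation themselves, here is a minimal Python sketch. It is not Magnusson's code. To keep it short I am assuming a normal population with a known sigma (so a z interval rather than a t interval), and the mu of 100, sigma of 15, 2,000 trials, and the coverage function name are all made up for illustration.

```python
import random
from statistics import NormalDist

def coverage(n, confidence, trials=2000, mu=100.0, sigma=15.0, seed=1):
    """Simulate many samples of size n; return (share of CIs containing mu, CI width)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / n ** 0.5  # known-sigma z interval (an assumption)
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(sample_mean - mu) <= half_width:
            hits += 1  # this sample's interval contains mu
    return hits / trials, 2 * half_width

cov_95, width_95 = coverage(n=25, confidence=0.95)
cov_80, width_80 = coverage(n=25, confidence=0.80)
cov_big_n, width_big_n = coverage(n=100, confidence=0.95)

print("n=25,  95%% CI: coverage=%.3f width=%.2f" % (cov_95, width_95))
print("n=25,  80%% CI: coverage=%.3f width=%.2f" % (cov_80, width_80))
print("n=100, 95%% CI: coverage=%.3f width=%.2f" % (cov_big_n, width_big_n))
```

Students can change n and the confidence level and watch two separate things: the share of intervals that capture mu tracks the confidence level, while the width of each interval depends on both the confidence level and the sample size.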

Matt, Rali & Rhonda's Statistical Test Flowchart.

Take a look at this interactive, statistical decision making flow chart. I think that almost every statistics text includes a flow chart, but the interactive piece of this, and its ability to immediately provide the reader with information on the appropriate analysis AND software assistance, is something your students can't get from paper versions of the same. The flow chart is based on Andy Field's work. I discovered this tool via Reddit. I'm including that Reddit thread because the person that created the thread (commentor4) states that they also created the flow chart. So, you are led through a series of questions (read this from the bottom up). After you provide the necessary information, the page provides you with a quick definition of the test you should conduct as well as links to instruction using popular statistical packages.

Wilson's "America’s Mood Map: An Interactive Guide to the United States of Attitude"

Here is a great example of several different topics, featuring an engaging, interactive map created by Time magazine AND using data from a Journal of Personality and Social Psychology article. Essentially, the authors of the original article gave the Big Five personality scale to folks all over the US. They broke down the results by state. Then Time created an interactive map of the US in order to display the data. http://time.com/7612/americas-mood-map-an-interactive-guide-to-the-united-states-of-attitude/ How to use in class: