Posts

Showing posts with the label interactive

Data USA

Data USA draws upon various federal data sources to generate visualizations about cities and occupations in the US, and it provides lots of good examples of simple descriptive statistics and data visualizations. The website is highly interactive, and you can query information about any municipality in the US, which creates relevant, customized examples for your class. You can present examples of descriptive statistics using the town or city in which your college/university/high school is located, or you could encourage students to look up their own hometowns. Data provided includes job trends, crime, health care, commuting times, car ownership rates...in short, all sorts of data. Below I have included some screen shots for data about Erie, PA, home of Gannon University: The background photo here is from Presque Isle, a very popular state park in Erie, PA. And, look, medians!

Rich, Cox, and Bloch's "Money, Race and Success: How Your School District Compares"

If you are familiar with the financial and racial disparities that exist in the US, you can probably guess where this article is going based on its title. Kids in wealthy school districts do better in school than kids in poor districts. Within each school district, white kids do better than African American and Latino kids. How did they get to this conclusion? For every school district in the US, the researchers used the Stanford Educational Data Archive to figure out 1) the median household income within each school district and 2) the grade level at which the students in each school district perform (based on federal test performance). This piece also provides multiple examples for use within the statistics classroom. Highly sensitive examples, but good examples nonetheless. Most obviously, this data provides an easy-to-follow example of linear relationships and correlations. The SES:school performance relationship is fairly intuitive and easy to follow (see below). From the New Yor...

Ben Schmidt's Gendered Language in Teacher Reviews

'Tis the season for end-of-semester teaching evaluations. And Ben Schmidt has created an interactive tool that demonstrates gender differences in these evaluations. Enter a word, and Schmidt's tool returns how frequently the word is used in Rate My Professors evaluations, divided up by gender and academic discipline. Spoiler alert: Men get higher ratings for most positive attributes...while women get higher ratings for negative attributes. Out of class, you can use this example to feel sad, especially if you are a female professor and up for tenure. In class, this leads to obvious discussions about gender and perception and interpersonal judgments. You can also use it to discuss why the x- and y-axes were chosen. You can discuss the archival data analysis used to generate these charts. You can discuss data mining. You can discuss content analysis. You can also discuss between-group differences (gender) versus within-group differences (acade...

Pew Research Center's "The strong relationship between per capita income and internet access, smartphone ownership"

This finding is super-duper intuitive: A positive, strong correlation exists between national per capita income and rates of internet access and smartphone ownership within that nation. Because it is intuitive, it makes a good example for your class when you teach correlation to your baby statisticians. This graph is more engaging than your average graph because the good people at Pew made it interactive. You can see which country is represented by which dot. You can also see regional trends, as the countries are color-coded by continent/region. For more context and information on this survey, see this more extensive report on the relationship between smartphone/internet access and economic advancement. This report further breaks down technology usage by education level, age, individual income, etc. This data is also useful for demonstrating the distribution of wealth in the world and the variability that exists among countries in the same region/on the same continent.

Dr. Mages' "APA Exposed: Everything you wanted to know about APA formatting but were afraid to ask."

Teaching undergraduates APA style is not fun. It is not fun for teachers. It is not fun for students. However, I think that the more tools that we, the teachers, have to convey the rules of APA style, the more likely we are to find something that finally sticks for our students. This week, I offer one such tool created by Dr. Wendy K. Mages. Dr. Mages created an online, self-paced, free presentation that teaches the essentials of APA style. Lessons are presented in a PowerPoint-esque format with a voice-over (as well as a transcript). I like that Dr. Mages includes some of her own experiences grading students' papers in order to keep current students from making the frequent mistakes that she has encountered. She also offers plenty of original examples and uses appropriate PowerPoint animations/highlighting to engage the viewer.

"Guess the Correlation" game

Found this gem, "Guess the Correlation", via the subreddit r/statistics. The redditor who posted this resource (ow241) appears to be the creator of the website. Essentially, you view different scatter plots and try to guess r. Points are awarded or taken away based on how close you are to the true r. The game tallies your average amount of error as well. It is way more addictive than it sounds. I think that accuracy increases with time and experience. True r for this one was .49. I guessed .43, which isn't so bad. I think this is a good way for statistics instructors to procrastinate. I think it is also a good way to help your students build a more intuitive ability to read scatter plots and predict the strength of linear relationships.
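If you want students to see what sits behind each round, here is a minimal Python sketch that computes Pearson's r for a handful of points and then measures how far off a guess was. The data points are made up for illustration, and the absolute-difference error measure is my assumption, not necessarily how the game itself scores guesses.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def guess_error(true_r, guessed_r):
    """Absolute distance between a guess and the true r (assumed error measure)."""
    return abs(true_r - guessed_r)


# Hypothetical scatter plot points, not taken from the game.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

print(round(pearson_r(xs, ys), 2))        # r for the made-up points
print(round(guess_error(0.49, 0.43), 2))  # the round described above
```

Students could swap in their own small data sets, guess r first, and then check themselves.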

NFL.com's Football Freakonomics

EDIT: All of this content appears to have been removed from NFL.com. If anyone has any luck finding it, please email me at hartnett004@gannon.edu. The NFL and the statistics folks over at Freakonomics got together and made some...learning modules? Let's call them learning modules. They are interactive websites that teach users about very specific questions related to football (like home field advantage, instances when football player statistics don't tell the whole story about a player/team, whether or not firing a head coach improves a failing team, the effects of player injury on team success, etc.) and then answer these questions via statistics. Most of the modules include interactive tables, data, and videos (featuring the authors of Freakonomics) in order to delve into the issue at hand. For example: The Home Field Advantage: This module features a video, as well as an interesting interactive map that illustrates data about the exact sleep lost experienced by ...

Barry-Jester, Casselman, & Goldstein's "Should prison sentences be based on crimes that haven't been committed yet?"

This article describes how the Pennsylvania Department of Corrections is using risk assessment data to predict recidivism, with the hope of using such data to guide parole decisions in the future. So, using data to predict the future is very statsy, demonstrates multivariate modeling, and is a good example for class, full stop. However, this article also contains a cool interactive tool, entitled "Who Should Get Parole?", that you could use in class. It demonstrates how increasing/decreasing alpha and beta changes the likelihood of committing Type I and Type II errors. The tool allows users to manipulate the amount of risk they are willing to accept when making parole decisions. As you change the working definition of a "low" or "high" risk prisoner, a visualization will start up, and it shows you whether your parolees stay out of prison or come back. From a statistical perspective, users can adjust the definition of a low, medium, and h...

Barry-Jester's "What A Bar Graph Can Tell Us About The Legionnaires’ Outbreak In New York" + CDC learning module

Statistics aficionados over at FiveThirtyEight applied statistics (specifically, tools used by epidemiologists) to the summer 2015 outbreak of Legionnaires' disease in New York. This story can be used in class as a way of discussing how simple bar graphs can be modified to display important information about the spread of disease. This news story also includes a link to a learning module from the CDC. It takes the user through the process of creating an epi curve. Slides 1-8 describe the creation of the curve, and slides 9-14 ask questions and provide interactive feedback that reinforces the lesson about creating epi curves. Graphs are useful for conveying data, but even one of our staples, the bar graph, can be specialized to share information about the way that disease spreads. 1) Demonstrates statistics being used in a field that isn't explicitly statistics-y. 2) A little course online via the CDC for your students to learn to...

U.S. Holocaust Museum's "Deadly Medicine: Creating the Master Race" traveling exhibit

Alright. This teaching idea is pretty involved. It is bigger than any one instructor and requires interdepartmental effort as well as support from The Powers that Be at your university. The U.S. Holocaust Museum hosts a number of traveling exhibits. One in particular, "Deadly Medicine: Creating the Master Race", provides a great opportunity for the discussions of research ethics, the protection and treatment of human research subjects, and how science can be used to justify really horrible things. I am extraordinarily fortunate that Gannon University's Department of History (with assistance from our Honors program as well as College of the Humanities, Education, and Social Sciences) has worked hard to get this exhibit to our institution during the Fall 2015 semester. It is housed in our library through the end of October. How I used it in my class: My Honors Psychological Statistics class visited the exhibit prior to a discussion day about research ethics. In...

Aschwanden's "Science isn't broken, it is just a hell of a lot harder than we give it credit for"

Aschwanden (for fivethirtyeight.com) did an extensive piece that summarizes the data/p-hacking/what's-wrong-with-statistical-significance crisis in statistics. There is a focus on the social sciences, including some quotes from Brian Nosek regarding his replication work. The report also draws attention to Retraction Watch and the Center for Open Science as well as retractions of findings (as an indicator of fraud and data misuse). The article also describes our funny bias of sticking to early, big research findings even after those research findings are disproved (the example used here is the breakfast eating:weight loss relationship). The whole article could be used for a statistics or research methods class. I do think that the p-hacking interactive tool found in this report could be an especially useful illustration of How to Lie with Statistics. The "Hack your way to scientific glory" interactive piece demonstrates that if you fool around enough with your operationalized...

Kristopher Magnusson's "Understanding the t-distribution and its normal approximation"

Once again, Kristopher Magnusson has combined his computer programming and statistical knowledge to help illustrate statistical concepts. His latest interactive tool allows students to view the t-curve for different degrees of freedom. Additionally, students can view error rates associated with different degrees of freedom. Note that the critical region is one-tailed with alpha set at .05. If you cursor around the critical region, you can set the alpha to .025 to better illustrate a two-tailed test (in terms of the critical region at which we declare significance). Error rates when n < 30 Error rates when n > 30 This isn't the first time Kristopher's interactive tools have been featured on this blog! He has also created websites dedicated to explaining effect size, correlation, and other statistical concepts.
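For instructors who like to show the same idea with a few lines of code, here is a rough Python simulation. It is my own sketch, not Magnusson's code. It draws samples from a population where the null hypothesis is true, computes a one-sample t statistic, and counts how often that statistic lands past the normal-curve cutoff of 1.645 (one-tailed, alpha = .05). The function name, number of trials, and seed are arbitrary choices for illustration.

```python
import math
import random

Z_CRITICAL = 1.645  # one-tailed cutoff from the normal curve, alpha = .05


def false_positive_rate(n, trials=10000, seed=1):
    """Share of true-null samples whose t statistic exceeds the normal cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The population mean really is 0, so every rejection is a Type I error.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t_stat = mean / math.sqrt(variance / n)
        if t_stat > Z_CRITICAL:
            hits += 1
    return hits / trials


small_n_rate = false_positive_rate(5)    # df = 4
large_n_rate = false_positive_rate(100)  # df = 99

print(small_n_rate, large_n_rate)
```

With a small sample, the rejection rate should come out noticeably above .05, because the t-curve has fatter tails than the normal curve. With a large sample, the two curves nearly coincide and the rate should sit close to .05.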

Free online research ethics training

Back in the day, I remember having to complete an online research ethics course in order to serve as an undergraduate research assistant at Penn State. I think that such training could be used as an exercise/assessment in a research methods class or an advanced statistics class. NOTE: These examples are sponsored by American agencies and, thus, teach participants about American laws and rules. If you have information about similar training in other countries (or other free options for American researchers), please email me and I will add the link. Online Research Ethics Course from the U.S. Department of Health and Human Services' Office of Research Integrity. Features: Six different learning modules, each with a quiz and certificate of completion. These sections include separate quizzes on the treatment of human and animal test subjects. Other portions also address ethical relationships between PIs and RAs and broader issues of professional responsibility when reporting results. ...

TED talks about statistics and research methods

There are a number of TED talks that apply to research methods and statistics classes. First, there is this TED playlist entitled The Dark Side of Data . This one may not be applicable to a basic stats class but does address broader ethical issues of big data, widespread data collection, and data mining. These videos are also a good way of conveying how data collection (and, by extension, statistics) are a routine and invisible part of everyday life. This talk by Peter Donnelly discusses the use of statistics in court cases, and the importance of explaining statistics in a manner that laypeople can understand. I like this one as I teach my students how to create APA results sections for all of their statistical analyses. This video helps to explain WHY we need to learn to report statistics, not just perform statistics. Hans Rosling has a number of talks (and he has been mentioned previously on this blog, but bears being mentioned again). He is a physician and conveys his passion...

Chris Wilson's "Find out what your name would be if you were born today"

This little questionnaire will provide you with a) the ordinal value of your name for your sex/year of birth and then generate b) a bunch of other names from various decades that share your name's ordinal. Not the most complex example, but it does demonstrate ordinal data. Me and all the other 4th most popular names for women over the years. Additionally, this data is pulled from Social Security, which opens up the conversation for how we can use archival data for...not super important interactive thingies from Time Magazine? Also, you could pair up this example with other interactive ways of studying baby name data ( predicting a person's age if you know their name , illustrating different kinds of data distributions via baby name popularity trends ) in order to create a themed lesson that would correspond nicely to that first/second chapter of most undergraduate stats textbooks in which you learn about data distribution and different types of data.

Pew Research Center's "Major Gaps Between the Public, Scientists on Key Issues"

This report from Pew highlights the differences in opinion between the average American and members of the American Association for the Advancement of Science (AAAS). For various topics, this graph reports the percentage of average Americans or AAAS members that endorse each science-related issue as well as the gap between the two groups. Below, the yellow dots indicate the percentage of scientists that have a positive view of the issue and the blue dots indicate the same data for average Americans. If you click on any given issue, you see more detailed information on the data. In addition to the interactive data, this report by Funk and Rainie summarizes the main findings. You can also access the original report of this data (which contains additional information about public perception of the sciences and scientists). This could be a good tool for a research methods/statistics class in order to convince students that learning about the rigors of the scientif...

Khan Academy's #youcanlearnanything

Khan has been providing high-quality videos explaining...indeed...everything for a while now. Among everything are Probability and Statistics. Recently, they reorganized their content and added assessment tools as part of their #youcanlearnanything campaign in order to create self-paced lessons that are personalized to the user and include plenty of videos (of course) and personalized quizzes and feedback. 1) It requires the creation of a free account and selection of a learning topic (the screen shots below are from the Statistics and Probability course). 2) When you start a topic, you take a pre-test to assess your current level. This assessment covers simple chart reading, division, and multiplication required for more advanced topics. If you struggle with this, Khan provides you with more material to improve your understanding of these topics. 3) After you complete the assessment, you receive your lesson plan. It includes the topic you select plus an additional introductory ...

Chemi & Giorgi's "The Pay-for-Performance Myth"

UPDATE: The link listed below is currently not working. I've talked to Ariana Giorgi about this, and she is working to get her graph up and running again via Bloomberg. She was kind enough to provide me with alternate URLs to the interactive scatter plot as well as a link to the original text of the story. Ariana is doing a lot of interesting work with data visualizations; follow her on Twitter or hit up her website. This scatter plot (and accompanying news story from Bloomberg News) demonstrates what a non-existent linear relationship looks like. The data plots CEO pay on the x-axis and stock market return for that CEO's organization on the y-axis. I could see where this graph would also be useful in an I/O course in discussions of (wildly unfair) compensation, organizational justice, etc. http://www.bloomberg.com/bw/articles/2014-07-22/for-ceos-correlation...

Pew Research's "Global views on morality"

Pew Research went around the globe and asked folks in 40 different countries if a variety of different behaviors qualified as "Unacceptable", "Acceptable", or "Not a moral issue". See below for a broad summary of the findings. Summary of international morality data from Pew The data on this website is highly interactive...you can break down the data by specific behavior, by country, and also look at different regions of the world. This data is a good demonstration of why graphs are useful and engaging when presenting data to an audience. Here is a summary of the data from Pew.  It nicely describes global trends (extramarital affairs are largely viewed as unacceptable, and contraception is widely viewed as acceptable). How you could use this in class. 1) Comparison of different countries and beliefs about what is right, and what is wrong. Good for discussions about multiculturalism, social norms, normative behaviors, the influence of religion ...

Kristopher Magnusson's "Interpreting Cohen's d effect size"

Kristopher Magnusson (previously featured on this blog for his interactive illustration of correlation) also has a helpful illustration of effect size. While this example probably has some information that goes beyond an introductory understanding of effect size (via Cohen's d), I think this still does a great job of illustrating how effect size measures, essentially, the magnitude of the difference between groups (not how improbable those differences are). See below for a screen shot of the tool. http://rpsychologist.com/d3/cohend/, created by Kristopher Magnusson
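To pair the visualization with a hand calculation, here is a minimal Python sketch of Cohen's d using the pooled standard deviation (the common two-independent-groups version; other variants exist). The scores and group names are hypothetical and only there for illustration.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical scores, not from any real study.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

print(round(cohens_d(treatment, control), 2))
```

Note that d depends only on the mean difference and the spread of the scores, which is why it describes the magnitude of a difference rather than its probability.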