Showing posts with the label probability

Deer related insurance claims from State Farm

We should teach with data sets representing ALL of our students. Why? You never know what example will stick in a student's head. One way to get information to stick is by employing the self-reference effect. For example, students who grew up in the country might relate to examples that evoke rural life, like getting the first day of buck season off from school and learning how to watch out for deer on the tree line when you are going 55 MPH on a rural highway. Enter State Farm's data on the likelihood, per state, of a car accident claim due to collision with an animal (not specifically deer, but implicitly deer). Indeed, my home state of Pennsylvania is the #3 most likely place to hit a deer with your car. State Farm shares its data per state: https://www.statefarm.com/simple-insights/auto-and-vehicles/how-likely-are-you-to-have-an-animal-collision I am also happy to share my version of the data, in which I turned all probability fractions (1 out of 522) into probabili...
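If you want students to do that fraction-to-probability conversion themselves, here is a minimal sketch. The function name and the assumption that each entry is written as "1 out of N" are mine, not State Farm's; only the "1 out of 522" example comes from the post above.

```python
import re

def odds_text_to_probability(text):
    """Convert a string such as '1 out of 522' into a decimal probability.

    Assumes the 'A out of B' wording; commas in B (e.g. '1,024') are allowed.
    """
    match = re.fullmatch(r"\s*(\d+)\s+out\s+of\s+([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"Unrecognized odds format: {text!r}")
    numerator = int(match.group(1))
    denominator = int(match.group(2).replace(",", ""))
    return numerator / denominator

# The example fraction quoted in the post.
p = odds_text_to_probability("1 out of 522")
print(f"1 out of 522 is a probability of about {p:.5f}")
```

Students can then sort states by the decimal value, which is easier to compare at a glance than a column of "1 out of N" strings.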

An interactive description of scientific replication

TL;DR: This cool, interactive website asks you to participate in a replication. It also explains how a researcher decision on how to define "randomness" may have driven the main effect of the whole study. There is also a scatter plot and a regression line, talk of probability, and replication of a cognitive example. Long Version: This example is equal parts stats and RM. I imagine that it can be used in several different ways:
- Introduce the replication crisis by participating in a wee replication
- Introduce a respectful replication based on the interpretation of the outcome variable
- Data visualization and scatterplots
- Probability
- Aging research
Okay, so this interactive story from The Pudding is a deep dive into how one researcher's decision may be responsible for the study's main effect. Gauvrit et al. (2017) argue that younger people generate more random responses to several probability tasks. From this, the authors conclude that human behavioral complexity...

Virtual dice and coin flips via Google

Many stats instructors use dice and/or coin flips to teach their students about distributions, probability, and the CLT. Here is an alternative to physical coins and dice, in case you are teaching from a distance. Certainly, there are countless other websites that will roll a die or flip a coin for you, but these simple websites created by Google are intuitive and pretty. Using Google's dice rolling simulator, you can roll a standard six-sided die. Or a DnD 20-sided die. Or multiple dice. I included the link, but all you need to do is Google "Roll dice" to get to the website. Google also lets you flip a coin. This simulator doesn't have any fancy options, and you can get to it simply by Googling "Flip a coin".
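If your class is comfortable with a little code, the same dice and coin demonstrations can be sketched in Python. This is not how Google's simulators work internally; the function names and the seed are my own choices for illustration.

```python
import random
import statistics

def roll_dice(n_dice=1, sides=6, rng=random):
    """Roll n_dice fair dice with the given number of sides."""
    return [rng.randint(1, sides) for _ in range(n_dice)]

def flip_coin(rng=random):
    """Flip one fair coin."""
    return rng.choice(["Heads", "Tails"])

def sample_means(n_samples, n_dice, sides=6, seed=0):
    """Mean of n_dice rolls, repeated n_samples times (a CLT demo)."""
    rng = random.Random(seed)  # fixed seed so the class sees the same numbers
    return [statistics.mean(roll_dice(n_dice, sides, rng))
            for _ in range(n_samples)]

print(roll_dice(2))            # two six-sided dice
print(roll_dice(1, sides=20))  # one 20-sided die
print(flip_coin())

means = sample_means(n_samples=2000, n_dice=10)
print(f"Average of the sample means: {statistics.mean(means):.2f}")
print(f"SD of the sample means: {statistics.stdev(means):.2f}")
```

A single six-sided die has a flat distribution, but a histogram of `means` should look roughly bell-shaped and be centered near 3.5, which is the point of the CLT exercise.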

My favorite real world stats examples: The ones that mislead with real data.

This is a remix of a bunch of posts. I brought them together because they fit a common theme: Examples that use actual data that researchers collected but still manage to lie or mislead with real data. So, lying with facts. These examples hit upon a number of themes in my stats classes: 1) Statistics in the wild 2) Teaching our students to sniff out bad statistics 3) Vivid examples are easier to remember than boring examples. Here we go: Making Graphs Fox News using accurate data and inaccurate charts to make unemployment look worse than it is. Misleading with Central Tendency The mean cost of a wedding in 2004 might have been $28K...if you assume that all couples used all possible services, and paid for all of the services. Also, maybe the median would have been the more appropriate measure to report. Don't like the MPG for the vehicles you are manufacturing? Try testing your cars under ideal, non-real world conditions to fix that. Then get fined by the EPA. Mis...

Roeder's What If God Were A Giant Game Of Plinko?

Roeder, writing for fivethirtyeight.com, has come up with a new way to illustrate the Central Limit Theorem. And it uses Plinko, the beloved The Price is Right game! http://www.businessinsider.com/price-is-right-contestant-plinko-record-2017-5 Well, a variation upon Plinko, featured on the NBC game show The Wall. Their Plinko is much larger and more dramatic and the slots at the bottom go up to $1 million. See below. http://selenahughes.com/1553-2/ How does CLT come into play? Well, the ball is randomly thrown down The Wall. And people jump around and hope for certain outcomes. But what outcome is most likely over time? For the pattern of ending positions to conform to the normal curve. Which it did. See below. https://fivethirtyeight.com/features/what-if-god-were-a-giant-game-of-plinko/ The article itself gets pretty spiritual as the author starts talking about randomness, of the show and of life. You can steal, but cite, the game show as an example of CLT as he ...
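For a classroom version of the same idea, here is a simplified peg-board simulation. It is not Roeder's analysis or the physics of The Wall: I am assuming each ball bounces left or right with equal probability at every row of pegs, and the row count and number of balls are arbitrary.

```python
import random
from collections import Counter

def drop_ball(n_rows, rng):
    """Return the final slot: the number of rightward bounces over n_rows pegs."""
    return sum(rng.random() < 0.5 for _ in range(n_rows))

def simulate_board(n_balls=5000, n_rows=12, seed=1):
    """Drop n_balls balls and count how many land in each slot (0..n_rows)."""
    rng = random.Random(seed)
    return Counter(drop_ball(n_rows, rng) for _ in range(n_balls))

counts = simulate_board()

# Text histogram: one '#' per 25 balls in each slot.
for slot in range(13):
    print(f"{slot:2d} {'#' * (counts[slot] // 25)}")
```

Each ball's final slot is a sum of many small independent left/right bounces, which is why the pile of ending positions bunches up in the middle and thins out toward the edges.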

Logical Fallacy Ref Meme

So, I love me some good statsy memes. They make a brief, important point that sticks in the heads of students. I've recently learned of the Logical Fallacy Ref meme. Here are a couple that apply to stats class:

If your students get the joke, they get statistics.

Gleaned from multiple sources (FB, Pinterest, Twitter, none of these belong to me, etc.). Remember, if your students can explain why a stats funny is funny, they are demonstrating statistical knowledge. I like to ask students to explain the humor in such examples for extra credit points (see below for an example from my FA14 final exam). Using xkcd.com for bonus points/assessing if students understand that correlation =/= causation What are the numerical thresholds for probability? How does this refer to alpha? What type of error is being described, Type I or Type II? What measure of central tendency is being described? Dilbert: http://search.dilbert.com/comic/Kill%20Anyone Sampling, CLT http://foulmouthedbaker.com/2013/10/03/graphs-belong-on-cakes/ Because control vs. sample, standard deviations, normal curves. Also, "skewed" pun. If you go to the original website, the story behind these cakes has to do w...

Orlin's "What does probability mean in your profession?"

Math with Bad Drawings is a very accurately entitled blog. Math teacher Ben Orlin illustrates math principles, which means that he occasionally illustrates statistical principles. He dedicated one blog posting to probability, and what probability means in different contexts. He starts out with a fairly standard and reasonable interpretation of p: Then he has some fun. The example below illustrates the gap that can exist between reality and reporting. And then how philosophers handle probability (with high-p statements being "true"). And in honor of the current Star Wars frenzy: And finally...one of Orlin's Twitter followers, JP de Ruiter, came up with this gem about p-values:

TED talks about statistics and research methods

There are a number of TED talks that apply to research methods and statistics classes. First, there is this TED playlist entitled The Dark Side of Data. This one may not be applicable to a basic stats class but does address broader ethical issues of big data, widespread data collection, and data mining. These videos are also a good way of conveying how data collection (and, by extension, statistics) is a routine and invisible part of everyday life. This talk by Peter Donnelly discusses the use of statistics in court cases, and the importance of explaining statistics in a manner that laypeople can understand. I like this one as I teach my students how to create APA results sections for all of their statistical analyses. This video helps to explain WHY we need to learn to report statistics, not just perform statistics. Hans Rosling has a number of talks (and he has been mentioned previously on this blog, but bears being mentioned again). He is a physician and conveys his passion...

Saturday Morning Breakfast Cereal and statistical thinking

Do you follow Saturday Morning Breakfast Cereal on Facebook or Twitter? Zach Weinersmith's hilarious web comic series frequently touches upon science, research methods, data collection, and statistics. Here are some such comics. Good for spiffing up a PowerPoint, spiffing up an office door (the first comic adorns mine) or (per this post) testing understanding of statistical concepts. http://www.smbc-comics.com/?id=2080...also a good example of the availability bias! http://www.smbc-comics.com/?id=3129 http://www.smbc-comics.com/?id=3435 http://www.smbc-comics.com/?id=1744 http://www.smbc-comics.com/?id=2980 http://smbc-comics.com/index.php?id=4084 http://www.smbc-comics.com/comic/2011-08-05 https://www.smbc-comics.com/index.php?id=4127 http://smbc-comics.com/comic/false-positives https://www.smbc-comics.com/comic/relax

mathisfun.com's Standard Normal Distribution Table

Now, I am immediately suspicious of a website entitled "MathIsFun" (I prefer the soft sell...like promising teaching aids for statistics that are, say, not awful and boring). That being said, this app from mathisfun.com may be an alternative to going cross-eyed while reading z-tables in order to better understand the normal distribution. With this little Flash app, you can select z-scores and immediately view the corresponding portion of the normal curve (either from z = 0 to your z, up to a selected z, or to the right of that z). Above, I've selected z = 1.96, and the outlying 2.5% of the curve is highlighted. Now, this wouldn't work for a paper and pencil exam (so you would probably still need to teach students to read the paper table) but I think this is useful in that it allows students to IMMEDIATELY see how z-scores and portions of the curve co-vary.
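The three areas the app displays can also be computed directly. This sketch is not the app's code; it uses the standard relationship between the normal CDF and the error function in Python's `math` module, and the function names are my own.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_zero_to_z(z):
    """Area between z = 0 and the selected z."""
    return abs(normal_cdf(z) - 0.5)

def area_up_to_z(z):
    """Area from the far left tail up to the selected z."""
    return normal_cdf(z)

def area_right_of_z(z):
    """Area in the tail to the right of the selected z."""
    return 1.0 - normal_cdf(z)

z = 1.96
print(f"0 to z:     {area_zero_to_z(z):.4f}")
print(f"up to z:    {area_up_to_z(z):.4f}")
print(f"right of z: {area_right_of_z(z):.4f}")
```

For z = 1.96 the right-tail area rounds to 0.0250, matching the 2.5% highlighted in the app, and the other two values match the 0.4750 and 0.9750 entries students would find in a printed z-table.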

The Economist's "Unlikely Results"

A great, foreboding video (here is a link to the same video at YouTube in case you hit the paywall) about the actual size and implication of Type I and Type II errors in scientific research. This video does a great job of illustrating what p < .05 means in the context of thousands of experiments. Here is an article from The Economist on the same topic. From The Economist
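To let students work through the arithmetic of p < .05 across many experiments, here is a small sketch. The inputs (1,000 hypotheses tested, 10% of them actually true, 80% power) are illustrative assumptions of mine, not figures quoted in this post; only the .05 threshold comes from the text above.

```python
ALPHA = 0.05   # significance threshold from the post
POWER = 0.80   # assumed chance of detecting a real effect

def tally_results(n_hypotheses, share_true, alpha=ALPHA, power=POWER):
    """Expected counts of correct and incorrect findings across many studies."""
    n_true = n_hypotheses * share_true   # real effects
    n_null = n_hypotheses - n_true       # no real effect
    true_pos = power * n_true            # real effects detected
    false_neg = n_true - true_pos        # real effects missed
    false_pos = alpha * n_null           # nulls wrongly declared significant
    return {
        "true_positives": true_pos,
        "false_negatives": false_neg,
        "false_positives": false_pos,
        "share_of_positives_that_are_false": false_pos / (true_pos + false_pos),
    }

# Hypothetical inputs chosen for illustration only.
result = tally_results(n_hypotheses=1000, share_true=0.10)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Try changing `share_true`: the rarer true effects are among the hypotheses being tested, the larger the share of "significant" findings that are false alarms, even though alpha never changes.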

Gerd Gigerenzer on how the media interprets data/science

Gerd "I love heuristics" Gigerenzer talking about the misinterpretation of research by the media (in particular, how misinterpretation of data about oral contraceptives led to increases in abortions). He argues that such misinterpretation isn't just bad reporting, but unethical.

Andy Field's Statistics Hell

Andy Field is a psychologist, statistician, and author. He created a funny, Dante's Inferno-themed  web site that contains everything you ever wanted to know about statistics. I know, I know, you're thinking, "Not another Dante's Inferno themed statistics web site!". But give this one a try. Property of Andy Field. I certainly can't take credit for this. Some highlights: 1) The aesthetic is priceless. For example, his intermediate statistics page begins with the introduction, "You will experience the bowel-evacuating effect of multiple regression, the bone-splintering power of ANOVA and the nose-hair pulling torment of factor analysis. Can you cope: I think not, mortal filth. Be warned, your brain will be placed in a jar of cerebral fluid and I will toy with it at my leisure." 2) It is all free. Including worksheets, data, etc. How amazing and generous. And, if you are feeling generous and feel the need to compensate him for the website, ...

Discover Magazine's "If a baby can do statistics you have no excuse"

From discovery.com Hahahaha. Like my C-students don't already feel bad enough about themselves, evidence now suggests that babies have a rudimentary understanding of probability (this summary is also a good example of research methods in developmental psychology).

Newsweek's "What should you really be afraid of?" Update 6/18/15

I use this when introducing the availability heuristic in Intro and Social (good ol' comparison of fatal airline accidents vs. fatal car crashes), but I think it could also be used in a statistics class. For starters, it is a novel way of illustrating data. Second, you could use it to spark a discussion on the importance of data-driven decision making when it comes to public policy/charitable giving. For instance, breast cancer has really good PR, but more women are dying of cardiovascular disease...where should the NSF concentrate its efforts to make the biggest possible impact? Property of Newsweek More of same from Curiosity.com... curiosity.com  https://pbs.twimg.com/media/Bur_W0hCMAAOidE.png