Monday, November 25, 2013

Burr Settles's "On “Geek” Versus “Nerd”"

Settles decided to investigate the difference between being a nerd and being a geek via a pointwise mutual information (PMI) analysis of archival data from Twitter. Specifically, he measured the association/closeness between various hashtag descriptors (see below) and the words "nerd" and "geek." Settles provides a nice description of his data collection and analysis on his blog.
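
For anyone who wants to show students what PMI actually computes, here is a minimal sketch in Python. The toy "tweets" and the helper names (`pmi`, `count`, `pmi_for`) are made up for illustration; this is not Settles's code or data, just the standard formula PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ) applied to co-occurrence counts.

```python
import math

def pmi(n_xy, n_x, n_y, n_total):
    """PMI in bits: log2( p(x, y) / (p(x) * p(y)) ).

    Positive: x and y co-occur more often than chance would predict.
    Negative: less often. Undefined when n_xy is 0 (log of zero).
    """
    p_xy = n_xy / n_total
    p_x = n_x / n_total
    p_y = n_y / n_total
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical toy data: each "tweet" is reduced to a set of words.
tweets = [
    {"geek", "gadget"},
    {"geek", "gadget", "shiny"},
    {"nerd", "homework"},
    {"nerd", "gadget"},
    {"geek", "homework"},
    {"nerd", "homework", "exam"},
]

def count(*words):
    """Number of tweets that contain every one of the given words."""
    return sum(1 for t in tweets if all(w in t for w in words))

def pmi_for(word, label):
    """PMI between a descriptor word and a label such as 'geek' or 'nerd'."""
    return pmi(count(word, label), count(label), count(word), len(tweets))

for word in ("gadget", "homework"):
    print(word, round(pmi_for(word, "geek"), 3), round(pmi_for(word, "nerd"), 3))
```

In this toy data "gadget" comes out positively associated with "geek" and negatively with "nerd," and "homework" shows the reverse. A real analysis would need far more data and some handling for word pairs that never co-occur.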

A good example of archival data use as well as pointwise mutual information (PMI).

Monday, November 18, 2013

Joshua Katz's visualizations of American dialect data (edited 11/30)

I love American dialects. There might be a Starbucks in every city, but our regions are still uniquely identifiable by the way we talk. Joshua Katz (a graduate student in statistics at NC State) created graphical representations of data from Cambridge that identify dialectal differences in how Americans speak. Here is a story about the maps and here are the maps themselves. AND: You can even take the Dialect Similarity Quiz, which tells you (via map) what parts of the country tend to have language patterns like your own.

I think this demonstrates that 1) graphs are an interesting way of conveying information, 2) data can be used to make predictions (here, what part of the U.S. you hail from), and 3) statisticians and social scientists gather interesting and varied data.

Edited to add: The Atlantic has created a video that contains audio of folks providing examples of their awesome accents whilst completing the original survey.

Monday, November 4, 2013

The Onion's "Son-Of-A-Bitch Mouse Solves Maze Researchers Spent Months Building"

Ha. This story is a good example of just how frustrating research can be, how well-conceived research can go wrong, the ceiling effect, and why you should pre-test measures before going live.

"Above, researchers discuss plans for a new maze, since the prick of a mouse, right, destroyed their chances of making any new discoveries whatsoever about the nature of synaptical response."