Monday, December 16, 2013

The Atlantic's "Congratulations, Ohio! You Are the Sweariest State in the Union"

While this isn't hypothesis-driven research, the data was collected to see which states are the sweariest. The data collection itself is interesting and makes a good, teachable example. First, the article describes previous research that looked at swearing by state (typically using publicly available data from Twitter or Facebook). Then it describes the data collection used for the current research:

"A new map, though, takes a more complicated approach. Instead of using text, it uses data gathered from ... phone calls. You know how, when you call a customer service rep for your ISP or your bank or what have you, you're informed that your call will be recorded? Marchex Institute, the data and research arm of the ad firm Marchex, got ahold of the data that resulted from some recordings, examining more than 600,000 phone calls from the past 12 months—calls placed by consumers to businesses across 30 different industries. It then used call mining technology to isolate the curses therein, cross-referencing them against the state the calls were placed from."

Nice big sample size, archival data, AND data collected in a very naturalistic setting: folks calling and complaining to companies. You could also discuss how this data may be more representative of the average American than data collected only from folks who use Facebook or Twitter.
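
For a classroom demo, the "cross-referencing curses against the state" step could be sketched roughly as below. This is only an illustration, not Marchex's actual method: the call records, the tiny word list, and the function name swear_rate_by_state are all made up for the example, and real call mining would work from audio transcripts with a far more careful word match.

```python
from collections import defaultdict

# Hypothetical toy records of (state, transcript); NOT real Marchex data.
calls = [
    ("OH", "this damn router is broken again"),
    ("OH", "thanks for your help"),
    ("SC", "please and thank you"),
    ("SC", "thank you so much"),
    ("WA", "what the hell is this charge"),
]

# Hypothetical placeholder word list; a real study would use a much longer one.
CURSES = {"damn", "hell"}


def swear_rate_by_state(records, curse_words):
    """Return the share of calls per state that contain at least one curse word."""
    totals = defaultdict(int)   # number of calls seen per state
    flagged = defaultdict(int)  # number of calls per state with a curse word
    for state, transcript in records:
        totals[state] += 1
        words = set(transcript.lower().split())
        if words & curse_words:
            flagged[state] += 1
    return {state: flagged[state] / totals[state] for state in totals}


rates = swear_rate_by_state(calls, CURSES)
for state, rate in sorted(rates.items()):
    print(state, rate)
```

Reporting a rate (calls with a curse divided by all calls from that state) rather than a raw count is what keeps big states from looking sweary just because they place more calls, which is a nice side point for class discussion.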

In addition to swearing, they also analyzed the data for courtesy. Way to go, South Carolina!

From The Atlantic