Spotify generates gobs of data about music.
Most people have seen the end-of-the-year data Spotify generates for each user about their listening patterns. Most people don't know that Spotify also generates a lot of data about individual songs. Some of it is straightforward: tempo, genre, length. However, Spotify also has its own niche ways of quantifying songs: Danceability. Acousticness. Here is a whole list of their variables and descriptions from researchers at CMU: https://www.stat.cmu.edu/capstoneresearch/315files_s23/team23.html
What does this mean for a stats teacher? You have access to highly personalizable data sets, rooted in music, with gobs and gobs of variables for each song...or artist...or album...or year of release...or genre (like, so many ways to divide up your data).
For instance, I created a data set with Spotify data for 1989 and 1989 (Taylor's Version) to teach paired t-tests. How do Taylor's re-recordings compare to the originals?
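If you want to show students what the paired t-test is doing under the hood, here is a minimal sketch in plain Python. The function name `paired_t` is my own, and the danceability numbers are made-up placeholders, not real Spotify values for either album. Swap in the real columns from your own export. It only computes the t statistic and degrees of freedom, so you would still look up the p-value in a t table or stats package.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    Each position in `before` and `after` must describe the same song.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Work with the per-song differences (after minus before).
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Placeholder danceability scores, invented for illustration only.
original = [0.60, 0.55, 0.70, 0.65, 0.50]
rerecorded = [0.62, 0.58, 0.69, 0.70, 0.54]

t_stat, df = paired_t(original, rerecorded)
print(f"t = {t_stat:.3f}, df = {df}")
```

The pairing is the whole point: each original track is matched with its own re-recording, so the test looks at the song-by-song differences rather than at two independent groups.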
This data is freely available for download. If you are fancy, you can negotiate their API all by yourself. If you aren't fancy (I'm not fancy), you can use Exportify to get at this data.
https://exportify.net/#playlists
Which artist's sound has changed the most if you compare their first album to their most recent album?
Which K-pop groups differ from one another, and how?
Which artists have the most variability on a single album?
What about soundtracks for horror movies versus romances? What about the top ten songs from 2003 versus 2023?
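As one example, the "most variability on a single album" question boils down to a standard deviation per album. Here is a rough sketch using only Python's standard library. The column names ("Album Name", "Danceability") are my assumption about what an exported CSV looks like, so check them against your own file. The rows are invented placeholders, and `spread_by_group` is a hypothetical helper name.

```python
import csv
import io
import statistics
from collections import defaultdict


def spread_by_group(rows, group_col, value_col):
    """Return {group: sample standard deviation of value_col}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(float(row[value_col]))
    # A standard deviation needs at least two songs per group.
    return {
        name: statistics.stdev(values)
        for name, values in groups.items()
        if len(values) >= 2
    }


# Made-up rows standing in for an exported playlist (column names assumed).
SAMPLE_CSV = """Album Name,Track Name,Danceability
Album A,Song 1,0.50
Album A,Song 2,0.70
Album A,Song 3,0.60
Album B,Song 4,0.40
Album B,Song 5,0.80
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
spreads = spread_by_group(rows, "Album Name", "Danceability")

# List albums from most to least variable.
for album, sd in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{album}: SD = {sd:.3f}")
```

To use a real file, replace the `io.StringIO(SAMPLE_CSV)` part with an opened CSV file. Changing `group_col` to an artist, genre, or year column covers most of the other questions above.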
Not only are there endless questions, but I imagine you could come up with data for any kind of test you would ever want your stats babies to learn.
PS: Hey! If you like this idea and would love a whole stats textbook from the brain of the person who came up with this idea, sign up for more information about my forthcoming book here: https://seagull.wwnorton.com/l/710463/2023-10-26/2tp3nt