I intended for this to be a post about singers' vocabularies.
It is still that, but it is also a post about using GenAI to grab data from an image. I mean, you can use Excel to do the same thing, but GenAI is a lot easier.
Here we go. It starts with the Word Tips website, which helps you solve your crossword puzzles and Wordle. This website also has a blog dedicated to words. One such blog post explored which singers have the largest vocabularies, as measured by the number of unique words in their lyrics.
Their blog post compared music legends to newer talent. There are a ton of fun data visualizations on the website; go check it out.
Since I teach college students, I decided to concentrate on the musicians my students listen to:
In and of itself, this image serves as an example of bar graphs, good data visualization, and proper use of "buckets".
However, I figured we could find a way to use the raw data in class. Students could create their own data visualizations or their own buckets. You could even add your own data (using ChatGPT, see below) to bring in variables like number of albums, years in the industry, gender, genre, etc.
But I certainly wasn't going to do that manually. Instead, I used ChatGPT.
Click here to see the prompts I used and get a copy of the data in CSV format. There are two spreadsheets available: One has the artist's entire name in one column. Which...is just bad data practice, right? So I also created a second spreadsheet that has two columns for their first and last names. Artists with one name go in the First Name column. This leads to a little funkiness for artists who have two-word names where the last word isn't a last name (Charli XCX, Jessie J, etc.). I didn't feel like creating prompts to deal with these situations, but feel free to do so and share with me!
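If you would rather handle those two-word stage names with a few lines of code instead of more prompts, here is a minimal Python sketch. To be clear, this is not how I built the spreadsheets (I used ChatGPT). The function name `split_name` and the `KEEP_WHOLE` list are made up for illustration, and you would need to fill in that list yourself after looking over the data.

```python
# Hypothetical exception list: stage names that should NOT be split,
# because the last word is not a last name. Extend as needed.
KEEP_WHOLE = {"Charli XCX", "Jessie J"}


def split_name(full_name, keep_whole=KEEP_WHOLE):
    """Return (first, last) for an artist name.

    One-word names and names on the exception list go entirely
    in the first slot, with an empty string as the last name.
    """
    name = full_name.strip()
    if name in keep_whole or " " not in name:
        return name, ""
    # Split on the final space so multi-word first names stay together.
    first, last = name.rsplit(" ", 1)
    return first, last


if __name__ == "__main__":
    # "Jane Doe" and "Adele" are just placeholder inputs.
    for artist in ["Jane Doe", "Adele", "Jessie J", "Charli XCX"]:
        print(artist, "->", split_name(artist))
```

Splitting on the last space is a judgment call. It keeps something like a two-word first name together, but it will still guess wrong for bands or stage names that aren't on the exception list, so a quick look through the output is worth it.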