10 Jun 2020
A lockdown collaboration with Esben, Soonmyeong and Rubin to generate personalised playlists based on your mood.
The aim of the lockdown quiz is to create a personalised, lockdown-themed playlist.
To compile a pool of lockdown-themed songs, we pulled songs together from COVID playlists on Spotify like this one:
Overall, we had 1329 unique songs from 10+ playlists, representing 750 different genres.
The ten most popular genres were:
[('pop', 353),
('dance pop', 327),
('rock', 285),
('pop rap', 184),
('pop dance', 177),
('rap', 174),
('post-teen pop', 156),
('classic rock', 154),
('pop rock', 136),
('hip hop', 133)]
To select 20 songs for a user, we designed a quiz where each answer was mapped onto three different audio features: energy, valence (whether a song sounds happy or sad) and danceability.
The answers were then averaged to create a mood vector (e.g. energy = 0.81, valence = 0.73 and danceability = 0.4).
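As a minimal sketch of the averaging step: the answer values and the `mood_vector` helper below are made up for illustration, and the real quiz mapping may differ.

```python
from statistics import mean

FEATURES = ("energy", "valence", "danceability")

# Hypothetical quiz answers: each one carries a value for the three audio features.
answers = [
    {"energy": 0.9, "valence": 0.8, "danceability": 0.3},
    {"energy": 0.7, "valence": 0.6, "danceability": 0.5},
]

def mood_vector(answers):
    """Average each audio feature over all quiz answers."""
    return {f: mean(a[f] for a in answers) for f in FEATURES}

mood = mood_vector(answers)
print(mood)
```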
We then used this vector to pick musical genres that matched the listener’s mood; this can be thought of as a musical context. The idea is to create a coherent playlist by reducing the number of WTFs (songs that seem really out of place).
Through the Spotify API, we can get a user’s listening profile: approximately 150 tracks made up of the listener’s 50 most listened-to songs recently, 50 over the mid-term and 50 of all time.
Using this profile, we then picked the 15 songs closest to the mood vector and extracted their genres. We used these genres to describe each user. Below is an example of the top genres in a user’s profile:
[('pop', 11),
('uk pop', 8),
('funk', 5),
('soul', 5),
('electro', 4),
('filter house', 4),
('indie rock', 4),
('post-teen pop', 4),
('alternative dance', 3),
('new rave', 3)]
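A rough sketch of how such a genre profile could be built. It assumes Euclidean distance to the mood vector (the post does not say which distance was used) and a made-up track structure with a `genres` list per track.

```python
import math
from collections import Counter

FEATURES = ("energy", "valence", "danceability")

def distance(track, mood):
    """Euclidean distance between a track's audio features and the mood vector."""
    return math.dist([track[f] for f in FEATURES], [mood[f] for f in FEATURES])

def profile_genres(tracks, mood, k=15):
    """Count the genres of the k tracks closest to the mood vector."""
    closest = sorted(tracks, key=lambda t: distance(t, mood))[:k]
    return Counter(g for t in closest for g in t["genres"])

# Hypothetical listening profile with three tracks.
tracks = [
    {"name": "a", "energy": 0.80, "valence": 0.70, "danceability": 0.40, "genres": ["pop", "uk pop"]},
    {"name": "b", "energy": 0.75, "valence": 0.70, "danceability": 0.45, "genres": ["pop", "funk"]},
    {"name": "c", "energy": 0.10, "valence": 0.10, "danceability": 0.90, "genres": ["ambient"]},
]
mood = {"energy": 0.81, "valence": 0.73, "danceability": 0.4}

profile = profile_genres(tracks, mood, k=2)
print(profile.most_common())
```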
To enrich the range of genres describing a user, we used a co-occurrence matrix across all genres of Spotify. I explain how this matrix is computed in my paper with Maria Astefanoei.
Here’s an example of what a subset of that matrix looks like:
We summed over all the genre rows present in the user’s profile dictionary, weighted by the frequency of the respective genre.
This means that genres frequently co-occurring with those in someone’s profile were also added, creating greater diversity while maintaining some coherence.
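The weighted row sum can be sketched as follows. The co-occurrence counts are invented, and storing the matrix as a dictionary of rows is an assumption made for readability.

```python
from collections import Counter

def enrich(profile, cooccurrence):
    """Sum the co-occurrence rows of the profile's genres, weighted by genre frequency."""
    enriched = Counter()
    for genre, freq in profile.items():
        for other, count in cooccurrence.get(genre, {}).items():
            enriched[other] += freq * count
    return enriched

# Hypothetical co-occurrence rows: cooccurrence[a][b] = how often genres a and b appear together.
cooccurrence = {
    "pop": {"pop": 10, "dance pop": 6, "funk": 1},
    "funk": {"funk": 4, "soul": 3, "pop": 1},
}
profile = {"pop": 2, "funk": 1}

enriched = enrich(profile, cooccurrence)
print(enriched.most_common())
```

Here "dance pop" and "soul" enter the dictionary even though the user never listened to them directly, which is the diversity effect described above.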
All the songs in the COVID pool were then scored against this dictionary of genres. Songs that closely matched the user’s genres were given high scores.
To give an example, Yann Tiersen is described by the following genres:
['bow pop', 'compositional ambient', 'french soundtrack']
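The exact scoring function is not given here. One simple possibility, shown purely as an assumption, is to sum the user's weights for each of the song's genres; the weights below are invented.

```python
def score_song(song_genres, genre_weights):
    """Sum the user's weight for every genre of the song (unknown genres count as 0)."""
    return sum(genre_weights.get(g, 0) for g in song_genres)

# Hypothetical enriched genre dictionary for one user.
genre_weights = {"french soundtrack": 5, "compositional ambient": 2, "pop": 21}

yann_tiersen = ["bow pop", "compositional ambient", "french soundtrack"]
score = score_song(yann_tiersen, genre_weights)
print(score)
```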
The 50 best-scoring songs were selected. From those, the 30 songs closest to the mood vector were picked to make sure the playlist stayed broadly in line with the initial mood. Finally, 20 of these 30 songs were sampled at random. This final sub-sampling ensures that someone who re-does the quiz still gets some novelty in the playlist.
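The three-stage selection (50 by score, 30 by mood, 20 at random) could look roughly like this. The song structure, the Euclidean distance and the `select_playlist` helper are all assumptions for illustration.

```python
import math
import random

FEATURES = ("energy", "valence", "danceability")

def distance(song, mood):
    """Euclidean distance between a song's audio features and the mood vector."""
    return math.dist([song[f] for f in FEATURES], [mood[f] for f in FEATURES])

def select_playlist(scored_songs, mood, n_best=50, n_close=30, n_final=20, rng=None):
    """Keep the best-scoring songs, then the ones closest to the mood, then sample."""
    rng = rng or random.Random()
    best = sorted(scored_songs, key=lambda s: s["score"], reverse=True)[:n_best]
    close = sorted(best, key=lambda s: distance(s, mood))[:n_close]
    return rng.sample(close, min(n_final, len(close)))

# Hypothetical pool of 100 already-scored songs.
pool = [
    {"name": f"song{i}", "score": i, "energy": (i % 10) / 10, "valence": 0.5, "danceability": 0.5}
    for i in range(100)
]
mood = {"energy": 0.81, "valence": 0.73, "danceability": 0.4}

playlist = select_playlist(pool, mood, rng=random.Random(0))
print([s["name"] for s in playlist])
```

Passing a seeded `random.Random` makes the sample reproducible for testing; leaving `rng` unset gives a different 20 on each run, which is what provides the novelty.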
Here’s an example of a generated playlist:
An interesting lesson from this project: although the playlists were appreciated, people were much more interested in getting their “corona personality test” results.
Try it out here