Why do certain songs go together? What is the difference between “Beach Vibes” and “Forest Vibes”? And what words (and emojis) do people use to describe which playlists? By learning more about the nature of playlists, we may also be able to suggest other tracks that a listener would enjoy in the context of a given playlist. Spotify’s “Recommended Songs” feature suggests songs to add to a playlist. This can make playlist creation easier, and ultimately help people find more of the music they love. To enable this type of research at scale, earlier this year we released The Million Playlist Dataset (MPD) to the academic research community. Sampled from the over 2 billion public playlists on Spotify, this dataset of 1 million playlists consists of over 2 million unique tracks by nearly 300,000 artists, and represents the largest dataset of music playlists in the world. The playlists were created by Spotify users between January 2010 and November 2017. Each playlist in the MPD contains a playlist title, the track list (including track IDs and metadata), and other metadata fields (last edit time, number of playlist edits, and more). All data is anonymized to protect user privacy.
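The playlist fields described above (title, track list, last edit time, number of edits) can be pictured with a small sketch. The record below is made up, and the key names (`name`, `tracks`, `track_uri`, `artist_name`, `modified_at`, `num_edits`) are assumptions chosen for illustration, as is the guess that the last edit time is stored as Unix seconds; check the dataset's own documentation before relying on any of them.

```python
from datetime import datetime, timezone

# Hypothetical playlist record. Key names and values are illustrative
# assumptions, not taken from the dataset itself.
playlist = {
    "name": "Beach Vibes",
    "modified_at": 1509494400,  # assumed: last edit time in Unix seconds
    "num_edits": 3,
    "tracks": [
        {"track_uri": "spotify:track:demo-1", "track_name": "Song A", "artist_name": "Artist X"},
        {"track_uri": "spotify:track:demo-2", "track_name": "Song B", "artist_name": "Artist X"},
        {"track_uri": "spotify:track:demo-3", "track_name": "Song C", "artist_name": "Artist Y"},
    ],
}


def summarize_playlist(pl: dict) -> dict:
    """Collect the title, track and artist counts, and edit metadata of one playlist."""
    tracks = pl.get("tracks", [])
    return {
        "title": pl["name"],
        "num_tracks": len(tracks),
        "num_unique_tracks": len({t["track_uri"] for t in tracks}),
        "num_artists": len({t["artist_name"] for t in tracks}),
        "num_edits": pl["num_edits"],
        "last_edit": datetime.fromtimestamp(pl["modified_at"], tz=timezone.utc),
    }


summary = summarize_playlist(playlist)
print(summary)
```

Counting unique track IDs and artists per playlist is the kind of aggregate that, summed over all playlists, gives dataset-level figures like the ones quoted above.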
Here at Spotify, we love playlists. Playlists like Today’s Top Hits and RapCaviar have millions of loyal followers, while Discover Weekly and Daily Mix are just a couple of our personalized playlists made especially to match your unique musical tastes. In fact, the Digital Music Alliance, in their 2018 Annual Music Report, state that 54% of consumers say that playlists are replacing albums in their listening habits.

Big fan of the new Spotify campaign. I walked the length of my platform just to read them all

But our users don’t just love listening to playlists, they also love creating them. To date, over 2 billion playlists have been created and shared by Spotify users. People create playlists for all sorts of reasons: some playlists group together music categorically (e.g. by genre, artist, year, or city), by mood, theme, or occasion (e.g. romantic, sad, holiday), or for a particular purpose (e.g. focus, workout). Some playlists are even made to land a dream job, or to send a message to someone special.

I told my crush I liked them through a Spotify playlist

The other thing we love here at Spotify is playlist research. By learning from the playlists that people create, we can learn all sorts of things about the deep relationship between people and music.

The Spotify Million Playlist Dataset Challenge consists of a dataset and evaluation to enable research in music recommendations. It is a continuation of the RecSys Challenge 2018, which ran from January to July 2018. The dataset contains 1,000,000 playlists, including playlist titles and track titles, created by users on the Spotify platform.

Robert West's Applied Data Analysis class of Autumn 2017, we decided to focus on one of the largest freely available collections of music data sets online: the.

An R project that investigates whether different genres of songs have significantly different durations through the use of a one-way ANOVA test and post hoc significance tests conducted over an excerpt of a dataset consisting of 1 million popular songs compiled by The Echo Nest and a lab at Columbia University.

Welcome to the Last.fm dataset, the official song tag and song similarity dataset of the Million Song Dataset. Last.fm data was matched using songs, so it is likely affected. We have the beginning of a fix, a list of song - track pairs that should not be trusted, get it here.

We have not validated any of the data in the archive. Here is a truncated and annotated version of the file CW/SOCWJDB12A58A776AF.
The files are stored in directories named after the 3rd and 4th characters of the Song ID (the two letters that follow the "SO" prefix), i.e., XX/SOXXnnnnnnnnnn.json
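The XX/SOXXnnnnnnnnnn.json pattern above can be turned into a small path helper. This is a minimal sketch: the function name `archive_relpath` is made up for illustration, and the check that every Song ID starts with "SO" is an assumption read off the pattern rather than a documented rule.

```python
from pathlib import PurePosixPath


def archive_relpath(song_id: str) -> PurePosixPath:
    """Build the archive-relative path XX/SOXXnnnnnnnnnn.json for a Song ID."""
    # Assumption: Song IDs start with "SO", as in the pattern above.
    if len(song_id) < 4 or not song_id.startswith("SO"):
        raise ValueError(f"unexpected Song ID format: {song_id!r}")
    # The directory name is the two letters right after the "SO" prefix.
    return PurePosixPath(song_id[2:4]) / f"{song_id}.json"


print(archive_relpath("SOCWJDB12A58A776AF"))  # CW/SOCWJDB12A58A776AF.json
```

To open a file, join the result with wherever the archive was unpacked, for example `Path(root) / archive_relpath(song_id)`, where `root` is whatever local directory you chose.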
MILLION SONG DATA SET ARCHIVE
The archive has JSON files containing the results of looking up each song. Note that the track list in these files does not include the Million Song Dataset track ID. Use the MSD SQLite database file to map Song IDs to Track IDs.

DATA SET
We used the Million Song Dataset (MSD). This data is a subset of the Million Song Dataset: a collaboration between LabROSA (Columbia University) and The Echo Nest. ... song popularity than metadata such as genre labels or artist familiarity. Our project, though similar in end goal, ignores artist familiarity and focuses more on feature extraction, especially with regard to the acoustic features of each song.
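The Song ID to Track ID lookup mentioned above could look roughly like the sketch below. The text does not name the SQLite file or describe its schema, so the `songs` table with `song_id` and `track_id` columns is an assumption, and the rows are invented placeholders. An in-memory database stands in for the real file so the example runs on its own.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the MSD SQLite file (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (track_id TEXT PRIMARY KEY, song_id TEXT)")
    conn.executemany(
        "INSERT INTO songs (track_id, song_id) VALUES (?, ?)",
        [
            # Placeholder track IDs, not real dataset values.
            ("TRDEMO000000000001", "SOCWJDB12A58A776AF"),
            ("TRDEMO000000000002", "SOCWJDB12A58A776AF"),
            ("TRDEMO000000000003", "SOXXDEMO0000000000"),
        ],
    )
    return conn


def track_ids_for_song(conn: sqlite3.Connection, song_id: str) -> list[str]:
    """Return every track ID recorded for a Song ID, in sorted order."""
    rows = conn.execute(
        "SELECT track_id FROM songs WHERE song_id = ? ORDER BY track_id",
        (song_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = make_demo_db()
print(track_ids_for_song(conn, "SOCWJDB12A58A776AF"))
```

The helper returns a list rather than a single value because nothing in the text guarantees that a Song ID maps to exactly one Track ID. With the real file, replace `make_demo_db()` with `sqlite3.connect(path_to_file)` and adjust the table and column names to whatever the file actually contains.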