Conclusion and What’s Next

Conclusion

Our results are far from cutting edge, but we built a system that returns reasonable recommendations and comes close in R-precision to some teams on the Spotify MPD Competition site.

We gained experience working with a large dataset whose observations are largely categorical text data. We have gleaned insight into what happens on Spotify's servers when we load up this week's Discover Weekly and hit play. More broadly, we have engaged with the inner workings of technology that, for most people, just works.

Expanding the Project’s Scope

Working with 1,000,000 Playlists

One way to make computing on the full dataset more tractable is dimensionality reduction along the lines of principal component analysis. By narrowing down our feature set we could train models faster and spend more time on tuning. We created a visualization of how the first two principal components separate match = 0 from match = 1 tracks: even two components, with 41,000 playlists loaded, begin to discriminate between the two target categories. It would be worth exploring how many components are needed for a model that is both accurate and cheap to train on a large fraction of the data; a rough sketch of that check follows the plotting code below.

Principal Component Analysis code:

# sklearn's PCA cannot operate on a sparse matrix, so we use TruncatedSVD,
# which performs the equivalent decomposition without densifying the data
from sklearn.decomposition import TruncatedSVD

pca_transformer = TruncatedSVD(n_components=2).fit(X_train)
X_train_2d = pca_transformer.transform(X_train)
X_test_2d = pca_transformer.transform(X_test)

Plot credit: the code template is adapted from HW4 and lab.

import matplotlib.pyplot as plt

colors = ['b', 'r']
label_text = ["Not Match", "Match"]
plt.figure(figsize=(10, 6))
# Plot each target class separately so both categories appear in the legend
for cur_match in [0, 1]:
    cur_points = X_train_2d[y_train == cur_match]
    plt.scatter(
        cur_points[:, 0],
        cur_points[:, 1],
        c=colors[cur_match],
        label=label_text[cur_match])
plt.xlabel("PCA Dimension 1", fontsize=13)
plt.ylabel("PCA Dimension 2", fontsize=13)
plt.title("Scatter plot of top 2 PCA components for feature data", fontsize=15)
plt.tick_params(labelsize=13)
plt.legend();
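
To think about the right number of components, one simple check (a sketch only, reusing the same sparse X_train from above; the 50-component cap and 90% threshold are arbitrary choices) is to fit TruncatedSVD with more components and inspect the cumulative explained variance ratio:

import numpy as np
from sklearn.decomposition import TruncatedSVD

# Fit more components than we expect to need (requires n_components < n_features)
svd = TruncatedSVD(n_components=50, random_state=42).fit(X_train)
cum_var = np.cumsum(svd.explained_variance_ratio_)

# Smallest component count that captures 90% of the variance, if 50 is enough
if cum_var[-1] >= 0.90:
    n_components_90 = int(np.argmax(cum_var >= 0.90)) + 1
    print(f"Components needed for 90% of variance: {n_components_90}")
else:
    print(f"50 components only explain {cum_var[-1]:.1%} of the variance")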

As noted in the Data Cleaning section of this site, we populated each playlist in our dataset with randomly chosen songs that were not originally present, equal in number to the songs actually added to the playlist by Spotify users. The songs that were originally present are labeled with the target category match = 1, and the randomly assigned songs with match = 0; a minimal sketch of this sampling step follows below.
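
The sampling step itself is simple. A minimal sketch (the function and variable names here are hypothetical, assuming we have the set of all track URIs in the dataset and the tracks of one playlist):

import random

def add_negative_samples(playlist_tracks, all_tracks, seed=0):
    """Pair a playlist's real tracks with an equal number of random non-members."""
    rng = random.Random(seed)
    # Candidate negatives: every track in the dataset not already in the playlist
    candidates = list(set(all_tracks) - set(playlist_tracks))
    negatives = rng.sample(candidates, k=len(playlist_tracks))
    # Original tracks get match = 1, randomly drawn tracks get match = 0
    tracks = list(playlist_tracks) + negatives
    labels = [1] * len(playlist_tracks) + [0] * len(negatives)
    return tracks, labels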

Separating a Validation Set and Tuning Hyperparameters

With more time we could split our data into a 60-20-20% train-validation-test split and search more systematically for the best hyperparameters. We could build further intuition for AdaBoost and ground our parameter choices by visualizing how validation performance varies with the number of estimators and the depth of the base tree.
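
A rough sketch of what that search might look like (the split and the hyperparameter grid are illustrative; X and y stand for our full feature matrix and match labels):

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# 60-20-20 train / validation / test split
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Score each (tree depth, number of estimators) pair on the validation set
for depth in [1, 2, 3]:
    for n_est in [50, 100, 200, 400]:
        model = AdaBoostClassifier(
            base_estimator=DecisionTreeClassifier(max_depth=depth),  # 'estimator' in scikit-learn >= 1.2
            n_estimators=n_est,
            random_state=42,
        ).fit(X_train, y_train)
        print(depth, n_est, model.score(X_val, y_val))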

More Models

We would love to try more models and possibly experiment with stacking them. It was an ensemble method that won the famous Netflix Competition, and stacking has proven itself time and again on difficult classification tasks. Perhaps we could pin down where AdaBoost falls short on this task and find models that complement its weaknesses. It would be particularly instructive to see how neural networks handle the MPD. We have also heard promising things about XGBoost and CatBoost, so we will explore those as well.
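
As a starting point, scikit-learn's StackingClassifier (available in scikit-learn 0.22+) makes stacking easy to prototype. The base learners below are only illustrative, and the split reuses the sketch from the previous section:

from sklearn.ensemble import StackingClassifier, AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

stack = StackingClassifier(
    estimators=[
        ("ada", AdaBoostClassifier(n_estimators=200, random_state=42)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
    ],
    # A simple meta-learner combines the base models' out-of-fold predictions
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
print("Validation accuracy:", stack.score(X_val, y_val))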

Taking Advantage of the Million Song Dataset

The MSD holds song-level feature information that could help a model recognize similarities between songs and recommend tracks exhibiting the qualities a user signals through their playlist's name and description, as well as the songs they have already added.
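
The join itself would be straightforward once the MSD features are extracted. A hypothetical sketch (column names like track_id, tempo, and loudness are placeholders for whatever MSD fields we pull):

import pandas as pd

# Stand-in for our existing playlist-track table
tracks_df = pd.DataFrame({
    "playlist_id": [0, 0, 1],
    "track_id": ["t1", "t2", "t3"],
})

# Hypothetical per-track audio features extracted from the MSD
msd_features = pd.DataFrame({
    "track_id": ["t1", "t2"],
    "tempo": [120.0, 92.5],
    "loudness": [-7.2, -11.4],
})

# Left join so tracks without MSD coverage are kept (their new features become NaN)
enriched = tracks_df.merge(msd_features, on="track_id", how="left")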

Leveraging Spotify’s API

R-precision is not the most heartening metric on its own. The most exciting part of this project is making recommendations for our own playlists to get a more palpable sense of the model's ability. The function we created for building our own playlists, detailed in Metrics, leaves something to be desired. With the Spotify API we could choose tracks for new playlists more fluidly and interact with our model directly.
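
For example, the community spotipy client for the Spotify Web API would let us pull the tracks of any public playlist and feed them straight into our model. A minimal sketch, assuming client credentials are set in the environment and using a placeholder playlist ID:

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Reads SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET from the environment
sp = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials())

playlist_id = "YOUR_PLAYLIST_ID"  # placeholder: any public playlist's Spotify ID
results = sp.playlist_items(playlist_id)
track_names = [item["track"]["name"] for item in results["items"]]
print(track_names[:10])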