Week 8 Prep: Project 3 Intro

Project 3 Introduction

For Project 3, I will build a regression model to predict song popularity from musical parameters such as tempo, energy, loudness, and time signature. I will use the Spotify 1 Million Tracks Dataset from Kaggle, which contains over 1,000,000 Spotify tracks along with 19 features. The dataset appears to be mostly clean and ready for modeling; however, some preprocessing may still be necessary.
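As a rough illustration of the modeling setup described above, here is a minimal sketch of fitting a regression to predict popularity from a few musical features. It does not load the Kaggle dataset: the data is synthetic, the feature names and the relationship between features and popularity are made up for demonstration, and plain least-squares linear regression stands in for whichever model the project ends up using.

```python
import numpy as np

# Hypothetical feature names; the real dataset's column names may differ.
FEATURES = ["tempo", "energy", "loudness", "time_signature"]


def make_synthetic_tracks(n_rows, seed=0):
    """Generate fake tracks with a made-up link between features and popularity."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.uniform(60, 200, n_rows),   # tempo in BPM
        rng.uniform(0, 1, n_rows),      # energy on a 0-1 scale
        rng.uniform(-30, 0, n_rows),    # loudness in dB
        rng.integers(3, 6, n_rows),     # time signature: 3, 4, or 5
    ]).astype(float)
    # Assumed relationship, chosen only so the example has something to learn.
    y = 40 + 30 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 5, n_rows)
    return X, np.clip(y, 0, 100)        # keep popularity on a 0-100 scale


def fit_linear_regression(X, y):
    """Ordinary least squares; returns [intercept, coefficient per feature]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot


X, y = make_synthetic_tracks(1000)
split = 800                              # simple 80/20 train/test split
coef = fit_linear_regression(X[:split], y[:split])
test_r2 = r_squared(y[split:], predict(coef, X[split:]))

for name, weight in zip(FEATURES, coef[1:]):
    print(f"{name}: {weight:.3f}")
print(f"Held-out R^2: {test_r2:.3f}")
```

Scoring the model on held-out rows rather than the training rows gives a more honest read on the first key question, whether popularity can be predicted accurately from musical parameters alone.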

Key Questions

In this project, I aim to answer the following questions:

  1. Can we accurately predict the popularity of a song using solely its musical parameters?
  2. Can we accurately predict the popularity of a song prior to its public release?
  3. Which features are most important for predicting song popularity, regardless of artist?

Impact

By building a model that predicts song popularity from its musical parameters, we can shed light on which specific elements -- genres, tempos, and styles of music -- may lead to a song's success. The findings from this project could help producers, record labels, and indie artists estimate the popularity of their songs before they are released.

Additionally, a popularity score could prove to be especially valuable for marketing and promotional strategies by providing a quantifiable estimate of the song's popularity. Ultimately, this may help level the playing field for all artists in the industry, and allow indie producers to get their work out to ears across the world more easily.

However, a potential negative impact of this study is that it may encourage producers to conform to a data-driven formula for success. This could limit artistic creativity and diversity in music in favor of getting the most "clicks" or "plays."