Feature-engineering analysis of Spotify track data to study how audio features relate to track popularity. The project compares models built on original Spotify features versus engineered composite features, then examines genre-specific behavior. Main deliverable: a rendered report from presentation.Rmd.
- File: spotify.csv
- Size: ~114,000 tracks (114,001 lines including header)
- Content: track metadata + audio features + popularity target
- Key fields: track_id, artists, track_name, popularity, danceability, energy, loudness, speechiness, acousticness, instrumentalness, liveness, valence, tempo, track_genre
- Note: the CSV includes a leading unnamed index-like column
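The load-and-clean step can be sketched as below. This is illustrative only: it writes a two-row stand-in for spotify.csv to a temp file, then reads it back and drops the leading unnamed index column (which read.csv names "X").

```r
# Hypothetical stand-in for spotify.csv, including the unnamed index column.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"","track_id","popularity","danceability","energy"',
  '"0","t1",73,0.676,0.461',
  '"1","t2",55,0.420,0.166'
), csv_path)

spotify <- read.csv(csv_path)
# read.csv assigns the name "X" to the empty-header first column; drop it.
spotify <- spotify[, setdiff(names(spotify), "X")]
names(spotify)  # "track_id" "popularity" "danceability" "energy"
```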
The report follows this sequence:
- Load and inspect Spotify data.
- Explore distributions/correlations of original audio features.
- Engineer composite features.
- Compare engineered vs original features with correlation and linear models.
- Run simplified feature-selection comparisons.
- Analyze genre-specific differences and model behavior.
- Translate findings into practical production-oriented interpretation.
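The original-versus-engineered comparison step can be sketched as follows. The data-generating process here is invented for illustration; only the feature names mirror the dataset, and the actual report works on the real Spotify data.

```r
# Synthetic data standing in for the Spotify features (assumption, not real data).
set.seed(1)
n <- 500
d <- data.frame(energy = runif(n), valence = runif(n), danceability = runif(n))
d$mood_score  <- (d$valence + d$danceability) / 2           # engineered composite
d$popularity  <- 40 + 30 * d$mood_score + rnorm(n, sd = 5)  # toy target

# Correlation of each candidate feature with the target
sapply(d[c("energy", "valence", "danceability", "mood_score")],
       cor, y = d$popularity)

# Linear models on original vs engineered features, compared by adjusted R^2
fit_orig <- lm(popularity ~ energy + valence + danceability, data = d)
fit_eng  <- lm(popularity ~ mood_score, data = d)
c(original   = summary(fit_orig)$adj.r.squared,
  engineered = summary(fit_eng)$adj.r.squared)
```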
Engineered features used in the analysis:
- energy_loudness_ratio
- mood_score
- acoustic_electronic_balance
- human_presence
- complexity_score
- energetic_dance_factor
- Derived categories such as energy_level, valence_category, and genre-normalized danceability
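Two of the composites and one derived category might be computed as below. The exact formulas are assumptions for illustration, not taken from presentation.Rmd.

```r
# Toy rows mimicking the Spotify feature columns (values invented).
tracks <- data.frame(
  energy       = c(0.46, 0.17, 0.91),
  loudness     = c(-6.7, -17.2, -3.2),  # dB, typically negative
  valence      = c(0.71, 0.27, 0.62),
  danceability = c(0.68, 0.42, 0.55)
)
# Assumed form: energy relative to loudness, shifted so the denominator stays positive.
tracks$energy_loudness_ratio <- tracks$energy / (tracks$loudness + 60)
# Assumed form: a simple positive-affect composite.
tracks$mood_score <- (tracks$valence + tracks$danceability) / 2
# Discretized energy, as one plausible definition of the energy_level category.
tracks$energy_level <- cut(tracks$energy, breaks = c(0, 0.33, 0.66, 1),
                           labels = c("low", "medium", "high"))
```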
Prerequisites:
- R installed
- A LaTeX renderer such as pdflatex
From the repository root:
R -e "install.packages('renv', repos='https://cloud.r-project.org')"
R -e "renv::restore()"
Rscript -e "rmarkdown::render('presentation.Rmd')"
Expected output:
- presentation.pdf (created or updated)
Limitations:
- Observational analysis of one dataset; no causal claims.
- Popularity is modeled from available audio/metadata features only.
- Feature-engineering impact is not uniform across genres.
- Results depend on dataset composition and preprocessing choices.