Movie Recommender and Catalog System

Introduction

This project is a web-based movie recommendation application developed as a final project for the Fundamentals of Data Science course. It addresses the "choice overload" and "cold-start" problems common on streaming platforms by providing content-based suggestions that do not require any user history.

The system utilizes a hybrid approach:

Weighted Content-Based Filtering: Uses TF-IDF Vectorization and Cosine Similarity to find semantically similar movies, with higher weights assigned to Directors and Cast to capture "auteur" style and star power.
Bayesian Quality Scoring: Implements the IMDb weighted rating formula to ensure that recommended movies are statistically high-quality, balancing raw ratings with vote counts.
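As a rough illustration of the quality score, the IMDb weighted rating is commonly written as WR = (v / (v + m)) * R + (m / (v + m)) * C, where R is the movie's average rating, v its vote count, m a minimum-votes threshold, and C the mean rating across the whole dataset. The sketch below assumes that form; the function name and the values of m and C are hypothetical and are not taken from prepare_model.py.

```python
def weighted_rating(rating, votes, min_votes, mean_rating):
    """IMDb-style weighted rating.

    Pulls a movie's average rating toward the dataset mean when it has
    few votes, and trusts the raw rating more as votes accumulate.
    """
    total = votes + min_votes
    if total == 0:
        # No votes and no threshold: fall back to the dataset mean.
        return mean_rating
    return (votes / total) * rating + (min_votes / total) * mean_rating


# Hypothetical dataset-wide values, chosen only for illustration.
C = 6.0    # mean rating across all movies
m = 1000   # minimum votes threshold

obscure = weighted_rating(9.0, 10, m, C)      # high rating, very few votes
popular = weighted_rating(8.0, 5000, m, C)    # lower rating, many votes
print(obscure < popular)
```

With these numbers the heavily voted film outranks the obscure one even though its raw rating is lower, which is the balancing effect described above.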

Key Features

Content-Based Recommender: Suggests movies based on a "Weighted Soup" of metadata (Director x3, Cast x2, Keywords x1, Genres x1).
Browse by Star: Search for movies featuring specific Actors or Directors. Includes a "Cold Start" fix that suggests popular stars if the search is empty.
Surprise Me!: A discovery feature that randomly selects a high-quality movie from the top-rated 500 films.
Smart Catalog: A full, paginated library of over 4,800 movies, filterable by Genre and sorted by Bayesian Quality Score.
Interactive Metadata: All Directors, Cast members, and Genres are clickable, allowing seamless navigation to related content.
Modern UI: A responsive, dark-themed interface built with Tailwind CSS.
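The "Weighted Soup" used by the recommender can be pictured as a single text string per movie in which more important fields are repeated, so that they carry more weight once the text is vectorized. The following is a minimal sketch under that assumption; the helper names, and the choice to lowercase names and strip their spaces, are illustrative and may differ from the project's actual code.

```python
def normalize(name):
    """Collapse a name into one token, e.g. 'Christopher Nolan' -> 'christophernolan'."""
    return name.replace(" ", "").lower()


def build_soup(director, cast, keywords, genres):
    """Build a weighted metadata string: Director x3, Cast x2, Keywords x1, Genres x1."""
    tokens = []
    tokens += [normalize(director)] * 3          # director counted three times
    for actor in cast:
        tokens += [normalize(actor)] * 2         # each cast member counted twice
    tokens += [normalize(k) for k in keywords]   # keywords counted once
    tokens += [normalize(g) for g in genres]     # genres counted once
    return " ".join(tokens)


soup = build_soup(
    director="Christopher Nolan",
    cast=["Christian Bale"],
    keywords=["hero"],
    genres=["Action"],
)
print(soup)
```

Joining each name into a single token keeps "Christopher Nolan" from matching unrelated people who share only a first name.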

Tech Stack

Backend: Python, Flask
Data Manipulation: Pandas, NumPy
Machine Learning: Scikit-learn (TF-IDF, Cosine Similarity)
Frontend: HTML5, Tailwind CSS (via CDN)

Running the Application

Prerequisites

Before you begin, ensure you have the following installed on your system:

Python 3.x (developed on Python 3.14; other recent 3.x versions should also work)
The pip package manager

Installation and Setup

Follow these steps to set up and run the application locally.

Clone the Repository

git clone https://github.com/MichaelFirstAC/MovieCatalog.git
cd MovieCatalog

Install Dependencies

Install the required Python libraries using pip:

pip install flask pandas scikit-learn

Prepare the Data and Model (One-Time Setup)

The application requires the raw CSV files (tmdb_5000_movies.csv and tmdb_5000_credits.csv) to be present in the root directory.

Note: If these files are zipped (archive.zip), please extract them into the root folder first.

Run the prepare_model.py script. This script will:

Clean and parse the JSON datasets.
Calculate the Bayesian Quality Score for every movie.
Build the TF-IDF and Cosine Similarity matrices.
Save the processed models (movies.pkl and cosine_sim.pkl).

python prepare_model.py
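For context on the "clean and parse" step: in the TMDB 5000 CSV files, columns such as genres, keywords, cast, and crew are stored as JSON strings holding lists of objects. The sketch below shows one way such columns might be parsed using only the standard library; the function names are hypothetical and this is not necessarily how prepare_model.py does it.

```python
import json


def extract_names(raw, limit=None):
    """Return the 'name' field of each object in a JSON list string."""
    names = [item["name"] for item in json.loads(raw)]
    return names if limit is None else names[:limit]


def extract_director(raw_crew):
    """Return the first crew member whose job is 'Director', or None if absent."""
    for member in json.loads(raw_crew):
        if member.get("job") == "Director":
            return member["name"]
    return None


# Shortened sample values in the same shape as the dataset's columns.
genres = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'
crew = '[{"job": "Editor", "name": "Jane Doe"}, {"job": "Director", "name": "John Smith"}]'

print(extract_names(genres))
print(extract_director(crew))
```

The optional limit argument is an assumption that would allow keeping only the first few cast members per movie.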

Run the Web Application

Once the model files are generated, start the Flask server:

python app.py

You should see output indicating the server is running, typically on http://127.0.0.1:5000/.

Access the Application

Open your web browser and navigate to:

http://127.0.0.1:5000/

Project Structure

app.py: The main Flask application containing routing logic and the recommendation engine.
prepare_model.py: The data pipeline script for cleaning, feature engineering, and model training.
templates/index.html: The unified frontend template handling all views (Home, Catalog, Browse, etc.).
static/: Contains CSS assets and team images.
movies.pkl & cosine_sim.pkl: Serialized model files generated by the preparation script.
Note: Files other than those listed above are documentation files and are not required for the program to run.
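To show how a precomputed similarity matrix such as the one stored in cosine_sim.pkl could be queried at request time, here is a small sketch using plain Python lists. The function name, the toy titles, and the similarity values are all made up for illustration; the real lookup in app.py may work differently, for example by also re-ranking results by quality score.

```python
def top_similar(sim_matrix, titles, title, n=3):
    """Return the n titles most similar to `title`, excluding the movie itself."""
    idx = titles.index(title)
    # Pair every other movie's index with its similarity to the query movie.
    scored = [
        (score, other)
        for other, score in enumerate(sim_matrix[idx])
        if other != idx
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [titles[other] for _, other in scored[:n]]


# Toy data: a symmetric 4x4 similarity matrix with 1.0 on the diagonal.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
sim_matrix = [
    [1.0, 0.2, 0.9, 0.5],
    [0.2, 1.0, 0.1, 0.3],
    [0.9, 0.1, 1.0, 0.4],
    [0.5, 0.3, 0.4, 1.0],
]

print(top_similar(sim_matrix, titles, "Movie A", n=2))
```

Because the matrix is computed once during the preparation step, each recommendation is only a row lookup and a sort rather than a fresh similarity computation.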
