This project demonstrates how to apply Reinforcement Learning from Human Feedback (RLHF) to optimize a movie recommender system. The system learns from users' interactions and feedback, so its recommendations improve over time.
- Introduction
- Technologies Used
- Project Structure
- Installation Guide
- How to Use
- Model Training
- Evaluation
- Contributing
- License
- Acknowledgements
In this project, we build a recommender system on the MovieLens dataset. We use Reinforcement Learning techniques, such as Proximal Policy Optimization (PPO), to optimize which movies are recommended. The system incorporates human feedback, namely users' ratings and choices, so that recommendations improve over time.
The system is an example of applying RL and RLHF in a deployable setting (a Flask API packaged with Docker), with dynamic recommendations personalized to each user.
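To make the feedback loop concrete, here is a minimal sketch of how a rating could be turned into a reward and used to update what gets recommended. It is deliberately simpler than PPO: it uses an epsilon-greedy bandit with a running mean reward per movie. The names `rating_to_reward` and `FeedbackBandit`, the 1 to 5 rating scale, and the linear reward mapping are all assumptions made for illustration, not code taken from this repository.

```python
import random
from collections import defaultdict


def rating_to_reward(rating, neutral=3.0, scale=2.0):
    """Map a 1-5 star rating to a reward in [-1, 1] (hypothetical mapping)."""
    return (rating - neutral) / scale


class FeedbackBandit:
    """Epsilon-greedy recommender keeping a running mean reward per movie.

    This is an illustrative stand-in for the RL agent, not the PPO agent itself.
    """

    def __init__(self, movie_ids, epsilon=0.1, seed=0):
        self.movie_ids = list(movie_ids)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # estimated reward per movie
        self.count = defaultdict(int)    # number of ratings seen per movie

    def recommend(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.movie_ids)
        return max(self.movie_ids, key=lambda m: self.value[m])

    def update(self, movie_id, rating):
        # Incremental mean: new = old + (reward - old) / n
        reward = rating_to_reward(rating)
        self.count[movie_id] += 1
        self.value[movie_id] += (reward - self.value[movie_id]) / self.count[movie_id]


if __name__ == "__main__":
    agent = FeedbackBandit(["movie_a", "movie_b"], epsilon=0.0)
    agent.update("movie_a", 5)  # user liked movie_a
    agent.update("movie_b", 1)  # user disliked movie_b
    print(agent.recommend())
```

With `epsilon=0.0` the agent never explores, so it always returns the movie with the highest estimated reward. A non-zero `epsilon` trades some short-term quality for the chance to discover movies the user has not rated yet.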
- Reinforcement Learning: PPO (Proximal Policy Optimization), Q-learning
- Human Feedback Integration: User ratings and preferences to improve recommendations
- NLP: Preprocessing of user feedback using basic NLP techniques
- Deep Learning: TensorFlow and Keras for model training
- Data Handling: Pandas for data processing and management
- Visualization: Matplotlib and Seaborn for visualizing model performance and recommendations
- Deployment: Flask for the API, Docker for containerization
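Since PPO is the main RL technique listed above, the following sketch shows its clipped surrogate objective. It is written in plain Python rather than TensorFlow so that it runs standalone. The function name `ppo_clipped_objective`, its arguments, and the default `clip_eps=0.2` are assumptions for illustration and may differ from what `src/rl_agent.py` actually does.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    new_logps / old_logps: log-probabilities of the taken actions
    (for example, recommended movies) under the new and old policies.
    advantages: advantage estimates for those actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # A policy that raised the probability of a well-rated movie by 50%.
    print(ppo_clipped_objective([math.log(1.5)], [0.0], [1.0]))
```

Training maximizes this objective. The clipping removes the incentive to move the policy far from the one that collected the feedback, which keeps each update small.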
movie-recommender/
├── data/
│   └── movielens_data.csv  # The MovieLens dataset
├── models/
│   └── recommendation_model.h5  # The trained recommendation model
├── notebooks/
│   ├── 01_data_exploration.ipynb  # Data exploration notebook
│   ├── 02_data_preprocessing.ipynb  # Data preprocessing notebook
│   ├── 03_model_training.ipynb  # Model training notebook
│   ├── 04_reward_model_training.ipynb  # Training the RL reward model
│   ├── 05_rl_agent_training.ipynb  # Reinforcement Learning agent training
│   └── 06_evaluation_and_analysis.ipynb  # Model evaluation notebook
├── src/
│   ├── data_loader.py  # Code to load and preprocess data
│   ├── model.py  # The recommendation model code
│   ├── rl_agent.py  # RL agent for recommendation
│   ├── feedback_system.py  # Human feedback integration
│   └── utils.py  # Utility functions
├── config/
│   └── config.py  # Configuration file for hyperparameters
├── app/
│   ├── app.py  # Flask app for serving the model
│   └── requirements.txt  # Python dependencies
├── logs/
│   └── training_logs.txt  # Logs for model training and evaluation
└── README.md  # This file