A comprehensive framework for developing, training, and evaluating deep reinforcement learning models for algorithmic trading.
This Final Year Project (FYP) implements an end-to-end pipeline for algorithmic trading with deep reinforcement learning. The system downloads financial data, preprocesses it for machine learning, trains RecurrentPPO models, and backtests them against traditional strategies. The framework supports multi-asset trading environments with risk-adjusted reward functions and performance analysis.
- Multi-asset support - Trade stocks, ETFs, commodities, crypto, and forex
- Advanced preprocessing - 15+ technical indicators with bias-free normalization
- Deep RL training - RecurrentPPO with LSTM networks and hyperparameter optimization
- Comprehensive backtesting - Compare DRL models against 5 traditional strategies
- Risk management - Custom reward functions with risk-adjusted metrics
- Visualization suite - Performance radar charts, heatmaps, and statistical analysis
- Docker environment - Containerized development with VS Code integration
- Production ready - Scalable architecture with proper logging and error handling
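As an illustration of what "bias-free normalization" in the feature list can mean, here is a minimal sketch of one common approach: estimate the scaling statistics on the training split only and reuse them unchanged on later data, so no future information leaks into the features. This is an assumption for illustration, not the method used by `2-normalization.py`; the function names and the RSI sample values are hypothetical.

```python
from statistics import mean, pstdev


def fit_zscore(train_values):
    """Estimate mean and standard deviation from the training split only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    if sigma == 0:
        sigma = 1.0  # avoid division by zero for constant features
    return mu, sigma


def apply_zscore(values, mu, sigma):
    """Scale values with statistics that were fitted elsewhere."""
    return [(v - mu) / sigma for v in values]


# Hypothetical RSI readings; the test split is never used to fit the scaler.
train_rsi = [30.0, 50.0, 70.0]
test_rsi = [50.0, 90.0]

mu, sigma = fit_zscore(train_rsi)
train_scaled = apply_zscore(train_rsi, mu, sigma)
test_scaled = apply_zscore(test_rsi, mu, sigma)
```

Because the test values are scaled with training statistics, they can fall outside the range seen during training, which is the expected behaviour when lookahead is avoided.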
fyp/
├── docker/ # Containerized development environment
│ ├── Dockerfile # Python 3.11 with VS Code server
│ ├── startup.sh # Automated setup script
│ └── README.md # Docker usage instructions
├── download/ # Data acquisition tools
│ ├── 1-check-yahoo-symbols.py # Symbol validation
│ ├── 2-download-yahoo-stock-data.py # Historical data download
│ ├── tickers.json # 60+ symbols across 8 asset classes
│ └── README.md # Download tools documentation
├── preprocessing/ # 6-step ML data pipeline
│ ├── 1-add-indicators.py # Technical indicators (MACD, RSI, etc.)
│ ├── 2-normalization.py # Feature-specific normalization
│ ├── 3-date-cap.py # Date range filtering
│ ├── 4-cleaning.py # NaN handling
│ ├── 5-split-fix.py # Temporal train/val/test splits
│ ├── 6-prepare-gym-compatible-data.py # Gym environment formatting
│ └── README.md # Preprocessing pipeline guide
├── backtesting/ # Strategy evaluation framework
│ ├── baselines.py # 5 traditional trading strategies
│ ├── 1-run-baselines.py # Baseline strategy backtesting
│ ├── 2-rl-backtest.py # DRL model backtesting
│ ├── 3-visualize.py # Performance visualization
│ └── README.md # Backtesting framework guide
├── docs/ # Documentation and guides
├── reward.py # Custom reward functions
├── rl-multi.py # Multi-dataset RL training
├── rppo-mul-opt.py # Hyperparameter optimization
├── toolkit.ipynb # Development notebook
├── requirements.txt # Python dependencies
└── LICENSE # MIT License
🤗 Hugging Face Dataset: xinghao2003/fyp
This repository includes preprocessed datasets, trained models, and comprehensive backtesting results:
- Preprocessed Data - Ready-to-use datasets for 16 representative symbols across all asset classes
- Trained Models - Pre-trained RecurrentPPO models with optimized hyperparameters
- Backtesting Results - Complete performance analysis comparing DRL models against traditional strategies
- Visualization Outputs - Performance charts, heatmaps, and statistical analysis
The dataset demonstrates the complete end-to-end pipeline from raw financial data to trained models with backtesting results, making it easy to reproduce the research and build upon the work.
# Download example dataset and models from Hugging Face
# Visit: https://huggingface.co/datasets/xinghao2003/fyp
# Extract and use pre-trained models for backtesting
cd backtesting/
python 2-rl-backtest.py --model downloaded_model.zip --params downloaded_params.json --data_folder test_data/

Docker (Recommended)
# Build and run development container
cd docker/
docker build -t fyp-trading .
docker run -d --name fyp-dev fyp-trading
docker logs -f fyp-dev
# Follow GitHub device login instructions in logs
# Access via browser: https://vscode.dev/tunnel/fyp

Local Installation
# Clone repository
git clone <repository-url>
cd fyp
# Install dependencies
pip install -r requirements.txt

# Validate symbols
cd download/
python 1-check-yahoo-symbols.py tickers.json
# Download historical data (daily, maximum period)
python 2-download-yahoo-stock-data.py tickers.json --period max --interval 1d

cd preprocessing/
# Run complete 6-step pipeline
python 1-add-indicators.py ../download/data/
python 2-normalization.py ../download/data/
python 3-date-cap.py ../download/data/
python 4-cleaning.py ../download/data/
python 5-split-fix.py ../download/data/
python 6-prepare-gym-compatible-data.py ../download/data/

# Train RecurrentPPO model
python rl-multi.py
# Hyperparameter optimization with Optuna
python rppo-mul-opt.py

cd backtesting/
# Run baseline strategies
python 1-run-baselines.py ../preprocessing/data/test/
# Backtest trained RL model
python 2-rl-backtest.py --model best_model.zip --params best_params.json --data_folder ../preprocessing/data/test/
# Generate comparison visualizations
python 3-visualize.py results/ --output-dir plots/

We maintain pyproject.toml and uv.lock so you can rely entirely on uv for dependency installation; nothing else is required. Follow the upstream installation guide to install uv, then execute:
- Verify uv is available on your PATH
  uv --version
  This step confirms that you followed the official installation instructions and ensures uv can run commands.
- Let uv set up the environment
  uv sync
  uv sync reads pyproject.toml and uv.lock so every transitive dependency is locked; the result works the same on every machine.
- Activate the environment
uv also respects the .python-version hint (Python 3.11), so you can skip manual Python installs when that matches your system. This pure uv workflow is now our default recommendation for new contributors.
- Download Tools - Acquire OHLCV data from Yahoo Finance for 60+ symbols
- Preprocessing - 6-step pipeline adding technical indicators and normalization
- Quality Assurance - NaN handling, date filtering, and temporal consistency
- Environment - Multi-dataset trading environment with gym interface
- Model - RecurrentPPO with LSTM networks for sequential decision making
- Optimization - Optuna-based hyperparameter tuning with pruning
- Rewards - Risk-adjusted reward functions with volatility penalties
- Baseline Strategies - 5 traditional strategies (Buy&Hold, SMA Cross, RSI, etc.)
- Performance Metrics - 15+ metrics including Sharpe ratio, max drawdown, win rate
- Statistical Testing - Significance tests for strategy comparison
- Visualization - Comprehensive charts and performance analysis
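To make the "risk-adjusted reward functions with volatility penalties" item above more concrete, the sketch below shows one possible shape for such a reward: the latest log return minus a penalty proportional to the volatility of recent returns. The function name, the `window` and `penalty` parameters, and their default values are assumptions for illustration; the actual functions in `reward.py` may be defined differently.

```python
import math
from statistics import pstdev


def risk_adjusted_reward(portfolio_values, window=20, penalty=0.5):
    """Latest log return minus a penalty on recent return volatility.

    `window` and `penalty` are hypothetical tuning knobs, not values
    taken from the project.
    """
    if len(portfolio_values) < 2:
        return 0.0  # no return can be computed yet
    log_returns = [
        math.log(curr / prev)
        for prev, curr in zip(portfolio_values, portfolio_values[1:])
    ]
    recent = log_returns[-window:]
    volatility = pstdev(recent) if len(recent) > 1 else 0.0
    return log_returns[-1] - penalty * volatility


# Two hypothetical equity paths that end at the same value.
steady = [100.0, 101.0, 102.01, 103.0301]  # +1% at every step
choppy = [100.0, 110.0, 99.0, 103.0301]    # larger swings on the way

steady_reward = risk_adjusted_reward(steady)
choppy_reward = risk_adjusted_reward(choppy)
```

With these inputs the steady path earns the higher reward even though both paths finish at the same portfolio value, which is the behaviour a volatility penalty is meant to encourage.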
# Training Configuration
LOOKBACK_WINDOW = 60 # LSTM sequence length
TOTAL_TIMESTEPS = 1000000 # Training steps
LEARNING_RATE = 0.0003 # PPO learning rate
BATCH_SIZE = 64 # Training batch size
# Environment Settings
INITIAL_CASH = 100000 # Starting portfolio value
COMMISSION = 0.002 # 0.2% per trade
MAX_POSITIONS = 1 # Single position limit
# Data Splits
TRAIN_END = "2020-12-31" # Training data cutoff
VAL_END = "2024-12-31" # Validation data cutoff
TEST_START = "2025-01-01" # Test data start

The system supports 8 asset categories:
- U.S. Equity Benchmarks (SPY, QQQ, IWM, DIA)
- Sector/Thematic ETFs (XLB, XLE, XLF, XLK, XLV, etc.)
- Large-Cap Stocks (AAPL, MSFT, GOOGL, AMZN, etc.)
- Fixed-Income (TLT, IEF, LQD, HYG)
- Commodities (GLD, SLV, USO, DBA)
- Volatility (^VIX, VXX)
- Currencies (EUR/USD, USD/JPY, GBP/USD)
- Cryptocurrencies (BTC-USD, ETH-USD, SOL-USD)
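Whatever the asset class, the Data Splits settings in the configuration (`TRAIN_END`, `VAL_END`, `TEST_START`) imply a purely date-based partition. A minimal sketch of such a split follows; it assumes each row carries a `date` field, and the helper names are hypothetical rather than taken from `5-split-fix.py`.

```python
from datetime import date

# Cutoffs copied from the configuration; TEST_START is the day after VAL_END.
TRAIN_END = date(2020, 12, 31)
VAL_END = date(2024, 12, 31)


def assign_split(day):
    """Map a calendar date to 'train', 'val', or 'test'."""
    if day <= TRAIN_END:
        return "train"
    if day <= VAL_END:
        return "val"
    return "test"


def temporal_split(rows):
    """Partition rows by date without shuffling, preserving input order."""
    splits = {"train": [], "val": [], "test": []}
    for row in rows:
        splits[assign_split(row["date"])].append(row)
    return splits


# Hypothetical rows; real data would come from the preprocessed files.
rows = [
    {"date": date(2020, 12, 31), "close": 1.0},
    {"date": date(2021, 1, 4), "close": 2.0},
    {"date": date(2024, 12, 31), "close": 3.0},
    {"date": date(2025, 1, 2), "close": 4.0},
]
splits = temporal_split(rows)
```

Splitting strictly by date keeps every validation and test observation later than the training data, which a random shuffle would not guarantee.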
The system calculates comprehensive performance statistics:
- Total Return, Annualized Return, CAGR
- Risk-free rate adjusted returns
- Volatility (annualized standard deviation)
- Maximum Drawdown, Average Drawdown
- Value at Risk (VaR), Conditional VaR
- Sharpe Ratio, Sortino Ratio, Calmar Ratio
- Information Ratio, Treynor Ratio
- Win Rate, Profit Factor, Expectancy
- Average Trade, Best/Worst Trade
- Number of Trades, SQN Score
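Several of the metrics listed above have short standard definitions. The sketch below computes total return, annualized volatility, Sharpe ratio, and maximum drawdown from an equity curve. It assumes daily data with 252 trading days per year and a constant annual risk-free rate; the function names are illustrative and are not the project's own implementation.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor for daily data


def simple_returns(equity):
    """Period-over-period simple returns of an equity curve."""
    return [curr / prev - 1.0 for prev, curr in zip(equity, equity[1:])]


def total_return(equity):
    return equity[-1] / equity[0] - 1.0


def annualized_volatility(returns):
    return stdev(returns) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Annualized Sharpe ratio; risk_free_rate is an annual rate."""
    excess = [r - risk_free_rate / TRADING_DAYS for r in returns]
    spread = stdev(excess)
    if spread == 0:
        return 0.0
    return mean(excess) / spread * math.sqrt(TRADING_DAYS)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a positive fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, 1.0 - value / peak)
    return worst


# Hypothetical equity curve for demonstration.
equity = [100.0, 110.0, 99.0, 120.0]
returns = simple_returns(equity)
summary = {
    "total_return": total_return(equity),
    "volatility": annualized_volatility(returns),
    "sharpe": sharpe_ratio(returns),
    "max_drawdown": max_drawdown(equity),
}
```

For this curve the total return is 20% and the maximum drawdown is 10% (the fall from 110 to 99).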
The framework generates comprehensive visualizations:
- Performance Radar Charts - Multi-dimensional strategy comparison
- Risk-Return Scatter Plots - Efficient frontier analysis
- Performance Heatmaps - Strategy performance across different assets
- Statistical Significance Tests - Validate performance differences
- Equity Curves - Portfolio value progression over time
- Drawdown Analysis - Risk assessment and recovery periods
- Python 3.11 with optimized dependencies
- VS Code Server with tunnel access for remote development
- Git integration with automatic repository updates
- Persistent workspace with volume mounting
- Jupyter notebook support for interactive development
- Comprehensive logging with timestamped files
- Error handling and graceful degradation
- Progress tracking and batch processing capabilities
yfinance==0.2.65 # Financial data download
stockstats==0.6.5 # Technical indicators
stable-baselines3[extra]==2.6.0 # Deep RL algorithms
sb3-contrib==2.6.0 # Additional RL algorithms
gym-trading-env==0.3.5 # Trading environment
optuna==4.4.0 # Hyperparameter optimization
backtesting==0.6.4 # Strategy backtesting
matplotlib==3.10.3 # Plotting
seaborn==0.13.2 # Statistical visualization
This framework supports various research directions:
- Alternative Architectures - Transformer-based models, attention mechanisms
- Multi-Agent Systems - Portfolio allocation across multiple strategies
- Risk Management - Dynamic position sizing, stop-loss optimization
- Market Regime Detection - Adaptive strategies for different market conditions
- Feature Engineering - Alternative data sources, sentiment analysis
- Execution Modeling - Market impact, slippage, and realistic transaction costs
- Validate symbols before downloading
- Handle missing data and corporate actions
- Ensure temporal consistency across datasets
- Prevent lookahead bias in preprocessing
- Use proper train/validation/test splits
- Implement early stopping and model checkpointing
- Monitor overfitting with validation metrics
- Test on multiple market regimes
- Use realistic transaction costs and slippage
- Account for survivorship bias
- Test across multiple time periods
- Compare against appropriate benchmarks
- Implement proper error handling and logging
- Use configuration management for parameters
- Monitor model performance in production
- Have rollback procedures for model failures
- Follow the established code structure and documentation standards
- Add comprehensive tests for new features
- Update relevant README files
- Use consistent coding style and naming conventions
- Document any new dependencies or configuration changes
This project is licensed under the MIT License - see the LICENSE file for details.
- Yahoo Finance for financial data access
- Stable-Baselines3 for deep RL implementations
- Optuna for hyperparameter optimization
- The open-source trading and ML community
Note: This system is for educational and research purposes. Past performance does not guarantee future results. Always conduct thorough testing before deploying trading strategies with real capital.