Binary file added .DS_Store
Binary file not shown.
162 changes: 156 additions & 6 deletions README.md
@@ -1,8 +1,158 @@
# ElasticNet Regression

## Table of Contents
- Project Overview
- Key Features
- Setup and Installation
- Model Explanation
- How to Use
- Code Examples
- Adjustable Parameters
- Known Limitations
- Contributors
- Q&A

## Project Overview
This project provides a fully custom implementation of ElasticNet Regression, built from the ground up using only NumPy and pandas. No external machine learning libraries like scikit-learn or TensorFlow have been used. This implementation aims to provide a clear understanding of ElasticNet's operation and shows how the model may be optimized via gradient descent.

Combining L1 (Lasso) and L2 (Ridge) regularization, ElasticNet is a linear regression model that works well for tasks involving correlated features or feature selection. Gradient descent is utilized for model optimization.

## Key Features
- **Custom ElasticNet Regression**: Implements both L1 (Lasso) and L2 (Ridge) regularization for linear regression.
- **Gradient Descent Optimization**: Manually optimizes weights using gradient descent, allowing full control over the learning process.

## Setup and Installation
### Prerequisites
- Python 3.x
- NumPy
- pandas

### Installation
1. Clone this repository:
```bash
git clone https://github.com/priyanshpsalian/ML_Project1.git
```

2. Create a virtual environment:
```bash
python3 -m venv .venv
```

3. Activate the environment:
```bash
source .venv/bin/activate  # On Unix or macOS
.venv\Scripts\activate     # On Windows
```

4. Install the required dependencies:
```bash
pip install -r requirements.txt
```

5. Run the test file to see the results:
```bash
python3 -m elasticnet.tests.test_ElasticNetModel
```

## Model Explanation
Combining the benefits of L1 and L2 regularization, ElasticNet is a regularized version of linear regression. It is effective when we want both variable selection (L1) and coefficient shrinkage (L2), or when features are correlated.

### Objective Function

The objective of ElasticNet is to minimize the following:
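Below is a reconstruction of that objective, written to be consistent with the gradient-descent update in `elasticnet/models/ElasticNet.py` (the scaling conventions, such as the 1/2m factor, are assumptions inferred from the code):

```math
J(\beta) = \frac{1}{2m} \sum_{i=1}^{m} \left( y_i - X_i \beta \right)^2 + \alpha \left( \lambda \sum_{j} \lvert \beta_j \rvert + \frac{1 - \lambda}{2} \sum_{j} \beta_j^2 \right)
```

where $m$ is the number of samples and $\lambda$ is the `l1_ratio`.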


- **Alpha** controls the strength of regularization.
- **l1_ratio (Lambda)** determines the mix between L1 (Lasso) and L2 (Ridge).
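
The corresponding gradient-descent update, restated from the training loop in `ElasticNet.py` (with $\eta$ the `step_size`):

```math
\beta \leftarrow \beta - \eta \left[ \frac{1}{m} X^\top (X\beta - y) + \alpha \big( \lambda \, \mathrm{sign}(\beta) + (1 - \lambda)\, \beta \big) \right]
```

When `bias_term=True`, the intercept is updated with the gradient term only, without regularization.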

## How to Use
You can initialize and train the ElasticNet model using the provided `ElasticNetModel` class:
```python
from elasticnet import ElasticNetModel

# Initialize the model
model = ElasticNetModel(alpha=1.0, l1_ratio=0.5, max_iter=2000, convergence_criteria=1e-4, step_size=0.005, bias_term=True)

# Fit the model to data
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

```
## Code Examples
A complete end-to-end example:
```python
from elasticnet import ElasticNetModel

# Initialize and fit the model; fit() returns an ElasticNetModelResults object
model = ElasticNetModel(alpha=1.0, l1_ratio=0.5)
outcome = model.fit(X_train, y_train)

# Predict on held-out data
y_pred = outcome.predict(X_test)

# Evaluate
r2 = outcome.r2_score(y_test, y_pred)
rmse = outcome.rmse(y_test, y_pred)

print(f"R² Score: {r2}")
print(f"RMSE: {rmse}")
```

## Adjustable Parameters

- **alpha**: Overall strength of regularization. Must be a positive float. Default is 1.0.
- **l1_ratio**: Balances the L1 (Lasso) and L2 (Ridge) penalties, where 0 gives pure Ridge and 1 gives pure Lasso. Default is 0.5.
- **max_iter**: The maximum number of gradient-descent iterations over the training data. Higher values allow more fine-tuning at the cost of more computation. Default is 2000.
- **convergence_criteria**: Tolerance for the stopping criterion. If the gradient's L1 norm falls below this value, training stops. Default is 1e-4.
- **step_size**: The amount by which coefficients are adjusted at each gradient-descent step. Smaller values can lead to slower convergence but more precise results. Default is 0.005.
- **bias_term**: Boolean indicating whether an intercept should be added to the model. Default is True. (See the sketch below for how these settings combine.)
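
A sketch of how these settings combine in practice (the parameter values here are illustrative, not recommendations):

```python
from elasticnet import ElasticNetModel

# Ridge-like behaviour: l1_ratio = 0 keeps only the L2 penalty
ridge_like = ElasticNetModel(alpha=0.5, l1_ratio=0.0)

# Lasso-like behaviour: l1_ratio = 1 keeps only the L1 penalty
lasso_like = ElasticNetModel(alpha=0.5, l1_ratio=1.0)

# Slower but more precise optimization
precise = ElasticNetModel(step_size=0.001, max_iter=10000, convergence_criteria=1e-6)
```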


## Known Limitations
**Slow convergence**: In situations with significant multicollinearity or on large datasets, the model may converge slowly. Alternative optimization methods such as coordinate descent could improve convergence; a sketch follows.
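
For reference, here is a minimal sketch of a coordinate-descent update for the same objective, assuming centered targets and standardized features (this is not part of the current implementation; `soft_threshold` and `coordinate_descent_enet` are hypothetical helpers):

```python
import numpy as np

def soft_threshold(rho, threshold):
    # Proximal operator of the L1 penalty
    return np.sign(rho) * max(abs(rho) - threshold, 0.0)

def coordinate_descent_enet(X, y, alpha=1.0, l1_ratio=0.5, n_iters=100):
    # Minimizes (1/2m)||y - Xb||^2 + alpha*(l1_ratio*||b||_1 + (1-l1_ratio)/2*||b||_2^2)
    m, n = X.shape
    beta = np.zeros(n)
    col_sq = (X ** 2).sum(axis=0) / m  # ~1.0 when columns are standardized
    for _ in range(n_iters):
        for j in range(n):
            # Partial residual that excludes feature j's current contribution
            partial = y - X.dot(beta) + X[:, j] * beta[j]
            rho = X[:, j].dot(partial) / m
            beta[j] = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
    return beta
```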

**Precision**: Compared to closed-form solutions, gradient descent may not reach the level of precision required by some applications.


## Contributors
- Priyansh Salian (A20585026 psalian1@hawk.iit.edu)
- Shruti Ramchandra Patil (A20564354 spatil80@hawk.iit.edu)
- Pavitra Sai Vegiraju (A20525304 pvegiraju@hawk.iit.edu)
- Mithila Reddy (A20542879 Msingireddy@hawk.iit.edu)

## Q&A

### What does the model you have implemented do, and when should it be used?
ElasticNet Regression is designed for regression tasks involving multicollinearity (correlation between predictors) and feature selection. It combines the L1 (Lasso) and L2 (Ridge) penalties to balance coefficient shrinkage against variable selection.

### How did you test your model to determine if it is working reasonably correctly?
The model was tested on synthetic data with known relationships between predictors and the target variable. R² and RMSE metrics were used to assess accuracy by comparing predictions against actual values.
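
The two metrics, as implemented in `ElasticNetModelResults`:

```math
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_i (y_i - \hat{y}_i)^2}
```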

### What parameters have you exposed to users of your implementation in order to tune performance?
- **alpha**: The overall strength of regularization, scaling both the L1 and L2 penalties.
- **l1_ratio**: The mix between the L1 (Lasso) and L2 (Ridge) penalties (1 is pure Lasso, 0 is pure Ridge).
- **step_size**: The gradient-descent step size.
- **max_iter**: The maximum number of gradient-descent iterations.
- **convergence_criteria**: Tolerance for the stopping criterion. If the progress between iterations falls below this value, training ends. Default is 1e-4.
- **bias_term**: Boolean indicating whether an intercept should be fitted. Default is True.

### Are there specific inputs that your implementation has trouble with? Given more time, could you work around these, or is it fundamental to the model?
Large datasets or datasets with extreme multicollinearity may pose challenges for the current implementation, since gradient descent can converge slowly. Given more time, an optimization technique such as coordinate descent could be adopted to accelerate convergence, particularly in high-dimensional settings; this is a limitation of the optimizer, not of the ElasticNet model itself.

93 changes: 83 additions & 10 deletions elasticnet/models/ElasticNet.py
@@ -1,17 +1,90 @@
import numpy as np
import pandas as pd

class ElasticNetModel:
    def __init__(self, **kwargs):
        defaults = {
            'alpha': 1.0,
            'l1_ratio': 0.5,
            'max_iter': 2000,
            'convergence_criteria': 1e-4,
            'step_size': 0.005,
            'bias_term': True
        }
        defaults.update(kwargs)
        self.parameter_values = None
        self.average_value = None
        self.standard_deviation = None

        for key, value in defaults.items():
            setattr(self, key, value)

    def fit(self, X, y, categorical_features=None):
        y = y.astype(float).flatten()
        X = X.astype(float)
        X = pd.DataFrame(X)
        X = pd.get_dummies(X, drop_first=True, columns=categorical_features)

        # Scale the features to zero mean and unit variance.
        self.average_value = X.mean(axis=0)
        self.standard_deviation = X.std(axis=0)
        X = (X - self.average_value) / self.standard_deviation
        m, n = X.shape
        self.parameter_values = np.zeros(n + 1) if self.bias_term else np.zeros(n)

        if self.bias_term:
            X = np.hstack([np.ones((m, 1)), X])

        # Gradient descent optimization
        for _ in range(self.max_iter):
            predictions = X.dot(self.parameter_values)
            residuals = predictions - y
            derivative_array = (1 / m) * X.T.dot(residuals)

            # Update the intercept separately (without regularization) when bias_term is True
            start = 1 if self.bias_term else 0
            if self.bias_term:
                self.parameter_values[0] -= self.step_size * derivative_array[0]

            coefs = self.parameter_values[start:]
            l1 = self.l1_ratio * np.sign(coefs)
            l2 = (1 - self.l1_ratio) * coefs
            reg = self.alpha * (l1 + l2)
            self.parameter_values[start:] -= self.step_size * (derivative_array[start:] + reg)

            # Stop when the gradient's L1 norm falls below the tolerance
            if np.linalg.norm(derivative_array[start:], ord=1) < self.convergence_criteria:
                break

        return ElasticNetModelResults(self)

    def predict(self, X):
        if not isinstance(X, pd.DataFrame):
            X = pd.DataFrame(X)
        X = X.astype(float)

        # Apply the scaling learned during fit
        X = (X - self.average_value) / self.standard_deviation
        if self.bias_term:
            X = np.hstack([np.ones((X.shape[0], 1)), X])
        return X.dot(self.parameter_values)

class ElasticNetModelResults:
    def __init__(self, model):
        self.model = model

    def predict(self, X):
        return self.model.predict(X)

    def r2_score(self, y_true, y_pred):
        # Coefficient of determination: 1 - SS_res / SS_tot
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
        ss_res = np.sum((y_true - y_pred) ** 2)
        return 1 - (ss_res / ss_tot)

    def rmse(self, y_true, y_pred):
        # Root mean squared error
        return np.sqrt(np.mean((y_true - y_pred) ** 2))
1 change: 1 addition & 0 deletions elasticnet/models/__init__.py
@@ -0,0 +1 @@
from .ElasticNet import ElasticNetModel
69 changes: 60 additions & 9 deletions elasticnet/tests/test_ElasticNetModel.py
@@ -1,19 +1,70 @@
import csv
import numpy
from ..models.ElasticNet import ElasticNetModel
from ..models.ElasticNet import ElasticNetModelResults

def test_predict():
    model = ElasticNetModel()
    data = []
    with open("elasticnet/tests/small_test.csv", "r") as file:
        reader = csv.DictReader(file)
        for row in reader:
            data.append(row)

    X = numpy.array([[v for k, v in datum.items() if k.startswith('x')] for datum in data])
    y = numpy.array([[v for k, v in datum.items() if k == 'y'] for datum in data])
    X = X.astype(float)
    y = y.astype(float).flatten()

    # Split the data into training and testing sets
    split_idx = int(0.8 * len(X))
    X_train, X_test = X[:split_idx], X[split_idx:]
    y_train, y_test = y[:split_idx], y[split_idx:]

    # Hyperparameter optimization through k-fold cross-validation
    Cross_validation_score_best = -numpy.inf
    leading_parameters = {}

    kcf = 5
    segment_length = len(X_train) // kcf

    for alpha in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]:
        for l1_ratio in [0.1, 0.3, 0.5, 0.7, 0.9]:
            validation_scores = []
            for i in range(kcf):
                # Hold out the i-th segment for validation, train on the rest
                X_train_segment = numpy.concatenate((X_train[:i*segment_length], X_train[(i+1)*segment_length:]), axis=0)
                y_train_segment = numpy.concatenate((y_train[:i*segment_length], y_train[(i+1)*segment_length:]), axis=0)
                X_validation_subset = X_train[i*segment_length:(i+1)*segment_length]
                y_validation_subset = y_train[i*segment_length:(i+1)*segment_length]

                temp_model = ElasticNetModel(alpha=alpha, l1_ratio=l1_ratio, max_iter=2000, convergence_criteria=1e-4, step_size=0.005, bias_term=True)
                temp_model.fit(X_train_segment, y_train_segment)
                predicted_y_values = temp_model.predict(X_validation_subset)
                model_results = ElasticNetModelResults(temp_model)
                validation_scores.append(model_results.r2_score(y_validation_subset, predicted_y_values))

            mean_evaluation = numpy.mean(validation_scores)
            if mean_evaluation > Cross_validation_score_best:
                Cross_validation_score_best = mean_evaluation
                leading_parameters = {'alpha': alpha, 'l1_ratio': l1_ratio}

    # Display the optimal cross-validation results
    print("--- Optimal Model Performance Metrics ---")
    print(f"Optimal R² Value Achieved Through Cross-Validation: {Cross_validation_score_best:.4f}")
    print(f"Optimal Alpha Value for Model Performance: {leading_parameters['alpha']}")
    print(f"Optimal L1 Ratio Value for Model Performance: {leading_parameters['l1_ratio']}")

    # Refit the final model on the full training set with the best configuration
    final_model = ElasticNetModel(max_iter=2000, convergence_criteria=1e-4, step_size=0.005, alpha=leading_parameters['alpha'], l1_ratio=leading_parameters['l1_ratio'], bias_term=True)
    results = final_model.fit(X_train, y_train)

    # Generate predictions for the test set
    y_pred_test = results.predict(X_test)
    result_model = ElasticNetModelResults(final_model)

    # Compute and report evaluation metrics
    print("----------------------------------------------")
    print("--- Performance Evaluation Summary on the Test Data Set ---")
    print(f"R² Score: {result_model.r2_score(y_test, y_pred_test):.4f}")
    print(f"RMSE: {result_model.rmse(y_test, y_pred_test):.4f}")

test_predict()

19 changes: 0 additions & 19 deletions regularized_discriminant_analysis/test_rdamodel.py

This file was deleted.

1 change: 1 addition & 0 deletions requirements.txt
@@ -1,3 +1,4 @@
numpy
pytest
ipython
pandas