A full-stack machine learning workbench that runs entirely on your local machine. Upload a CSV, explore and clean your data, configure a classical ML model or a custom deep neural network, train it with real-time progress streaming, and export the results, all through a browser UI backed by a Python/Flask API.
- Upload any CSV file (read in chunks, so large files are handled)
- Load built-in sklearn datasets: Iris, Wine, Diabetes, Breast Cancer
- Automatic column type detection (numeric / categorical / text)
- Paginated table view with sort and filter on any column
- Per-column statistics (min, max, mean, std, median, quartiles, top values)
- Inline column operations: rename, drop, change type, set as target
- Row operations: delete selected rows, remove duplicates
- Full dataset reset to original uploaded state
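The chunked upload and automatic type detection described above can be approximated in a few lines of pandas. This is a minimal sketch, not the backend's actual code: the function names, the chunk size, and the 20-distinct-values cutoff between "categorical" and "text" are all assumptions made for illustration.

```python
import io

import pandas as pd


def read_csv_chunked(source, chunksize=50_000):
    """Read a CSV in fixed-size chunks and concatenate them into one DataFrame."""
    chunks = pd.read_csv(source, chunksize=chunksize)
    return pd.concat(chunks, ignore_index=True)


def detect_column_types(df, max_categories=20):
    """Label each column numeric, categorical, or text (heuristic, assumed cutoff)."""
    types = {}
    for name in df.columns:
        column = df[name]
        if pd.api.types.is_numeric_dtype(column):
            types[name] = "numeric"
        elif column.nunique(dropna=True) <= max_categories:
            types[name] = "categorical"
        else:
            types[name] = "text"
    return types


# Small in-memory CSV standing in for an uploaded file.
rows = "".join(
    f"{20 + i},{'NY' if i % 2 else 'LA'},note number {i}\n" for i in range(30)
)
df = read_csv_chunked(io.StringIO("age,city,note\n" + rows), chunksize=7)
column_types = detect_column_types(df)
print(column_types)
```

A cardinality cutoff is only one possible heuristic; a column of free-form strings with few distinct values would still be labelled categorical here.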
| Option | Choices |
|---|---|
| Missing value strategy | Mean, Median, Mode, Zero fill, Drop rows, KNN Imputation |
| Categorical encoding | One-Hot, Label Encoding |
| Feature scaling | StandardScaler, MinMaxScaler, RobustScaler, MaxAbsScaler |
| Outlier removal | IQR-based automatic removal |
| Duplicate removal | - |
| Feature selection | SelectKBest (top-K) |
| Dimensionality reduction | PCA (configurable variance threshold) |
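To make the options in this table concrete, here is a hedged sketch of how a few of them could be combined with pandas and scikit-learn: IQR-based outlier removal, median/mode imputation, StandardScaler, and one-hot encoding. The helper names and the `k=1.5` IQR multiplier are assumptions, not taken from the backend, and SelectKBest and PCA are left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def remove_iqr_outliers(df, columns, k=1.5):
    """Drop rows where a listed column lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    keep = pd.Series(True, index=df.index)
    for name in columns:
        q1, q3 = df[name].quantile([0.25, 0.75])
        iqr = q3 - q1
        inside = df[name].between(q1 - k * iqr, q3 + k * iqr)
        keep &= inside | df[name].isna()  # leave missing values for the imputer
    return df[keep]


def build_preprocessor(numeric_cols, categorical_cols):
    """Impute and scale numeric columns; impute and one-hot encode categoricals."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # "Mode" in the table
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])


df = pd.DataFrame({
    "income": [40.0, 42.0, 41.0, np.nan, 39.0, 43.0, 900.0],
    "city": ["NY", "LA", "NY", "LA", np.nan, "NY", "LA"],
})
clean = remove_iqr_outliers(df, ["income"])  # drops the 900.0 row
features = build_preprocessor(["income"], ["city"]).fit_transform(clean)
print(features.shape)
```

Removing outliers before imputation keeps extreme values from skewing the fill value, which is one reason to order the steps this way.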
| Category | Algorithms |
|---|---|
| Classification | Random Forest, Gradient Boosting, XGBoost, SVM, KNN, Logistic Regression, Decision Tree, Naive Bayes, AdaBoost |
| Regression | Linear Regression, Ridge, Lasso, ElasticNet, SVR, Random Forest, Gradient Boosting, KNN |
| Clustering | K-Means, DBSCAN |
| Anomaly Detection | Isolation Forest |
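One plausible way to wire a table like this to scikit-learn is a name-to-estimator registry. The sketch below is an assumption about structure, not the backend's code: `CLASSIFIERS` and `train_and_score` are hypothetical names, only three of the listed algorithms are included, and XGBoost is omitted to avoid an extra dependency. It trains on the built-in Iris dataset and returns a few of the metrics the workbench reports.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical registry: display name -> factory for an unfitted estimator.
CLASSIFIERS = {
    "Random Forest": lambda: RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
}


def train_and_score(name, X, y, test_size=0.25, seed=0):
    """Fit the named classifier on a stratified split and score the held-out part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    model = CLASSIFIERS[name]().fit(X_train, y_train)
    predicted = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, predicted),
        "f1_macro": f1_score(y_test, predicted, average="macro"),
        "confusion_matrix": confusion_matrix(y_test, predicted).tolist(),
    }


X, y = load_iris(return_X_y=True)
report = train_and_score("Random Forest", X, y)
print(report)
```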
- Visual layer builder: add/remove Dense layers with per-layer neurons, activation, dropout
- Quick architecture presets: Tiny, Small, Medium, Large, XLarge
- 10 activations: ReLU, GELU, Swish, Sigmoid, Tanh, LeakyReLU, ELU, Softmax, Linear, PReLU
- 7 optimizers: Adam, AdamW, SGD, RMSprop, Adagrad, Nadam, Adadelta
- 6 weight initializers: Xavier/Glorot, He Normal, He Uniform, LeCun Normal, Orthogonal, Random Normal
- LR schedulers: ReduceOnPlateau, CosineAnnealing, StepDecay, CyclicLR, WarmupLinear
- Early stopping with configurable patience
- L2 regularization, bias toggle, configurable batch size
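Early stopping with patience is simple enough to show framework-free. The class below is an illustrative sketch of the general technique, with hypothetical names. Whether the backend uses Keras's built-in `EarlyStopping` callback or its own logic is not stated here.

```python
class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.best_epoch = None
        self.wait = 0  # epochs since the last improvement

    def should_stop(self, epoch, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.best_epoch = epoch
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


# Made-up validation losses, used only to exercise the logic.
losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.64]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.should_stop(epoch, loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_epoch)
```

With a patience of 3, training halts three epochs after the best one, so the later improvement at epoch 7 is never seen. That trade-off is what the patience setting controls.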
- Live loss/accuracy curves streamed epoch-by-epoch via SSE (Server-Sent Events)
- Staged progress for Gradient Boosting (per-estimator updates)
- Final metrics panel: Accuracy, F1, Precision, Recall, R², MAE, RMSE, Silhouette Score
- Confusion matrix + full classification report
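For readers unfamiliar with SSE, the wire format behind the live curves is plain text: each event is one or more `field: value` lines followed by a blank line. The sketch below shows how per-epoch updates could be framed. The payload keys (`epoch`, `loss`, `accuracy`, `status`) and the `complete` event name are assumptions, not the backend's actual schema.

```python
import json


def sse_event(payload, event=None):
    """Encode one Server-Sent Event: optional `event:` line, `data:` line, blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload always fits on one data line.
    lines.append(f"data: {json.dumps(payload)}")
    return "\n".join(lines) + "\n\n"


def stream_training(history):
    """Yield one SSE frame per epoch, then a final completion frame."""
    for epoch, (loss, accuracy) in enumerate(history, start=1):
        yield sse_event({"epoch": epoch, "loss": loss, "accuracy": accuracy})
    yield sse_event({"status": "done"}, event="complete")


# In Flask, a generator like this is typically returned as
# Response(stream_training(...), mimetype="text/event-stream").
frames = list(stream_training([(0.9, 0.61), (0.5, 0.82)]))
print("".join(frames))
```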
- Trained model file (`.pkl` for classical, `.h5` for Keras)
- Training history as CSV
- Processed/cleaned dataset as CSV
    AI Training Assistant/
    ├── backend.py              # Flask API server
    ├── requirements.txt        # Python dependencies
    ├── ml_uploads/             # Uploaded CSV files (auto-created)
    ├── ml_models/              # Saved trained models (auto-created)
    └── frontend/               # Vite + React app
        ├── src/
        │   ├── App.jsx         # Main frontend (full UI)
        │   └── main.jsx        # React entry point
        ├── package.json
        └── vite.config.js
- Python 3.10+
- Node.js 18+
- npm 9+
    git clone https://github.com/your-username/ai-training-assistant.git
    cd ai-training-assistant

    # Create and activate a virtual environment
    python -m venv .venv
    # Windows
    .venv\Scripts\activate
    # macOS / Linux
    source .venv/bin/activate
    # Install dependencies
    pip install -r requirements.txt

    cd frontend
    npm install

Open two terminals:
Terminal 1, backend (from the project root):

    # Windows
    .venv\Scripts\python.exe backend.py
    # macOS / Linux
    python backend.py

You should see:

    =======================================================
    AI Training Assistant - Backend
    =======================================================
    TensorFlow : ✓ 2.21.0
    XGBoost : ✓
    Uploads : /path/to/ml_uploads
    Models : /path/to/ml_models
    =======================================================
    Running at http://localhost:5000

Terminal 2, frontend (from the frontend/ folder):

    cd frontend
    npm run dev

Then open http://localhost:5173 in your browser.
- Click Upload CSV to load your own dataset, or
- Click one of the built-in sample datasets (Iris, Wine, Diabetes, Breast Cancer)
- Browse your data in the Data Explorer table
- Click any column header to sort; use the filter bar to search
- Right-click a column name to rename, drop, change its type, or view statistics
- Select rows and delete them, or use Remove Duplicates
- In the column list, click the column you want to predict and mark it as Target
- Go to the Preprocess tab
- Choose your missing value strategy, encoding, scaling, and any optional steps (outlier removal, feature selection, PCA)
- Click Apply Preprocessing; a log shows exactly what was done
- Choose Problem Type (Classification, Regression, Clustering, etc.)
- Toggle between Classical ML and Deep Learning
- Classical: pick an algorithm from the dropdown
- Deep Learning: build your layer stack visually or choose a preset
- Set hyperparameters (epochs, learning rate, batch size, optimizer, scheduler, etc.)
- Click ▶ Start Real Training
- Watch the loss/accuracy curves update in real time
- When training completes, metrics are shown automatically
- Download the trained model (`.pkl` or `.h5`)
- Download training history as CSV
- Download the processed dataset as CSV
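Once downloaded, the training history CSV can be inspected with the standard library alone. The sketch below finds the epoch with the lowest validation loss. The column names `epoch`, `loss`, and `val_loss` are hypothetical, since the exported file's actual header is not documented here, and the sample rows are made up for illustration.

```python
import csv
import io


def best_epoch(history_csv, metric="val_loss"):
    """Return the row (as a dict of strings) with the lowest value of `metric`."""
    rows = list(csv.DictReader(history_csv))
    return min(rows, key=lambda row: float(row[metric]))


# Hypothetical file contents; the real column names may differ.
sample = io.StringIO(
    "epoch,loss,val_loss\n"
    "1,0.90,0.95\n"
    "2,0.60,0.70\n"
    "3,0.45,0.72\n"
)
best = best_epoch(sample)
print(best)
```

For a real file, pass an open file handle instead of the `StringIO` object, for example `open("history.csv", newline="")`.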
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/upload` | Upload a CSV file |
| `GET` | `/api/sample/<name>` | Load a built-in sklearn dataset |
| `GET` | `/api/dataset/preview` | Paginated table data |
| `GET` | `/api/dataset/stats/<col>` | Column statistics |
| `POST` | `/api/column/drop` | Drop a column |
| `POST` | `/api/column/rename` | Rename a column |
| `POST` | `/api/column/target` | Set target column |
| `POST` | `/api/column/type` | Change column dtype |
| `POST` | `/api/column/fillnull` | Fill nulls in a column |
| `POST` | `/api/rows/delete` | Delete rows by index |
| `POST` | `/api/rows/dedup` | Remove duplicate rows |
| `POST` | `/api/dataset/reset` | Reset to original data |
| `POST` | `/api/preprocess` | Run preprocessing pipeline |
| `POST` | `/api/train/stream` | Start training (SSE stream) |
| `GET` | `/api/model/export` | Download trained model |
| `GET` | `/api/history/csv` | Download training history |
| `GET` | `/api/dataset/export` | Download processed dataset |
| `GET` | `/api/health` | Backend health check |
| `GET` | `/api/session/info` | Current session metadata |
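If you want to consume `/api/train/stream` from a script rather than the browser, the response body has to be parsed as SSE. The request body that endpoint expects is not documented here, so the sketch below covers only the parsing half. `iter_sse_data` is a hypothetical helper that accepts any iterable of byte lines (an `http.client.HTTPResponse` from `urllib.request.urlopen` is one such iterable) and assumes each event carries a JSON payload.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each SSE event from an iterable of byte lines."""
    buffer = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):  # SSE drops one optional leading space
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the event; multi-line data is joined with "\n".
            yield json.loads("\n".join(buffer))
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored here.


# Simulated stream; the payload keys are assumptions, not the backend's schema.
sample = [
    b'data: {"epoch": 1, "loss": 0.9}\n',
    b"\n",
    b": keep-alive\n",
    b'data: {"epoch": 2, "loss": 0.5}\n',
    b"\n",
]
events = list(iter_sse_data(sample))
print(events)
```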
| Layer | Technology |
|---|---|
| Frontend | React 19, Vite, Recharts |
| Backend | Python, Flask, Flask-CORS |
| ML (Classical) | scikit-learn 1.8, XGBoost 3.2 |
| ML (Deep Learning) | TensorFlow 2.21 / Keras |
| Data | pandas, NumPy |
| Streaming | Server-Sent Events (SSE) |
Backend shows 404 on /
This is normal. The backend has no homepage; all routes are under /api/. Visit http://localhost:5000/api/health to confirm it's running.
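The same check can be scripted, for instance before launching the frontend. This sketch looks only at the HTTP status code, because the JSON body returned by `/api/health` is not documented here. `backend_is_up` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def backend_is_up(base_url="http://localhost:5000", timeout=2.0):
    """Return True if GET <base_url>/api/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


if __name__ == "__main__":
    print("backend reachable:", backend_is_up())
```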
recharts not found error in frontend

    cd frontend
    npm install recharts

TensorFlow not detected
Deep learning is disabled but classical ML still works. To enable TF:

    pip install tensorflow==2.21.0

Port already in use
Change the backend port at the bottom of backend.py:

    app.run(debug=False, host="0.0.0.0", port=5001, threaded=True)

And update frontend/src/App.jsx line 8:

    const API = "http://localhost:5001/api";

MIT License: free to use, modify, and distribute.
Pull requests are welcome. For major changes, open an issue first to discuss what you'd like to change.