🧠 Data Science Portfolio — Python-Scripts

A curated collection of hands-on projects and labs covering the ML/AI workflow end to end, from memory-efficient data loading through deep learning, reinforcement learning, and retrieval-augmented QA systems.

🗂️ Portfolio Map

| # | Area | Key Topics | Notebooks |
|---|------|------------|-----------|
| 1 | 📦 Data Loading & Augmentation | DataLoaders, transforms, Keras/PyTorch pipelines, memory vs. generator strategy | 4 |
| 2 | 🧱 Deep Learning: CNN & NN | CNNs, image classification, multi-framework comparison, medical & anime datasets | 10 |
| 3 | 🤖 Transformers & Vision Transformers | ViT, self-attention, BERT-style architectures, Keras & PyTorch implementations | 3 |
| 4 | 🎮 Reinforcement Learning | Tabular Q-Learning, Deep Q-Networks (DQN), policy optimization | 3 |
| 5 | 📝 NLP, RAG & Embeddings | LangChain, vector stores, RAG pipelines, watsonx embeddings, QA bots | 5 |
| 6 | 👁️ Computer Vision & Multimodal AI | CNN-ViT hybrid integration, image captioning, satellite scene classification | 3 |
| 7 | 🏆 Capstone Business Projects | End-to-end ML pipelines, game analytics, competitive feature engineering | 1 |

Total: 29 notebooks across 7 skill domains

🛠️ Core Skills Demonstrated

Machine Learning & Deep Learning

CNNs for image classification across multiple domains (medical imaging, anime, fashion, MNIST)
Vision Transformers (ViT) — built from scratch and fine-tuned in both Keras and PyTorch
Multi-framework proficiency: side-by-side Keras (TensorFlow) and PyTorch implementations
Reinforcement Learning: from tabular Q-tables to Deep Q-Networks with experience replay
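
To make the step from Q-tables to experience replay concrete, here is a minimal sketch using only the standard library. It is not taken from the notebooks: `q_update`, `ReplayBuffer`, and the 2-state, 2-action table are hypothetical names and values chosen for illustration, and the learning rate and discount factor are assumed defaults.

```python
import random
from collections import deque


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    best_next = max(q[next_state])          # greedy value of the next state
    target = reward + gamma * best_next     # bootstrapped TD target
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Hypothetical table: q_table[state][action]
q_table = [[0.0, 0.0], [0.0, 1.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)

replay = ReplayBuffer(capacity=3)
for step in range(5):
    replay.push((step, 0, 0.0, step + 1))   # (state, action, reward, next_state)
minibatch = replay.sample(2)
```

A DQN replaces the table lookup with a neural network and trains it on minibatches drawn from a buffer like this one, rather than on the most recent transition only.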

NLP & Generative AI

LangChain pipelines: document loaders, retrievers, vector stores
RAG (Retrieval-Augmented Generation): end-to-end QA systems
Embeddings: watsonx enterprise embedding API integration
QA chatbots: custom qabot.py with context-aware question answering
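
The retrieval step behind a RAG pipeline can be sketched without any external service. The example below is an illustration only: it does not use the LangChain or watsonx APIs, the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `retrieve`, `build_prompt`, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Augment the question with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "DQN agents store transitions in a replay buffer",
    "Vision Transformers split an image into patches",
    "Generators yield batches without loading all data",
]
question = "how do vision transformers process an image"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, docs, k=1)
```

In a full system the vector store precomputes and indexes the document vectors, so only the query is embedded at question time.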

Data Engineering

Memory-efficient data loading: generator-based vs. memory-based strategies with benchmarks
Data augmentation: Keras ImageDataGenerator, PyTorch transforms pipelines
Custom Datasets: torch.utils.data.Dataset and DataLoader patterns
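
The difference between the memory-based and generator-based strategies can be shown in a few lines of plain Python. This is a sketch under stated assumptions rather than the notebooks' code: `make_sample` is a hypothetical stand-in for reading and decoding one image from disk, and the sample and batch sizes are arbitrary.

```python
def make_sample(index):
    """Hypothetical stand-in for loading one sample (e.g. decoding an image)."""
    return [index] * 4


def load_all(n_samples):
    """Memory-based: materialise every sample up front."""
    return [make_sample(i) for i in range(n_samples)]


def batch_generator(n_samples, batch_size):
    """Generator-based: build one batch at a time, on demand."""
    for start in range(0, n_samples, batch_size):
        stop = min(start + batch_size, n_samples)   # last batch may be smaller
        yield [make_sample(i) for i in range(start, stop)]


in_memory = load_all(10)
batches = list(batch_generator(n_samples=10, batch_size=4))
batch_sizes = [len(batch) for batch in batches]
```

The memory-based version holds all samples at once, while the generator only ever holds one batch, which is the trade-off a `Dataset` plus `DataLoader` pair manages in PyTorch.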

MLOps & Best Practices

Reproducible experiments with fixed seeds
Model evaluation, confusion matrices, and performance metrics
Feature engineering for real-world datasets
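
As a small illustration of the first two points, the sketch below seeds Python's random number generator and builds a confusion matrix by hand. The helper names and the toy labels are hypothetical, and only the standard-library generator is seeded here; framework generators need their own calls, as noted in the comment.

```python
import random


def set_seed(seed):
    """Seed the standard-library RNG so repeated runs draw the same numbers."""
    random.seed(seed)
    # Framework RNGs are separate and would need seeding too, for example
    # numpy.random.seed(seed), torch.manual_seed(seed), tf.random.set_seed(seed).


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


set_seed(42)
first_run = [random.random() for _ in range(3)]
set_seed(42)
second_run = [random.random() for _ in range(3)]

# Toy labels for a 3-class problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
acc = accuracy(cm)
```

With the same seed, `first_run` and `second_run` are identical, and the off-diagonal entries of `cm` show which classes are confused with which.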

📁 Repository Structure

Python-Scripts/
├── 01_data_loading_augmentation/     # Data pipelines, transforms & augmentation
│   ├── README.md
│   └── *.ipynb (4 notebooks)
│
├── 02_deep_learning_cnn_nn/          # CNN & classical neural networks
│   ├── README.md
│   └── *.ipynb (10 notebooks)
│
├── 03_transformers_and_vit/          # Transformer & Vision Transformer architectures
│   ├── README.md
│   └── *.ipynb (3 notebooks)
│
├── 04_reinforcement_learning/        # RL algorithms & DQN agents
│   ├── README.md
│   └── *.ipynb (3 notebooks)
│
├── 05_nlp_rag_embeddings/            # NLP, RAG systems & embedding pipelines
│   ├── README.md
│   ├── qabot.py
│   └── *.ipynb (4 notebooks)
│
├── 06_computer_vision_multimodal/    # Advanced CV & multimodal models
│   ├── README.md
│   └── *.ipynb (3 notebooks)
│
├── 07_capstone_business_projects/    # End-to-end business analytics
│   ├── README.md
│   └── *.ipynb (1 notebook)
│
├── requirements.txt                  # All dependencies
├── portfolio_app.py                  # Streamlit interactive portfolio
└── README.md                         # This file

🚀 Quick Start

Prerequisites

Python 3.10 to 3.12 for full framework compatibility, including TensorFlow
Python 3.13+ is supported for the PyTorch and LangChain workflows; TensorFlow is skipped on these versions
CUDA-capable GPU (recommended for DL notebooks)
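
A quick way to see which of these cases applies to the current interpreter is a version check like the one below. This is a hypothetical helper, not part of the repository; it simply encodes the 3.10 to 3.12 range stated above.

```python
import sys


def tensorflow_expected(version_info=sys.version_info):
    """True if this Python version falls in the range listed for TensorFlow."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and 10 <= minor <= 12


if tensorflow_expected():
    print("Full stack: TensorFlow/Keras, PyTorch, and LangChain notebooks")
else:
    print("TensorFlow range not matched: PyTorch/LangChain workflows only")
```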

Setup

# Clone the repository
git clone <your-repo-url>
cd Python-Scripts

# Create virtual environment
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux

# Install dependencies
pip install -r requirements.txt

# Launch Jupyter Lab
jupyter lab

# Or run the interactive portfolio app
streamlit run portfolio_app.py

🌟 Featured Projects

| Project | Area | Highlight |
|---------|------|-----------|
| League of Legends Match Predictor | Business ML | End-to-end pipeline, competitive game analytics |
| CNN-ViT Integration | CV + Transformers | Hybrid architecture for satellite classification |
| LangChain RAG System | NLP / GenAI | Full retrieval-augmented QA pipeline |
| Deep Q-Network | RL | Keras DQN with experience replay |
| Multi-Framework Classifier Comparison | DL | Keras vs. PyTorch: same architecture, side by side |

📬 Contact

LinkedIn: David Santillan
GitHub: DavidFSantillan
Kaggle: fdavidsantillan
Email: santilland333@gmail.com

Portfolio built with ❤️ using Python, Jupyter, PyTorch, TensorFlow, and LangChain.
