Reinforcement Learning · Computer Vision · Large-Scale MLOps
Portfolio • CV • LinkedIn • Email
I'm a Master's student in Computer Science (AI) at USC, with a BS from Sharif University of Technology. I build agents that learn under uncertainty and ship them on infrastructure that scales.
- 🔭 Researching: adversarial co-evolution of RL and VLM/LLM agents
- 🛠️ Recently shipped: PPO agents for imperfect-information games, MoE steering at inference time, probing frameworks for speech transformers
- 🌱 Learning: ROS, control theory, advanced MLOps
- 🤝 Open to collaborate on: robotics simulation, medical imaging
- 💬 Ask me about: PPO and offline RL, computer vision, MLOps pipelines on GCP/AWS
RL & Simulation: Stable-Baselines3 · PettingZoo · Gymnasium · Ollama · vLLM
| Project | What it does | Stack |
|---|---|---|
| Risk-Scaled Steering in MoE | Token-aware steering for MoE LLMs: 3D delta tensors that dynamically scale expert activations to improve safety at inference time. | vLLM PyTorch HF |
| Linguistic-Agnostic SER | Probing framework that measures how speech-emotion transformers encode paralinguistic vs. acoustic information across hidden layers. | PyTorch HF |
| Adversarial Co-Evolution | Trains PPO agents against LLM opponents in imperfect-information card games via curriculum learning and knowledge distillation. | PPO Ollama |
| Multi-Modal Sentiment Classification | Sentiment analysis over image-text conversations with time-dynamics exploration of multimodal cues. | PyTorch Pandas |
Replace the last row's link with the real repo URL; the original pointed to a Google search.