Shiva Jyoti Shiva210Jyoti

About Me

I'm a third-year B.Tech Computer Science (Data Science) student at VIT Chennai (CGPA 9.23/10), where research and engineering meet on my keyboard every day. My world sits at the intersection of machine learning research, full-stack engineering, and cloud-native systems — I publish papers and ship production code.

I authored a Springer peer-reviewed paper on AI-driven lung cancer prognosis (ICT4SD 2025), where I designed the Clinical Readiness Score (CRS), a novel evaluation framework that uses the Analytic Hierarchy Process (AHP) to weigh interpretability, efficiency, clinical validation, and accuracy across CNNs, RNNs, Random Forest, and SVM models on the LIDC-IDRI and TCGA datasets.

When I'm not in research mode, I architect AI-powered systems like LifeMemory AI (RAG + LangGraph + pgvector) and cloud-native platforms on AWS. I also serve as Financial Lead at ACM-W VIT, where I have run 4 hackathons with 250–300+ participants each.

🔬 Currently: building LifeMemory AI — a privacy-first journal with multi-step LangGraph reasoning
🌱 Learning: advanced distributed systems, retrieval-augmented architectures, MLOps
🎯 Open to: research collaborations, ML/SWE internships, open-source contributions
💬 Ask me about: RAG pipelines, AHP-based evaluation, AWS 3-tier architectures, pgvector
⚡ Fun fact: I once analyzed ~4.8M UIDAI records and turned them into actionable district-level policy insights

🔬 Research ・ ⚡ Engineering ・ ☁️ Cloud ・ 📊 Data Science ・ 🧠 LLM Systems

GitHub Activity Stats

Tech Stack

💻 Languages

🌐 Frontend & Backend

🧠 AI / ML / Data

🗄️ Databases

☁️ Cloud & DevOps

🛠️ Tools & Platforms

📚 Core Concepts

Featured Projects

Four projects spanning AI-powered systems, cloud-native platforms, and data-science research at scale.

🧠 LifeMemory AI — AI-Powered Personal Memory System

A privacy-first journaling platform that uses Retrieval-Augmented Generation to help users explore their own memories like a conversation with their past self.

🧬 RAG pipeline combining semantic search, metadata filtering, and temporal prioritization
🕸️ Multi-step LangGraph reasoning — intent classification → retrieval → synthesis
⚡ Async FastAPI backend with PostgreSQL + pgvector for vector similarity at scale
🔐 JWT + Supabase Auth + Row-Level Security (RLS): privacy by design, not an afterthought
🐳 Docker-deployed with structured logging and monitoring
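
The retrieval step described above (semantic search, metadata filtering, temporal prioritization) can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's actual code: the `Entry` class, the `rank_entries` function, the tag-based filter, the 30-day half-life, and the 0.3 recency weight are all assumptions made for the example. In the real system the similarity search would presumably run inside PostgreSQL with pgvector rather than in Python.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Entry:
    """Hypothetical journal entry with a precomputed embedding."""
    text: str
    embedding: list
    tags: set
    created_at: datetime


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_entries(query_vec, entries, now, required_tag: Optional[str] = None,
                 half_life_days=30.0, recency_weight=0.3):
    """Filter by tag, then score by similarity blended with exponential recency decay."""
    scored = []
    for entry in entries:
        # Metadata filter: skip entries that lack the requested tag.
        if required_tag is not None and required_tag not in entry.tags:
            continue
        age_days = max((now - entry.created_at).total_seconds() / 86400.0, 0.0)
        # Recency halves every `half_life_days` days.
        recency = 0.5 ** (age_days / half_life_days)
        similarity = cosine(query_vec, entry.embedding)
        score = (1.0 - recency_weight) * similarity + recency_weight * recency
        scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


now = datetime(2025, 1, 31)
entries = [
    Entry("Trip to Goa", [1.0, 0.0], {"travel"}, now - timedelta(days=60)),
    Entry("Weekend in Ooty", [1.0, 0.0], {"travel"}, now - timedelta(days=1)),
    Entry("Sprint retro", [1.0, 0.0], {"work"}, now),
]
ranked = rank_entries([1.0, 0.0], entries, now, required_tag="travel")
top_texts = [entry.text for _, entry in ranked]
```

With equal similarity, the more recent travel entry ranks first and the work entry is filtered out. A linear blend keeps the trade-off between relevance and recency explicit in a single parameter.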

📚 BookShelf — Social AI-Powered Book Platform

A full-stack social book platform with hybrid AI recommendations and real-time community features.

🎯 Hybrid recommendation engine built on PostgreSQL + pgvector (content + collaborative)
💬 Real-time chat via WebSockets — book clubs that actually talk
📖 Open Library API integration for a dynamic, ever-growing catalog
🔐 JWT auth via Supabase with secure token refresh
🧱 Zustand state management + Axios interceptors for clean, resilient client-server contracts
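
One common way to combine content-based and collaborative signals into a hybrid recommendation is a weighted blend of normalized scores. The sketch below assumes that approach; the README does not say how BookShelf actually merges the two, so `hybrid_scores`, `min_max`, and the `alpha` weight are hypothetical names and values.

```python
def min_max(scores):
    """Rescale a {book_id: score} map to [0, 1]; a constant map becomes all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {book: 0.0 for book in scores}
    return {book: (value - lo) / (hi - lo) for book, value in scores.items()}


def hybrid_scores(content, collaborative, alpha=0.6):
    """Blend two score maps: alpha weights content similarity, 1 - alpha weights collaborative."""
    c_norm, f_norm = min_max(content), min_max(collaborative)
    books = set(c_norm) | set(f_norm)
    blended = {
        # A book missing from one source contributes 0.0 for that source.
        book: alpha * c_norm.get(book, 0.0) + (1.0 - alpha) * f_norm.get(book, 0.0)
        for book in books
    }
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


content = {"a": 0.9, "b": 0.1, "c": 0.5}       # e.g. embedding similarity
collaborative = {"a": 1.0, "b": 5.0, "d": 3.0}  # e.g. co-read counts
recommendations = hybrid_scores(content, collaborative)
```

Normalizing each source first keeps a signal with a larger raw range (here the co-read counts) from dominating the blend.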

☁️ CloudCollab — Cloud-Native Real-Time Collaborative Coding Platform

A 3-tier AWS-native real-time pair-programming platform with live video and multi-language code execution.

🏛️ 3-tier AWS architecture — S3 (static) + CloudFront (CDN) + Elastic Beanstalk (compute)
🗃️ DynamoDB with GSI/LSI indexing for high-throughput, low-latency reads
🎥 WebRTC video conferencing + collaborative code editor built on operational transformation (OT)
🔐 JWT + Role-Based Access Control (RBAC) — owner / editor / viewer permissions
🌍 Multi-language code execution via secure external sandbox APIs
📈 AWS CloudWatch monitoring with custom metrics and alarms
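
The owner / editor / viewer permission model listed above can be illustrated with an ordered role check. This is a sketch under stated assumptions: the action names in `REQUIRED_ROLE` and the idea that roles form a strict hierarchy are guesses for the example, not details taken from CloudCollab.

```python
from enum import IntEnum


class Role(IntEnum):
    """Roles ordered so that a higher value implies every lower role's permissions."""
    VIEWER = 1
    EDITOR = 2
    OWNER = 3


# Hypothetical mapping from action to the minimum role allowed to perform it.
REQUIRED_ROLE = {
    "read": Role.VIEWER,
    "edit": Role.EDITOR,
    "run_code": Role.EDITOR,
    "invite": Role.OWNER,
    "delete_room": Role.OWNER,
}


def is_allowed(role, action):
    """Return True if `role` meets the minimum for `action`; unknown actions are denied."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required


can_editor_edit = is_allowed(Role.EDITOR, "edit")
can_viewer_edit = is_allowed(Role.VIEWER, "edit")
```

Denying unknown actions by default means a newly added endpoint stays locked until someone assigns it a minimum role.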

📊 UIDAI Service Stress Zone Analysis

District-level stress-zone analytics on ~4.8M Aadhaar enrolment + update records with a custom metric and policy framework.

📦 Cleaned and analyzed ~4.8M records across enrolment + update datasets
🧮 Designed a novel Service Stress Ratio metric for district-level diagnosis
📈 Performed univariate, bivariate, and trivariate statistical analysis
🌲 Random Forest classifier for stress-level categorization
🏛️ Built a policy recommendation framework for resource allocation
🔁 Trend & persistence analysis — separating systemic issues from transient spikes
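
To make the systemic-versus-transient distinction concrete, here is one simple way it could be operationalized: count the longest run of consecutive periods in which a district's stress ratio exceeds a threshold. The README does not define the Service Stress Ratio or the persistence rule, so the ratio values are taken as given inputs, and the `threshold` of 1.5 and `min_persistent` of 3 periods are purely illustrative assumptions.

```python
def longest_run(flags):
    """Length of the longest streak of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def classify_district(ratios, threshold=1.5, min_persistent=3):
    """Label a per-period ratio series as 'systemic', 'transient', or 'normal'."""
    run = longest_run(value > threshold for value in ratios)
    if run >= min_persistent:
        return "systemic"   # stress sustained across several consecutive periods
    if run > 0:
        return "transient"  # stress appeared but did not persist
    return "normal"


# Hypothetical monthly ratio series for three districts.
labels = {
    "district_a": classify_district([1.0, 1.8, 1.9, 2.1, 1.2]),
    "district_b": classify_district([1.0, 2.5, 1.0, 1.1]),
    "district_c": classify_district([1.0, 1.2]),
}
```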

Research & Publications

📝 Improving Lung Cancer Prognosis Through Data Science
Shiva Jyoti, Samriddhi Ganguly, B. Sri Soumya, S. Nachiyappan
ICT4SD 2025 · Lecture Notes in Networks and Systems, vol. 1652 · Springer, Cham · Oct 31, 2025
DOI: 10.1007/978-3-032-06691-6_9

This work introduces the Clinical Readiness Score (CRS) — a structured, multi-criteria evaluation metric for AI models in lung cancer diagnosis. CRS combines interpretability, efficiency, clinical validation, and accuracy, with weights assigned via the Analytic Hierarchy Process (AHP) and validated for consistency. We benchmarked CNNs, RNNs, Random Forest, and SVM on the LIDC-IDRI and TCGA datasets, evaluating AUC, F1-score, sensitivity, and specificity, and provide AHP weight distributions, sensitivity analysis, and CRS factor contributions to support clinical adoption.
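
As a rough illustration of how AHP weights and a consistency check feed a composite score, the sketch below derives weights from a pairwise comparison matrix using the geometric-mean approximation and computes Saaty's consistency ratio. The comparison matrix, the per-criterion scores, and the assumption that CRS is a plain weighted sum are all hypothetical; the paper's actual judgments and formula are not reproduced here.

```python
import math

# Saaty's random consistency index for matrix sizes 1 to 5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Return (weights, consistency_ratio) for a reciprocal pairwise comparison matrix."""
    n = len(matrix)
    # Geometric mean of each row, normalized to sum to 1.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    weights = [g / total for g in geo_means]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    lambda_max = sum(
        sum(a * w for a, w in zip(row, weights)) / weights[i]
        for i, row in enumerate(matrix)
    ) / n
    consistency_index = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    random_index = RANDOM_INDEX[n]
    ratio = consistency_index / random_index if random_index else 0.0
    return weights, ratio


def clinical_readiness_score(weights, scores):
    """Weighted sum of per-criterion scores in [0, 1] (assumed form of CRS)."""
    return sum(w * s for w, s in zip(weights, scores))


# Criteria order: interpretability, efficiency, clinical validation, accuracy.
# Hypothetical judgments built from the priorities 0.4 : 0.3 : 0.2 : 0.1.
priorities = [0.4, 0.3, 0.2, 0.1]
pairwise = [[a / b for b in priorities] for a in priorities]
weights, cr = ahp_weights(pairwise)
crs = clinical_readiness_score(weights, [0.5, 0.8, 0.6, 0.9])
```

A consistency ratio below about 0.10 is the conventional threshold for accepting a set of pairwise judgments. The matrix here is perfectly consistent by construction, so its ratio is effectively zero.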

📐 Methodology Flow

flowchart LR
    A[LIDC-IDRI / TCGA<br/>Datasets] --> B[Preprocessing<br/>+ SMOTE]
    B --> C{Model Training}
    C --> D[CNN]
    C --> E[RNN]
    C --> F[Random Forest]
    C --> G[SVM]
    D & E & F & G --> H[Metrics<br/>AUC · F1 · Sensitivity · Specificity]
    H --> I[AHP Weighting<br/>Interpretability · Efficiency<br/>Clinical Validation · Accuracy]
    I --> J{{Clinical Readiness Score<br/>CRS}}
    J --> K[Sensitivity Analysis<br/>+ Recommendations]
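
Three of the metrics in the flow above (sensitivity, specificity, F1) follow directly from confusion-matrix counts. The sketch below shows that arithmetic for a binary task; `binary_metrics` and the sample labels are illustrative only and are not results from the paper. AUC is omitted because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and F1 from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Each ratio falls back to 0.0 when its denominator is empty.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # true positive rate (recall)
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Made-up labels: 1 = malignant, 0 = benign.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```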

Coding Profiles

Education Timeline

📅 Year	🏫 Institution	🎓 Degree / Board
2023 – Present	Vellore Institute of Technology, Chennai	B.Tech, Computer Science (Data Science)
2022	Pratap World School	CBSE — Class XII
2020	Indian Heritage School	ICSE — Class X

Leadership & Activities

Jul 2025 – Present

Financial Lead — ACM-W, VIT Chennai

🏆 Led 4 large-scale hackathons, each with 250–300+ participants
💰 Owned end-to-end budget & operations for events serving 400+ participants
🤝 Coordinated with sponsors and cross-functional student teams
📊 Built financial trackers and post-event reports for transparency & audit
🌐 Strengthened the women-in-computing community at VIT through inclusive programming

🏆 Hackathons Led	👥 Participants Served	💼 Sponsor Partnerships	📅 Tenure
4	400+	Multiple	Active

Contribution Snake

🔁 The snake auto-regenerates daily and on every push via the GitHub Actions workflow at .github/workflows/snake.yml.

Thanks for visiting!

"Build with research-grade rigor. Ship with engineering speed."
