This app has been deployed! Use the deployment link to access it: https://orbit-app-mt7to.ondigitalocean.app/
Due to rate limiting, access is currently private: only users with the password can log in at this time.
The application is deployed on DigitalOcean.
Orbit is an intelligent, personalized interview preparation system that leverages large language models and retrieval-augmented generation to simulate technical interviews, evaluate responses, and provide targeted feedback.
The system aims to make interview preparation more engaging and adaptive, with a focus on coding and system design interviews.
Job searching has become increasingly competitive, and candidates often struggle with limited, generic resources that do not adapt to their learning pace or strengths. While existing tools like LeetCode and mock interview services provide practice, they lack interactivity and personalized feedback loops.
Our project seeks to build an AI-powered assistant that:
- Conducts personalized mock interviews.
- Evaluates user responses for accuracy, clarity, and reasoning quality.
- Provides actionable, constructive feedback to guide future practice.
By integrating LLMs with retrieval systems and multimodal input (text, speech, and visuals), we can create a more natural, data-informed preparation experience that evolves with the user.
Summary:
SimInterview is a multilingual interview simulation system powered by LLMs. It integrates speech recognition, text generation, and visual avatars to simulate real interview experiences and adapts prompts to users’ resumes and target roles.
Strengths:
- Comprehensive multimodal design (speech + avatars).
- Tested across multiple roles and languages.
Limitations:
- Focuses primarily on business interviews.
- Lacks technical evaluation and coding-specific feedback.
Summary:
Introduces the HURIT dataset containing ~3,890 real HR interview transcripts. Evaluates pretrained LLMs for automated scoring, feedback generation, and error detection in HR interviews.
Strengths:
- Real-world data and human baselines.
- Rigorous zero-shot and few-shot evaluation.
Limitations:
- Focused on HR interviews rather than technical problem-solving.
- Feedback quality lacks precision in algorithmic reasoning contexts.
Summary:
A GPT-4-based AI interviewer that conducts candidate interviews and provides feedback using RAG-grounded evaluations aligned with real interview rubrics.
Strengths:
- Scalable feedback generation.
- Hybrid RAG + LLM framework for reduced hallucination.
Limitations:
- Does not assess code correctness or technical logic.
- Relies on human validation for reliability.
Our system consists of the following core components:

- **Fine-Tuned LLM Interviewer**
  - Custom Gemini 2.5 Flash model fine-tuned on synthetic interview dialogues
  - Uses Socratic questioning to guide candidates without giving direct answers
- **Retrieval-Augmented Generation (RAG)**
  - DynamoDB: Stores 1,800+ LeetCode problems with solutions and video tutorial transcripts
  - Context Injection: Problem descriptions, hidden solution code, and transcript-derived hints injected into system prompts
  - Real-time retrieval during interviews to ground responses in factual content
- **Real-Time Voice System**
  - LiveKit: WebRTC infrastructure for low-latency audio streaming
  - Google Cloud STT: Streaming speech-to-text with automatic punctuation
  - ElevenLabs TTS: Neural text-to-speech synthesis using the `eleven_turbo_v2_5` model
  - Round-trip latency: ~3–4 seconds from speech to AI response
- **Interactive Coding Workspace**
  - Monaco Editor: VS Code-powered code editor with Python syntax highlighting
  - Live Code Sync: User code streamed to the AI interviewer via a LiveKit data channel
  - Context-aware feedback based on the current code state
- **Full-Stack Web Application**
  - Frontend: Next.js with React, Tailwind CSS, and LiveKit components
  - Backend: FastAPI service with an endpoint for LLM generation
  - Worker: Node.js LiveKit worker orchestrating the STT/TTS pipeline
  - Deployment: Docker Compose for multi-service orchestration
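The retrieval and context-injection steps above can be sketched as follows. The table name comes from the project's `.env` file, but the key and attribute names (`problem_id`, `description`, `solution`, `transcript_hints`) are illustrative assumptions, not the actual DynamoDB schema.

```python
def assemble_system_prompt(item: dict) -> str:
    """Inject a retrieved problem record into the interviewer's system prompt."""
    return (
        "You are a Socratic technical interviewer. Guide the candidate with "
        "questions; never reveal the solution directly.\n"
        f"Problem: {item.get('description', '')}\n"
        f"Hidden reference solution (for grading only): {item.get('solution', '')}\n"
        f"Tutorial-transcript hints: {item.get('transcript_hints', '')}"
    )

def retrieve_problem(problem_id: str,
                     table_name: str = "Orbit_Interview_Questions") -> dict:
    """Fetch one problem record from DynamoDB (requires AWS credentials)."""
    import boto3  # imported lazily so the module loads without boto3 installed
    table = boto3.resource("dynamodb", region_name="us-east-2").Table(table_name)
    return table.get_item(Key={"problem_id": problem_id}).get("Item", {})
```

Keeping prompt assembly separate from retrieval makes the injection logic testable without an AWS connection.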
| Category | Tools / Libraries |
|---|---|
| LLM & AI | Google Gemini 2.5 Flash (fine-tuned), Vertex AI |
| Backend | FastAPI, Node.js, Express |
| Frontend | Next.js 14, React, Tailwind CSS, Monaco Editor |
| Database | DynamoDB (AWS) |
| Real-Time Audio | LiveKit, Google Cloud STT, ElevenLabs TTS |
| Data Processing | Python, Pandas, YouTube Transcripts API |
| Deployment | Docker, Docker Compose |
| Version Control | Git, GitHub |
Source: Kaggle
Contains 250 general technical questions suitable for evaluating base-level understanding of software engineering concepts.
Use Case:
Serves as a foundation for question generation and testing baseline model performance.
Source: Kaggle
Contains 1,825 LeetCode problems with difficulty levels, topics, and acceptance rates.
Use Case:
Provides structured question data for fine-tuning generation models and analyzing patterns in problem selection.
Source: Kaggle
Contains LeetCode problem solutions collected from the LeetCode discussion forums, with metadata such as upvotes and view counts, and solutions written in Python 3.
Use Case:
Provides structured solution data for fine-tuning generation models and for analyzing user-submitted solutions.
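As a hypothetical sketch of how one of these Kaggle CSVs might be loaded and filtered for question selection — the file name and column names (`title`, `difficulty`) are assumptions about the export, not the datasets' actual headers:

```python
import csv

def load_problems_by_difficulty(path: str, difficulty: str) -> list:
    """Return rows whose 'difficulty' column matches, case-insensitively."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row for row in csv.DictReader(f)
                if row.get("difficulty", "").lower() == difficulty.lower()]
```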
See Video_Processing.md for more info.
Pipeline:
- Transcript Extraction: YouTube auto-generated captions via YouTube Transcripts API
- Problem Mapping: Match videos to LeetCode problem IDs via title parsing
- Data Aggregation: Merge transcripts with LeetCode problem metadata and solution code
- Synthetic Dialogue Generation: Use Gemini 2.5 Flash to convert monologue transcripts into realistic interviewer-candidate conversations
Dataset Stats:
- 480+ videos from channels such as NeetCode and TechLead
- 6.7MB CSV with problem transcripts
- 1825 unique problems mapped (30% transcript coverage)
- 220+ training dialogues generated for fine-tuning
Purpose:
Provides real-world pedagogical content for RAG. Transcript hints guide the AI interviewer on how to explain concepts effectively, mirroring high-quality YouTube tutorials.
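The problem-mapping step of the pipeline can be sketched as simple title parsing. Real tutorial titles vary widely, so the patterns and helper names below are illustrative assumptions rather than the pipeline's actual code.

```python
import re
from typing import Optional

def extract_problem_slug(video_title: str) -> Optional[str]:
    """Normalize a title like 'Two Sum - Leetcode 1 - Python' to 'two-sum'."""
    # Keep only the leading segment before any '-' or '|' separator.
    title = re.split(r"\s*[-|]\s*", video_title)[0]
    # Drop any explicit 'Leetcode <id>' marker.
    title = re.sub(r"(?i)\bleetcode\s*\d*\b", "", title).strip()
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or None

def extract_problem_id(video_title: str) -> Optional[int]:
    """Pull an explicit problem number from a 'Leetcode 217'-style marker."""
    m = re.search(r"(?i)leetcode\s*(\d+)", video_title)
    return int(m.group(1)) if m else None
```

Titles that contain hyphens in the problem name itself would need extra handling; matching on the explicit problem number, when present, is the more reliable path.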
| Metric | Description |
|---|---|
| Feedback Accuracy | Human evaluators assess model feedback against expert answers. |
| Question Relevance | Cosine similarity between generated and benchmark question sets. |
| User Improvement | Track user score trends and performance over time. |
| Response Coherence | Measure contextual consistency across multiple feedback turns. |
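For instance, the Question Relevance metric reduces to a cosine similarity between embedding vectors of a generated question and a benchmark question; how those embeddings are produced (e.g., which sentence-embedding model) is left abstract in this sketch.

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```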
| Week | Focus | Planned Tasks |
|---|---|---|
| 1 (10/07–10/13) | Ideation | Finalize proposal, literature review, meet TA |
| 2 (10/14–10/20) | Data & Setup | Collect and preprocess datasets, begin EDA |
| 3 (10/21–10/27) | Model Baseline | Build static LLM chain for question generation and scoring |
| 4 (10/28–11/03) | Adaptive Feedback | Integrate evaluation loops, add personalized question selection |
| 5 (11/04–11/10) | UI Prototyping | Build Flask-based frontend and connect backend models |
| 6 (11/11–11/17) | Midpoint Demo | Internal presentation and debugging |
| 7 (11/18–11/24) | Evaluation | Implement feedback metrics and small user study |
| 8 (11/25–12/01) | Refinement | Optimize prompts, conduct ablation studies |
| 9 (12/02–12/08) | Visualization | Generate analysis plots and performance trends |
| 10 (12/09–12/15) | Finalization | Prepare report, demo, and code documentation |
```
TechnicalInterviewLLM/
├── LLM/                              # FastAPI backend service
│   ├── app.py                        # Main API server (/chat endpoint)
│   ├── src/
│   │   ├── llm_client.py             # Vertex AI Gemini client
│   │   └── problem_retriever.py      # DynamoDB retrieval
│   ├── scripts/
│   │   ├── populate_db.py            # Upload problems to DynamoDB
│   │   ├── generate_finetune_data.py # Create synthetic training dialogues
│   │   └── *.jsonl                   # Training data for fine-tuning
│   └── requirements.txt
│
├── livekit-worker/                   # Node.js worker for voice system
│   ├── src/
│   │   └── index.ts                  # STT/TTS orchestration, session management
│   ├── package.json
│   └── dockerfile
│
├── interv-ai/                        # Next.js web application
│   ├── app/
│   │   ├── practice/[id]/            # Interview session page
│   │   ├── components/
│   │   │   ├── CodeEditor.tsx        # Monaco Editor integration
│   │   │   └── VoiceRecorder.tsx     # Audio controls
│   │   └── api/
│   │       └── token/                # LiveKit JWT generation
│   ├── public/
│   └── package.json
│
├── transcripts/
│   └── video_problem_transcripts.csv # 6.7MB dataset (150+ videos)
│
├── video_pipeline/                   # Data collection scripts
│   ├── pipelines/
│   │   ├── generate_transcripts.py   # Extract YouTube captions
│   │   └── transcript_csv.py         # Convert to CSV format
│   └── config/
│
├── docker-compose.yml                # Multi-service orchestration
└── README.md
```

Prerequisites:
- Python 3.9+
- Node.js 18+
- Docker & Docker Compose
- AWS account (DynamoDB)
- Google Cloud account (STT, Vertex AI)
- ElevenLabs account (TTS)
- LiveKit server (or LiveKit Cloud)
1. Set up environment variables:

   ```bash
   # LLM/.env
   AWS_REGION=us-east-2
   DYNAMODB_TABLE_NAME=Orbit_Interview_Questions
   GOOGLE_APPLICATION_CREDENTIALS=./service_account.json

   # livekit-worker/.env
   LIVEKIT_URL=ws://localhost:7880
   LIVEKIT_API_KEY=your_key
   LIVEKIT_API_SECRET=your_secret
   GOOGLE_APPLICATION_CREDENTIALS=./keys/service_account.json
   ELEVENLABS_API_KEY=your_elevenlabs_api_key

   # interv-ai/.env
   NEXT_PUBLIC_LIVEKIT_URL=ws://localhost:7880
   LIVEKIT_API_KEY=your_key
   LIVEKIT_API_SECRET=your_secret
   ```

2. Start all services:

   ```bash
   docker-compose up
   ```

3. Access the app:
   - Frontend: http://localhost:3000
   - LLM API: http://localhost:8000
   - LiveKit Worker: http://localhost:8080
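Once the stack is up, the LLM API's `/chat` endpoint can be exercised directly as a quick smoke test. The request fields (`problem_id`, `message`) are assumptions about the endpoint's schema, which isn't documented here.

```python
import json
import urllib.request

def build_chat_request(message, problem_id, base_url="http://localhost:8000"):
    """Construct a POST to /chat; the JSON fields are assumed, not documented."""
    payload = json.dumps({"problem_id": problem_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(message, problem_id):
    """Send the request and decode the JSON reply (requires services running)."""
    with urllib.request.urlopen(build_chat_request(message, problem_id)) as resp:
        return json.loads(resp.read())

# Example (with `docker-compose up` running):
#   chat("Can you give me a hint?", "two-sum")
```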
- Nguyen, T. T. H., et al. (2025). SimInterview: Transforming Business Education through LLM-Based Simulated Multilingual Interview Training System. arXiv:2508.11873
- Maity, S., Deroy, A., & Sarkar, S. (2025). Towards Smarter Hiring: Are Zero-Shot and Few-Shot Pre-trained LLMs Ready for HR Spoken Interview Transcript Analysis? arXiv:2504.05683
- Yazdani, N., Mahajan, A., & Ansari, A. (2025). Zara: An LLM-Based Candidate Interview Feedback System. arXiv:2507.02869

