theamalsebastian/AdvancedJob_Search

AI Job Search Assistant 🔍

A full-stack, RAG-powered career platform that helps job seekers find relevant openings, understand how their skills stack up, and check whether their resume will survive an ATS (Applicant Tracking System) scan, all through a natural-language chat interface.

🔗 Live demo: https://advanced-job-search.vercel.app (Note: the backend runs on a free-tier host and may take ~30s to wake up on the first request)


What it does

  • Chat-based job search: ask "What jobs match my Python and ML skills?" and get a grounded answer plus matching job cards, generated via a RAG pipeline over live job postings.
  • Resume parsing: upload a PDF resume; the app extracts 200+ technical skills across 7 categories, estimates years of experience, and pulls contact info.
  • ATS scoring: get a 0–100 score across five dimensions (formatting, section completeness, action verbs, quantified achievements, keyword match against a job description) with specific, actionable suggestions.
  • Live job board: semantic search over indexed postings, refreshed on demand from a free job-board API.
  • Analytics dashboard: in-demand skills, job source breakdown, and search activity over time, computed from real Postgres data.
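
To illustrate the resume-parsing feature, here is a minimal sketch of dictionary-based skill extraction. The category names follow this README, but the `SKILL_ONTOLOGY` contents and the `extract_skills` function are hypothetical stand-ins, not the project's actual dictionary or code.

```python
import re

# Hypothetical miniature ontology: category names follow the README, but the
# skills listed here are illustrative, not the project's actual dictionary.
SKILL_ONTOLOGY = {
    "languages": ["Python", "TypeScript", "SQL"],
    "ML/AI": ["PyTorch", "scikit-learn", "RAG"],
    "frameworks": ["FastAPI", "Next.js", "React"],
    "cloud/DevOps": ["Docker", "AWS", "Kubernetes"],
}


def extract_skills(text: str, ontology: dict = SKILL_ONTOLOGY) -> dict:
    """Return the ontology skills found in `text`, grouped by category."""
    found = {}
    for category, skills in ontology.items():
        hits = []
        for skill in skills:
            # Lookarounds instead of \b so names containing punctuation
            # (for example "Next.js") still match as whole tokens.
            pattern = r"(?<![A-Za-z0-9])" + re.escape(skill) + r"(?![A-Za-z0-9])"
            if re.search(pattern, text, flags=re.IGNORECASE):
                hits.append(skill)
        if hits:
            found[category] = hits
    return found
```

Because the lookup is a plain dictionary scan, the same resume text always yields the same skills, and no LLM call is needed for this step.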

Architecture

┌────────────────────────────────┐
│   Next.js 14 + TypeScript      │   Vercel
│   Tailwind CSS v4              │
└────────────────┬───────────────┘
                 │ REST (JSON)
┌────────────────▼───────────────┐
│        FastAPI backend         │   Render (Docker)
│  ├─ /api/chat   (RAG)          │
│  ├─ /api/jobs   (search/scrape)│
│  ├─ /api/resume (parsing)      │
│  ├─ /api/ats    (scoring)      │
│  └─ /api/analytics             │
└──────┬──────────────┬──────────┘
       │              │
┌──────▼──────┐ ┌─────▼──────────┐
│  Qdrant     │ │  PostgreSQL    │
│  (vectors)  │ │  (Neon)        │
└─────────────┘ └────────────────┘
       │
┌──────▼───────────┐
│  Groq API        │
│  (Llama 3.3-70B) │
└──────────────────┘
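
The /api/chat path in the diagram follows a retrieve-then-generate pattern. The sketch below shows that pattern in plain Python under stated assumptions: `cosine`, `retrieve`, `build_prompt`, and the `JOBS` data are hypothetical, the in-memory ranking stands in for the Qdrant search, and the resulting prompt is what would be sent to the LLM.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list, indexed_jobs: list, top_k: int = 2) -> list:
    """Stand-in for the vector search: rank jobs by similarity to the query."""
    ranked = sorted(
        indexed_jobs,
        key=lambda job: cosine(query_vec, job["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, jobs: list) -> str:
    """Assemble a grounded prompt: retrieved postings first, then the question."""
    context = "\n".join(f"- {job['title']}: {job['description']}" for job in jobs)
    return (
        "Answer using only the job postings below.\n"
        f"Postings:\n{context}\n"
        f"Question: {question}"
    )


# Toy 3-dimensional vectors; the real embeddings have 384 dimensions.
JOBS = [
    {"title": "ML Engineer", "description": "Python, model training", "vector": [0.9, 0.1, 0.0]},
    {"title": "Frontend Developer", "description": "React, CSS", "vector": [0.0, 0.2, 0.9]},
    {"title": "Data Engineer", "description": "Python, SQL pipelines", "vector": [0.7, 0.6, 0.1]},
]
```

Calling `retrieve([1.0, 0.2, 0.0], JOBS)` and passing the result to `build_prompt` yields a prompt that contains only the closest postings, which is what keeps the generated answer grounded.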

Tech stack

Frontend

  • Next.js 14 (App Router) + TypeScript
  • Tailwind CSS v4
  • Recharts (analytics charts)
  • lucide-react (icons)

Backend

  • FastAPI + Pydantic
  • SQLAlchemy ORM
  • Groq API (Llama 3.3-70B) for RAG responses
  • Qdrant for vector search
  • FastEmbed (BAAI/bge-small-en-v1.5) for embeddings, chosen for its small footprint to fit free-tier memory limits
  • pdfplumber for resume PDF parsing
  • BeautifulSoup for job description cleanup

Data & infrastructure

  • PostgreSQL via Neon (serverless Postgres)
  • Qdrant Cloud (vector database)
  • Docker (backend containerization)
  • Render (backend hosting)
  • Vercel (frontend hosting)
  • Arbeitnow public API (job postings source)

Project structure

.
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI app, CORS, startup hooks
│   │   ├── config.py            # env-based settings
│   │   ├── db/
│   │   │   ├── database.py      # SQLAlchemy engine/session
│   │   │   └── models.py        # Job, Resume, ATSScore, SearchLog
│   │   ├── routers/
│   │   │   ├── jobs.py          # scrape, search, list, stats
│   │   │   ├── resume.py        # upload, parse
│   │   │   ├── ats.py           # ATS scoring
│   │   │   ├── chat.py          # RAG chat endpoint
│   │   │   └── analytics.py     # dashboard aggregates
│   │   ├── services/
│   │   │   ├── scraper.py       # Arbeitnow job scraper
│   │   │   ├── embedder.py      # resume parsing + skill ontology
│   │   │   ├── ats_scorer.py    # ATS scoring logic
│   │   │   ├── reranker.py      # cross-encoder reranking (optional)
│   │   │   └── rag.py           # RAG orchestration (Groq + Qdrant)
│   │   └── vectorstore/
│   │       └── qdrant_client.py # embeddings + vector search
│   ├── requirements.txt
│   └── Dockerfile
│
└── frontend/
    ├── src/
    │   ├── app/
    │   │   ├── page.tsx          # chat page
    │   │   ├── jobs/page.tsx     # job board
    │   │   ├── resume/page.tsx   # resume upload + ATS
    │   │   ├── analytics/page.tsx
    │   │   └── globals.css
    │   ├── components/
    │   │   ├── Navbar.tsx
    │   │   ├── ChatWindow.tsx
    │   │   ├── JobCard.tsx
    │   │   ├── ResumeUpload.tsx
    │   │   ├── ATSScoreCard.tsx
    │   │   └── AnalyticsDashboard.tsx
    │   └── lib/api.ts            # typed API client
    ├── tailwind.config.js
    └── postcss.config.mjs

API overview

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/chat | POST | RAG chat: semantic search + LLM-generated answer |
| /api/jobs/scrape | POST | Scrape job postings and index into Qdrant + Postgres |
| /api/jobs/search | POST | Semantic/hybrid search over indexed jobs |
| /api/jobs/list | GET | Browse indexed jobs |
| /api/jobs/stats | GET | Index + database stats |
| /api/resume/upload | POST | Upload and parse a resume PDF |
| /api/resume/{id} | GET | Retrieve a parsed resume profile |
| /api/ats/score | POST | Run ATS scoring against an optional job description |
| /api/analytics | GET | Aggregate stats for the dashboard |

Full interactive API docs are available at <backend-url>/docs (Swagger UI).


Running locally

Prerequisites

Backend

cd backend
python -m venv venv
source venv/bin/activate      # Windows: venv\Scripts\activate
pip install -r requirements.txt

# create .env with:
# DATABASE_URL=postgresql://...
# QDRANT_URL=https://...
# QDRANT_API_KEY=...
# GROQ_API_KEY=gsk_...
# FRONTEND_URL=http://localhost:3000

uvicorn app.main:app --reload

The backend runs at http://localhost:8000, with Swagger docs at /docs.

Frontend

cd frontend
npm install

# create .env.local with:
# NEXT_PUBLIC_API_URL=http://localhost:8000

npm run dev

Frontend runs at http://localhost:3000.

Seeding job data

With the backend running, call POST /api/jobs/scrape via /docs with a body like:

{
  "queries": ["python", "developer", "engineer"],
  "location": "",
  "max_per_query": 15
}
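
As an alternative to the Swagger UI, the same request can be sent from a script. The following is a minimal sketch using only the standard library; the helper names `build_scrape_request` and `seed_jobs` are hypothetical, and it assumes the endpoint replies with JSON (the FastAPI default), which this README does not spell out.

```python
import json
import urllib.request


def build_scrape_request(
    base_url: str,
    queries: list,
    location: str = "",
    max_per_query: int = 15,
) -> urllib.request.Request:
    """Build (but do not send) the POST /api/jobs/scrape request."""
    body = {"queries": queries, "location": location, "max_per_query": max_per_query}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/jobs/scrape",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def seed_jobs(base_url: str = "http://localhost:8000") -> dict:
    """Send the request and return the decoded response (assumed to be JSON)."""
    request = build_scrape_request(base_url, ["python", "developer", "engineer"])
    # Scraping and indexing can be slow, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running locally, `seed_jobs()` sends the same body shown above.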

Deployment

  • Frontend: deployed on Vercel, root directory frontend, env var NEXT_PUBLIC_API_URL pointing at the backend.
  • Backend: deployed on Render as a Docker web service, root directory backend. Environment variables: DATABASE_URL, QDRANT_URL, QDRANT_API_KEY, GROQ_API_KEY, FRONTEND_URL.
  • Vector DB: Qdrant Cloud free-tier cluster.
  • Database: Neon serverless Postgres free tier.

Notes on free-tier constraints

  • Render's free tier (512MB RAM) cannot run sentence-transformers/torch, so embeddings use FastEmbed with an ONNX-based model instead.
  • Hybrid (keyword + vector) search and cross-encoder reranking are implemented but disabled by default in production to stay within memory limits; pure semantic search is used instead.
  • The backend spins down after ~15 minutes of inactivity on Render's free tier, so the first request after idle may take 30-60 seconds.
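
A client can absorb the cold-start delay by probing the backend with exponential backoff before sending real requests. This is a sketch under stated assumptions: `wait_until_awake` and `http_probe` are hypothetical helpers, and probing /docs is just one convenient choice of URL.

```python
import time
import urllib.request
from typing import Callable


def wait_until_awake(
    probe: Callable[[], bool],
    attempts: int = 6,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ... with the defaults
    return False


def http_probe(url: str) -> Callable[[], bool]:
    """Return a probe that reports whether `url` answers with HTTP 200."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.status == 200
        except OSError:  # covers URLError, HTTPError, and socket timeouts
            return False
    return probe
```

With the defaults, the waits add up to about a minute (2 + 4 + 8 + 16 + 32 seconds), which covers the 30-60 second wake-up window. For example: `wait_until_awake(http_probe("http://localhost:8000/docs"))`.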

Key design decisions

  • FastEmbed over sentence-transformers: drops the torch dependency entirely, fitting comfortably within 512MB RAM while keeping a 384-dimension embedding model (BAAI/bge-small-en-v1.5).
  • Skill ontology-based parsing: rather than relying solely on an LLM for resume parsing, a curated dictionary of 200+ skills across 7 categories (languages, ML/AI, frameworks, data, backend, cloud/DevOps, tools) enables fast, deterministic, and free skill extraction.
  • Weighted ATS scoring: the five scoring dimensions are combined with configurable weights, and keyword matching dynamically re-weights when a job description is provided versus when it isn't.
  • Decoupled architecture: frontend and backend are independently deployable services communicating over a typed REST API, mirroring real-world microservice patterns.
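
The weighted ATS scoring idea can be sketched as follows. The weight values and the `ats_score` function are hypothetical, since this README names the five dimensions but not their weights, and dropping the keyword dimension when no job description is given is one plausible reading of "re-weights", not a confirmed detail.

```python
# Hypothetical integer weights (summing to 100); the README names the five
# dimensions but does not state their actual values.
WEIGHTS = {
    "formatting": 20,
    "section_completeness": 20,
    "action_verbs": 15,
    "quantified_achievements": 15,
    "keyword_match": 30,
}


def ats_score(dimension_scores: dict, has_job_description: bool) -> float:
    """Combine per-dimension scores (each 0-100) into one 0-100 score."""
    weights = dict(WEIGHTS)
    if not has_job_description:
        # Nothing to match keywords against, so drop that dimension and let
        # the remaining four share its weight proportionally.
        weights.pop("keyword_match")
    total = sum(weights.values())
    weighted = sum(dimension_scores[name] * w for name, w in weights.items())
    return round(weighted / total, 1)
```

Dividing by the sum of the remaining weights keeps the result on the same 0-100 scale whether or not a job description is supplied.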

Future improvements

  • Re-enable hybrid search + reranking on a higher-memory tier
  • Add authentication and per-user saved searches / resume history
  • Scheduled job-index refresh via GitHub Actions
  • LinkedIn job source integration
  • Cover letter generation from resume + job description

License

MIT
