122 extract user info from cv #133

Merged
BhuvanArn merged 22 commits into staging from 122-extract-user-info-from-cv
Jun 24, 2026
Conversation

@eregine eregine commented May 15, 2026

Collaborator

What type of PR is this? (check all applicable)

  • ✨ Feature
  • 🛑 Bug
  • ⚠️ Anomaly
  • 📝 Doc
  • 🎨 Style
  • 🧑‍💻 Refactor
  • 🛠️ Setup
  • 🏗️ Build
  • 🔥 Perfs
  • ✅ Test
  • 🔁 CI
  • ⏩ Revert

Description

Add CV upload support on the backend.

The backend receives the user's CV from the frontend, parses it, and extracts the key information: desired job, experience, technical skills, and degree.

Linked GitHub Ticket

Closes EpitechPromo2027/G-EIP-600-NAN-6-1-eip-tugdual.de-reviers#122

Workspace

  • 🖥️ Web
  • 🛠️ Server
  • 🔁 CI
  • 🤖 Ai
  • 📱 App

Screenshots

@eregine eregine self-assigned this May 15, 2026

vercel Bot commented May 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| talk-up-ai-dev | Ready | Preview, Comment | Jun 24, 2026 2:09am |

railway-app Bot commented May 15, 2026

🚅 Deployed to the TalkUp.AI-pr-133 environment in talk-up-ai

| Service | Status | Web | Updated (UTC) |
| --- | --- | --- | --- |
| Backend | ✅ Success (View Logs) | | Jun 24, 2026 at 2:10 am |

@railway-app railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 May 15, 2026 09:27 Destroyed

@BhuvanArn BhuvanArn left a comment

Owner

Review

The coverage gate genuinely fails (branch coverage ~71% < 80% → unit-tests.yml red), the SSRF in uploadJobOffer is real, multer.config.ts is dead code, and the Bruno collection + GROQ_API_KEY docs are missing.

I've pushed fixes for all blockers + safe cleanups (see commit). Summary of what changed:

Blockers

  • 🔴 SSRF: added common/utils/urlGuard.ts (isSafeFetchUrl) — rejects non-http(s) schemes and private/loopback/link-local/CGNAT targets incl. cloud metadata 169.254.169.254 and IPv4-mapped IPv6 (::ffff:…, which new URL() normalizes to hex — a bypass I caught while writing the guard's tests). Wired in before any scraping. Residual risk: DNS-rebinding (public host → private IP) still needs resolve-then-pin at fetch time — flagged as follow-up in the file.
  • 🔴 Coverage gate: added real specs for JobOfferExtraction (axios/puppeteer mocked, retry/fallback/short-content branches) + urlGuard, plus sparse-field & truncation cases for the service. Branch coverage now 81.45% > 80%, 566 tests green.
  • 🔴 Dead multer.config.ts: deleted (zero refs; controller uses FileInterceptor).
  • 🔴 Bruno: added users/uploadCV.bru + users/uploadJobOffer.bru + folder.
  • 🔴 GROQ_API_KEY: documented in server/.env.example.
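
The SSRF guard described in the first blocker could look roughly like the sketch below. This is an illustration under assumptions, not the contents of common/utils/urlGuard.ts: only the name isSafeFetchUrl comes from the review, while the helper names (parseIpv4, isPrivateIpv4, isPrivateIpv6) and the exact set of blocked ranges are hypothetical. It relies on the WHATWG URL parser normalizing IPv4-mapped IPv6 hosts to hex, which is the bypass mentioned above.

```typescript
// Hypothetical sketch of a URL guard; helper names are illustrative only.

/** Parse a dotted-quad IPv4 host into four octets, or null if it is not one. */
function parseIpv4(host: string): number[] | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return null;
  const octets = m.slice(1).map(Number);
  return octets.every((o) => o <= 255) ? octets : null;
}

/** True for "this network", private, loopback, CGNAT and link-local IPv4 ranges. */
function isPrivateIpv4([a, b]: number[]): boolean {
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 100 && b >= 64 && b <= 127) || // CGNAT 100.64.0.0/10
    (a === 169 && b === 254) || // link-local, incl. 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

/** `host` is a normalized IPv6 literal without brackets, in lower case. */
function isPrivateIpv6(host: string): boolean {
  if (host === "::" || host === "::1") return true; // unspecified, loopback
  if (/^f[cd][0-9a-f]{2}:/.test(host)) return true; // fc00::/7 unique local
  if (/^fe[89ab][0-9a-f]:/.test(host)) return true; // fe80::/10 link-local
  // new URL() rewrites ::ffff:127.0.0.1 as ::ffff:7f00:1, so decode the hex groups.
  const mapped = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/.exec(host);
  if (mapped) {
    const hi = parseInt(mapped[1], 16);
    const lo = parseInt(mapped[2], 16);
    return isPrivateIpv4([hi >> 8, hi & 0xff, lo >> 8, lo & 0xff]);
  }
  return false;
}

/** Returns true only for http(s) URLs whose literal host is not a private target. */
function isSafeFetchUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not parseable as an absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return !isPrivateIpv6(host.slice(1, -1));
  const v4 = parseIpv4(host);
  return v4 === null || !isPrivateIpv4(v4);
}
```

A check like this only inspects the literal host in the URL. As noted in the review, a public hostname that resolves to a private address (DNS rebinding) passes it, so the resolved IP would still need to be checked and pinned at fetch time.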

Security / reliability

  • @Throttle({ limit: 5, ttl: 60000 }) on both upload routes (expensive: PDF parse + LLM + scraping).
  • CV text now truncated to 8000 chars before the LLM (matches the job-offer path).
  • Puppeteer executablePath: "/usr/bin/google-chrome" removed → use bundled Chromium (override via PUPPETEER_EXECUTABLE_PATH).
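
The truncation step from the list above can be sketched as a small helper. The 8000-character limit comes from the review; the function name truncateForLlm, the constant name, and the whitespace collapsing are assumptions added for illustration, and the real service may simply slice the raw text.

```typescript
// Hypothetical helper: cap the text sent to the LLM at a fixed number of characters.
const MAX_LLM_INPUT_CHARS = 8000; // limit taken from the review comment

function truncateForLlm(text: string, maxChars: number = MAX_LLM_INPUT_CHARS): string {
  // Assumption: collapse runs of whitespace first so PDF layout padding
  // does not use up the character budget.
  const normalized = text.replace(/\s+/g, " ").trim();
  // slice() counts UTF-16 code units, so a surrogate pair at the cut point
  // could be split; acceptable for a prompt-size cap.
  return normalized.length <= maxChars ? normalized : normalized.slice(0, maxChars);
}
```

Capping the input bounds both the LLM cost and the latency per request, which complements the per-route throttle.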

Consistency / cleanup

  • Top-level ESM imports for groq-sdk / pdf-parse-debugging-disabled (+ a .d.ts shim so the import is typed under noImplicitAny); Groq client is now a single injected-field instance, not per-request.
  • console.log/error → Nest Logger throughout.
  • missions column type: "json" → simple-array to match sibling array fields.
  • Removed unused Put import; added missing Swagger decorators on uploadJobOffer.

Deferred: the @Req/@Res + (req as any).userId → @CurrentUser()/@UploadedFile()/return-DTO refactor and a shared extractWithGroq<T> helper. That's the real simplification (~290 dup lines → ~120) but it rewrites the service spec...

Comment thread server/src/modules/users/users.service.ts Outdated
Comment thread server/src/modules/users/users.service.ts Outdated
Comment thread server/src/modules/users/users.service.ts Outdated
Comment thread server/src/modules/users/users.controller.ts Outdated
Comment thread server/src/common/middleware/multer.config.ts Outdated
Comment thread server/src/common/utils/JobOfferExtraction.ts Outdated
Comment thread server/src/entities/userJobOffer.entity.ts Outdated
…fer upload

- add isSafeFetchUrl guard (scheme + private/loopback/link-local/ipv4-mapped-ipv6) before job-offer scraping
- delete unused multer.config.ts (controller uses FileInterceptor)
- top-level groq-sdk/pdf-parse imports + singleton groq client + type shim
- throttle upload routes, truncate cv text to 8000 chars, fix empty-check null-deref
- nest Logger over console, drop puppeteer hardcoded chrome path, missions simple-array
- add bruno files + GROQ_API_KEY to .env.example
- add JobOfferExtraction/urlGuard specs + sparse-field/ssrf cases (branch cov 81.45%)
Follow-up to the review of the initial upload implementation.

- controller uses @CurrentUser/@UploadedFile/@Body(UploadJobOfferDto) and
  returns a value instead of raw @Req/@Res; drops the (req as any).userId casts
- service throws BadRequest/InternalServerError instead of res.status().json()
- extract shared extractWithGroq<T>() + generic upsertByUser() helpers,
  collapsing the ~95% duplication between uploadCV and uploadJobOffer
- service methods take (userId, file|url) and return { message }
- rewrite specs to the new signatures; branch coverage 84%
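
A shared extraction helper of the kind this commit describes might be shaped like the sketch below. It is not the project's extractWithGroq<T>: the name extractStructured, the Completer type, and the CvInfo fields are hypothetical, and the LLM call is injected as a plain function so the example does not depend on the groq-sdk API.

```typescript
// Hypothetical sketch: one generic helper shared by the CV and job-offer paths.

/** Stand-in for the LLM call; the real code would wrap the Groq client here. */
type Completer = (prompt: string) => Promise<string>;

/** Ask the model for JSON, parse it, and validate the shape before returning it. */
async function extractStructured<T>(
  complete: Completer,
  prompt: string,
  isValid: (value: unknown) => value is T,
): Promise<T> {
  const raw = await complete(prompt);
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("LLM response was not valid JSON");
  }
  if (!isValid(parsed)) {
    throw new Error("LLM response did not match the expected shape");
  }
  return parsed;
}

/** Illustrative result type; the field names are assumptions. */
interface CvInfo {
  desiredJob: string;
  skills: string[];
}

/** Runtime type guard for CvInfo. */
function isCvInfo(v: unknown): v is CvInfo {
  if (typeof v !== "object" || v === null) return false;
  const o = v as Record<string, unknown>;
  return (
    typeof o.desiredJob === "string" &&
    Array.isArray(o.skills) &&
    o.skills.every((s) => typeof s === "string")
  );
}
```

With this shape, each upload path supplies only its own prompt and type guard, which is where the duplication between the two methods would collapse. In a Nest service the plain Error throws would presumably become the BadRequest/InternalServerError exceptions mentioned above.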
@railway-app railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 21, 2026 06:15 Destroyed
Resolve add/add conflict in server/.env.example (keep both AI_SERVER_URL and
GROQ_API_KEY blocks).

Also fix the Railway deploy failure: the Groq client was instantiated as a
class field, so a missing GROQ_API_KEY threw at app bootstrap and crashed the
container. Build it lazily on first use instead — boot no longer depends on the
key; the upload routes surface a 500 if it is unset.
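
The lazy-initialization fix described in this commit can be illustrated as follows. This is a minimal sketch under assumptions: LlmClient is a stand-in for the real Groq client, LazyLlmClient is a hypothetical name, and the environment is passed in explicitly (in Node one would pass process.env) so the example stays self-contained.

```typescript
// Hypothetical sketch: build the client on first use instead of at construction.

/** Stand-in for the real SDK client. */
interface LlmClient {
  apiKey: string;
}

class LazyLlmClient {
  private client?: LlmClient;
  private readonly env: Record<string, string | undefined>;

  // Constructing the holder never reads the key, so app bootstrap cannot fail here.
  constructor(env: Record<string, string | undefined>) {
    this.env = env;
  }

  /** Create the client on first call, then reuse the same instance. */
  get(): LlmClient {
    if (!this.client) {
      const apiKey = this.env.GROQ_API_KEY;
      if (!apiKey) {
        // In the service this would surface as a 500 on the upload routes.
        throw new Error("GROQ_API_KEY is not set");
      }
      this.client = { apiKey };
    }
    return this.client;
  }
}
```

The trade-off is that a missing key is detected on the first request rather than at startup, so a startup log warning or health check may still be worth adding.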
@railway-app railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 24, 2026 02:09 Destroyed
@BhuvanArn BhuvanArn self-requested a review June 24, 2026 03:22
@BhuvanArn BhuvanArn merged commit 051c822 into staging Jun 24, 2026
10 checks passed
@BhuvanArn BhuvanArn deleted the 122-extract-user-info-from-cv branch June 24, 2026 06:26
2 participants