122 extract user info from cv #133
🚅 Deployed to the TalkUp.AI-pr-133 environment in talk-up-ai
…act-usefull-information-from-the-opening-job
Review
The coverage gate genuinely fails (branch coverage ~71% < 80% → unit-tests.yml red), the SSRF in uploadJobOffer is real, multer.config.ts is dead code, and the Bruno collection + GROQ_API_KEY docs are missing.
I've pushed fixes for all blockers + safe cleanups (see commit). Summary of what changed:
Blockers
- 🔴 SSRF: added `common/utils/urlGuard.ts` (`isSafeFetchUrl`), which rejects non-http(s) schemes and private/loopback/link-local/CGNAT targets, incl. cloud metadata `169.254.169.254` and IPv4-mapped IPv6 (`::ffff:…`, which `new URL()` normalizes to hex; a bypass I caught while writing the guard's tests). Wired in before any scraping. Residual risk: DNS rebinding (public host → private IP) still needs resolve-then-pin at fetch time; flagged as follow-up in the file.
- 🔴 Coverage gate: added real specs for `JobOfferExtraction` (axios/puppeteer mocked, retry/fallback/short-content branches) + `urlGuard`, plus sparse-field & truncation cases for the service. Branch coverage now 81.45% > 80%, 566 tests green.
- 🔴 Dead `multer.config.ts`: deleted (zero refs; controller uses `FileInterceptor`).
- 🔴 Bruno: added `users/uploadCV.bru` + `users/uploadJobOffer.bru` + folder.
- 🔴 `GROQ_API_KEY`: documented in `server/.env.example`.
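For readers who want to see the shape of such a guard, here is a minimal sketch of the checks listed under the SSRF item. It is not the contents of `urlGuard.ts`: the helper names `isPrivateIPv4` and `mappedToIPv4` and the exact ranges covered are assumptions made for illustration. It relies on `new URL()` normalizing IPv4 hosts to dotted decimal and IPv4-mapped IPv6 hosts to hex (for example `[::ffff:7f00:1]`), which is the bypass mentioned above.

```typescript
// Hypothetical sketch of an SSRF guard; not the PR's actual urlGuard.ts.

// True for loopback, private, link-local, CGNAT and "this network" IPv4 ranges.
function isPrivateIPv4(ip: string): boolean {
  const parts = ip.split(".").map(Number);
  const a = parts[0] ?? 0;
  const b = parts[1] ?? 0;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 100 && b >= 64 && b <= 127) || // CGNAT 100.64.0.0/10
    (a === 169 && b === 254) || // link-local, incl. 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Converts the hex form of an IPv4-mapped address ("::ffff:7f00:1") to dotted IPv4.
function mappedToIPv4(v6: string): string | null {
  const m = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/.exec(v6);
  if (m === null) return null;
  const hi = parseInt(m[1] ?? "0", 16);
  const lo = parseInt(m[2] ?? "0", 16);
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join(".");
}

function isSafeFetchUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;

  if (host.startsWith("[")) {
    // IPv6 literal: hostname keeps the brackets, e.g. "[::1]".
    const v6 = host.slice(1, -1);
    const mapped = mappedToIPv4(v6);
    if (mapped !== null) return !isPrivateIPv4(mapped);
    // Loopback, unspecified, unique-local (fc00::/7), link-local (fe80::/10).
    return !(v6 === "::1" || v6 === "::" || /^f[cd]/.test(v6) || /^fe[89ab]/.test(v6));
  }
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host)) return !isPrivateIPv4(host);

  // Plain hostname: allowed here, so DNS rebinding is NOT covered by this check.
  return true;
}
```

As the review notes, a check like this only inspects the URL string. A public hostname that resolves to a private address would still pass, which is why resolve-then-pin at fetch time remains a follow-up.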
Security / reliability
- `@Throttle({ limit: 5, ttl: 60000 })` on both upload routes (expensive: PDF parse + LLM + scraping).
- CV text now truncated to 8000 chars before the LLM (matches the job-offer path).
- Puppeteer `executablePath: "/usr/bin/google-chrome"` removed → use bundled Chromium (override via `PUPPETEER_EXECUTABLE_PATH`).
Consistency / cleanup
- Top-level ESM imports for `groq-sdk` / `pdf-parse-debugging-disabled` (+ a `.d.ts` shim so the import is typed under `noImplicitAny`); Groq client is now a single injected-field instance, not per-request.
- `console.log/error` → Nest `Logger` throughout.
- `missions` column `type: "json"` → `simple-array` to match sibling array fields.
- Removed unused `Put` import; added missing Swagger decorators on `uploadJobOffer`.
Deferred: the @Req/@Res + (req as any).userId → @CurrentUser()/@UploadedFile()/return-DTO refactor and a shared extractWithGroq<T> helper. That's the real simplification (~290 dup lines → ~120) but it rewrites the service spec...
…fer upload
- add isSafeFetchUrl guard (scheme + private/loopback/link-local/ipv4-mapped-ipv6) before job-offer scraping
- delete unused multer.config.ts (controller uses FileInterceptor)
- top-level groq-sdk/pdf-parse imports + singleton groq client + type shim
- throttle upload routes, truncate cv text to 8000 chars, fix empty-check null-deref
- nest Logger over console, drop puppeteer hardcoded chrome path, missions simple-array
- add bruno files + GROQ_API_KEY to .env.example
- add JobOfferExtraction/urlGuard specs + sparse-field/ssrf cases (branch cov 81.45%)
Follow-up to the review of the initial upload implementation.
- controller uses @CurrentUser/@UploadedFile/@Body(UploadJobOfferDto) and returns a value instead of raw @Req/@Res; drops the (req as any).userId casts
- service throws BadRequest/InternalServerError instead of res.status().json()
- extract shared extractWithGroq<T>() + generic upsertByUser() helpers, collapsing the ~95% duplication between uploadCV and uploadJobOffer
- service methods take (userId, file|url) and return { message }
- rewrite specs to the new signatures; branch coverage 84%
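To illustrate the idea behind a shared `extractWithGroq<T>()` helper, here is a rough sketch under stated assumptions. The signature, the `Complete` callback type (standing in for the real Groq call) and the error messages are hypothetical; only the 8000-character limit and the empty-input check come from the notes above.

```typescript
// Hypothetical sketch; the real helper's signature is not shown in this PR page.
const MAX_PROMPT_CHARS = 8000; // truncation limit mentioned in the review

// Stand-in for the LLM call: takes a prompt, resolves to the raw model output.
type Complete = (prompt: string) => Promise<string>;

async function extractWithGroq<T>(
  text: string | null | undefined,
  instructions: string,
  complete: Complete,
): Promise<T> {
  // Guard against null/empty input before touching the string.
  const trimmed = (text ?? "").trim();
  if (trimmed.length === 0) {
    throw new Error("Document is empty");
  }

  // Same truncation for CVs and job offers, so both paths share one limit.
  const truncated = trimmed.slice(0, MAX_PROMPT_CHARS);
  const raw = await complete(`${instructions}\n\n${truncated}`);

  try {
    return JSON.parse(raw) as T; // caller picks the shape via the type parameter
  } catch {
    throw new Error("LLM returned output that is not valid JSON");
  }
}
```

With a helper along these lines, the CV and job-offer paths would differ only in the instructions and in the type parameter, which is where the duplication reported above would collapse.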
Resolve add/add conflict in server/.env.example (keep both AI_SERVER_URL and GROQ_API_KEY blocks). Also fix the Railway deploy failure: the Groq client was instantiated as a class field, so a missing GROQ_API_KEY threw at app bootstrap and crashed the container. Build it lazily on first use instead — boot no longer depends on the key; the upload routes surface a 500 if it is unset.
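The lazy-initialization fix can be sketched as follows. `FakeGroqClient` and `ExtractionService` are hypothetical stand-ins, not the project's classes; the only behavior assumed of the real SDK client is that constructing it without a key throws, as the deploy failure described above suggests.

```typescript
// Stand-in for the real SDK client (hypothetical); assumed to throw on a missing key.
class FakeGroqClient {
  readonly apiKey: string;

  constructor(apiKey: string | undefined) {
    if (!apiKey) throw new Error("GROQ_API_KEY is not set");
    this.apiKey = apiKey;
  }
}

type Env = Record<string, string | undefined>;

class ExtractionService {
  // Deliberately NOT initialized as a class field: that would run at bootstrap.
  private client: FakeGroqClient | undefined;
  private readonly env: Env;

  constructor(env: Env) {
    this.env = env; // nothing key-dependent happens at construction time
  }

  // Builds the client on first use and reuses it afterwards.
  private getClient(): FakeGroqClient {
    if (this.client === undefined) {
      this.client = new FakeGroqClient(this.env["GROQ_API_KEY"]);
    }
    return this.client;
  }

  extract(text: string): string {
    this.getClient(); // throws here, not at bootstrap, when the key is unset
    return `extracted ${text.length} chars`;
  }
}
```

Constructing `new ExtractionService({})` succeeds, so the application can boot without the key; the error only appears when `extract` is first called, which is the request-time failure the commit message describes.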
322e0d4 to
122c14d
What type of PR is this? (check all applicable)
Description
Add backend support for uploading the user's CV.
The backend receives the user's CV from the frontend, parses it, and extracts the key information: the desired job, experience, technical skills, and degree.
Linked GitHub Ticket
Closes EpitechPromo2027/G-EIP-600-NAN-6-1-eip-tugdual.de-reviers#122
Workspace
Screenshots