122 extract user info from cv by eregine · Pull Request #133 · BhuvanArn/TalkUp.AI

eregine · 2026-05-15T09:27:37Z

What type of PR is this? (check all applicable)

Description

Add backend support for uploading the user's CV.

The backend receives the user's CV from the frontend, parses it, and extracts the key information: the desired job, experience, technical skills, and degree.

Linked GitHub Ticket

Closes EpitechPromo2027/G-EIP-600-NAN-6-1-eip-tugdual.de-reviers#122

Workspace

Screenshots

vercel · 2026-05-15T09:27:43Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
talk-up-ai-dev	Ready	Preview, Comment	Jun 24, 2026 2:09am

railway-app · 2026-05-15T09:27:46Z

🚅 Deployed to the TalkUp.AI-pr-133 environment in talk-up-ai

Service	Status	Web	Updated (UTC)
Backend	✅ Success (View Logs)		Jun 24, 2026 at 2:10 am

BhuvanArn

Review

The coverage gate genuinely fails (branch coverage ~71% < 80% → unit-tests.yml red), the SSRF in uploadJobOffer is real, multer.config.ts is dead code, and the Bruno collection + GROQ_API_KEY docs are missing.

I've pushed fixes for all blockers + safe cleanups (see commit). Summary of what changed:

Blockers

🔴 SSRF: added common/utils/urlGuard.ts (isSafeFetchUrl) — rejects non-http(s) schemes and private/loopback/link-local/CGNAT targets incl. cloud metadata 169.254.169.254 and IPv4-mapped IPv6 (::ffff:…, which new URL() normalizes to hex — a bypass I caught while writing the guard's tests). Wired in before any scraping. Residual risk: DNS-rebinding (public host → private IP) still needs resolve-then-pin at fetch time — flagged as follow-up in the file.
🔴 Coverage gate: added real specs for JobOfferExtraction (axios/puppeteer mocked, retry/fallback/short-content branches) + urlGuard, plus sparse-field & truncation cases for the service. Branch coverage now 81.45% > 80%, 566 tests green.
🔴 Dead multer.config.ts: deleted (zero refs; controller uses FileInterceptor).
🔴 Bruno: added users/uploadCV.bru + users/uploadJobOffer.bru + folder.
🔴 GROQ_API_KEY: documented in server/.env.example.
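
To make the SSRF item above concrete, here is a minimal sketch of a URL guard of this kind. It is not the contents of common/utils/urlGuard.ts: the name isSafeFetchUrlSketch, the helper functions, and the exact range list are assumptions for illustration. It rejects non-http(s) schemes and literal private, loopback, link-local, and CGNAT addresses, and it decodes the hex form that new URL() produces for IPv4-mapped IPv6 hosts. Like the guard described in the review, it does not address DNS rebinding.

```typescript
// Hypothetical sketch; names and range list are illustrative assumptions.
type Octets = [number, number, number, number];

/** Parse a dotted-quad IPv4 string into four octets, or null if it is not one. */
function parseIpv4(host: string): Octets | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return null;
  const o = m.slice(1).map(Number);
  if (o.some((n) => n > 255)) return null;
  return [o[0], o[1], o[2], o[3]] as Octets;
}

/** True for "this network", private, loopback, CGNAT and link-local ranges. */
function isBlockedIpv4(ip: Octets): boolean {
  const a = ip[0];
  const b = ip[1];
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // 10.0.0.0/8
    a === 127 || // 127.0.0.0/8 loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10 CGNAT
    (a === 169 && b === 254) || // 169.254.0.0/16, incl. 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

/** Decode the hex form new URL() gives IPv4-mapped IPv6, e.g. "::ffff:7f00:1". */
function mappedIpv4(v6: string): Octets | null {
  const m = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/i.exec(v6);
  if (!m) return null;
  const hi = parseInt(m[1], 16);
  const lo = parseInt(m[2], 16);
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff];
}

function isSafeFetchUrlSketch(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not parseable as an absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;

  if (host.startsWith("[")) {
    const v6 = host.slice(1, -1); // strip the brackets around an IPv6 literal
    if (v6 === "::1" || v6 === "::") return false; // loopback, unspecified
    if (/^f[cd][0-9a-f]{2}:/.test(v6)) return false; // fc00::/7 unique local
    if (/^fe[89ab][0-9a-f]:/.test(v6)) return false; // fe80::/10 link-local
    const mapped = mappedIpv4(v6);
    return mapped ? !isBlockedIpv4(mapped) : true;
  }

  const v4 = parseIpv4(host);
  return v4 ? !isBlockedIpv4(v4) : true;
}
```

A hostname that is not an IP literal passes this check regardless of what it resolves to, which is why the resolve-then-pin follow-up mentioned above is still needed.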

Security / reliability

@Throttle({ limit: 5, ttl: 60000 }) on both upload routes (expensive: PDF parse + LLM + scraping).
CV text now truncated to 8000 chars before the LLM (matches the job-offer path).
Puppeteer executablePath: "/usr/bin/google-chrome" removed → use bundled Chromium (override via PUPPETEER_EXECUTABLE_PATH).
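
As a rough illustration of what the throttle and truncation settings above mean, the sketch below models "5 requests per 60000 ms" as a fixed-window counter and caps text at 8000 characters. This is an assumption-laden stand-in: FixedWindowLimiter and truncateForLlm are hypothetical names, and the actual @Throttle decorator may count and store hits differently.

```typescript
// Hypothetical fixed-window limiter; not how the Nest throttler is implemented.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { count: number; windowStart: number }>();
  private readonly limit: number;
  private readonly ttlMs: number;

  constructor(limit: number, ttlMs: number) {
    this.limit = limit;
    this.ttlMs = ttlMs;
  }

  /** Returns true if the call is allowed, false once the caller exceeds the limit. */
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.ttlMs) {
      // First call, or the previous window has expired: start a new window.
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

const MAX_LLM_CHARS = 8000; // cap taken from the review comment

/** Cut text to at most maxChars characters before it is sent to the LLM. */
function truncateForLlm(text: string, maxChars: number = MAX_LLM_CHARS): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}

// Usage: one limiter shared by both upload routes, keyed by user id.
const uploadLimiter = new FixedWindowLimiter(5, 60000);
const allowed = uploadLimiter.allow("user-42");
const cvText = truncateForLlm("…extracted PDF text…");
```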

Consistency / cleanup

Top-level ESM imports for groq-sdk / pdf-parse-debugging-disabled (+ a .d.ts shim so the import is typed under noImplicitAny); Groq client is now a single injected-field instance, not per-request.
console.log/error → Nest Logger throughout.
missions column type: "json" → simple-array to match sibling array fields.
Removed unused Put import; added missing Swagger decorators on uploadJobOffer.

Deferred: the @Req/@Res + (req as any).userId → @CurrentUser()/@UploadedFile()/return-DTO refactor and a shared extractWithGroq<T> helper. That's the real simplification (~290 dup lines → ~120) but it rewrites the service spec...

…fer upload
- add isSafeFetchUrl guard (scheme + private/loopback/link-local/ipv4-mapped-ipv6) before job-offer scraping
- delete unused multer.config.ts (controller uses FileInterceptor)
- top-level groq-sdk/pdf-parse imports + singleton groq client + type shim
- throttle upload routes, truncate cv text to 8000 chars, fix empty-check null-deref
- nest Logger over console, drop puppeteer hardcoded chrome path, missions simple-array
- add bruno files + GROQ_API_KEY to .env.example
- add JobOfferExtraction/urlGuard specs + sparse-field/ssrf cases (branch cov 81.45%)

@Body

Follow-up to the review of the initial upload implementation.
- controller uses @CurrentUser/@UploadedFile/@Body(UploadJobOfferDto) and returns a value instead of raw @Req/@Res; drops the (req as any).userId casts
- service throws BadRequest/InternalServerError instead of res.status().json()
- extract shared extractWithGroq<T>() + generic upsertByUser() helpers, collapsing the ~95% duplication between uploadCV and uploadJobOffer
- service methods take (userId, file|url) and return { message }
- rewrite specs to the new signatures; branch coverage 84%
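
The shared generic helper described in this commit message could look roughly like the sketch below. It is an illustration, not the project's extractWithGroq<T>(): the CompleteFn type, the function name extractWithLlmSketch, and the JSON-slicing step are all assumptions, and the LLM call is injected as a plain function so the example does not depend on the groq-sdk API.

```typescript
// Hypothetical shape of an LLM call: prompt in, raw text out.
type CompleteFn = (prompt: string) => Promise<string>;

/**
 * Build a prompt from instructions plus truncated source text, call the model,
 * and parse the first-to-last brace span of the reply as JSON of type T.
 */
async function extractWithLlmSketch<T>(
  complete: CompleteFn,
  instructions: string,
  sourceText: string,
  maxChars: number = 8000,
): Promise<T> {
  const prompt = `${instructions}\n\n${sourceText.slice(0, maxChars)}`;
  const raw = await complete(prompt);

  // Models sometimes wrap JSON in prose; keep only the outermost object.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("LLM response contained no JSON object");
  }
  // The cast is unchecked: a real helper would validate the parsed shape.
  return JSON.parse(raw.slice(start, end + 1)) as T;
}

// Two callers share the helper and differ only in instructions and result type.
interface CvFieldsSketch {
  desiredJob: string;
  skills: string[];
}
interface JobOfferFieldsSketch {
  title: string;
  missions: string[];
}

function extractCv(complete: CompleteFn, cvText: string): Promise<CvFieldsSketch> {
  return extractWithLlmSketch<CvFieldsSketch>(complete, "Extract CV fields as JSON.", cvText);
}

function extractJobOffer(complete: CompleteFn, pageText: string): Promise<JobOfferFieldsSketch> {
  return extractWithLlmSketch<JobOfferFieldsSketch>(complete, "Extract job offer fields as JSON.", pageText);
}
```

Injecting the completion function also keeps specs simple, since a test can pass a stub instead of mocking the SDK module.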

Resolve add/add conflict in server/.env.example (keep both AI_SERVER_URL and GROQ_API_KEY blocks). Also fix the Railway deploy failure: the Groq client was instantiated as a class field, so a missing GROQ_API_KEY threw at app bootstrap and crashed the container. Build it lazily on first use instead — boot no longer depends on the key; the upload routes surface a 500 if it is unset.
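
The lazy-construction fix described above can be sketched as follows. FakeLlmClient stands in for an SDK client whose constructor throws when no key is supplied (an assumption made so the example is self-contained), and UploadServiceSketch and its readKey parameter are hypothetical names, not the project's service.

```typescript
// Stand-in for an SDK client that refuses to construct without an API key.
class FakeLlmClient {
  readonly apiKey: string;

  constructor(apiKey: string | undefined) {
    if (!apiKey) throw new Error("API key is missing");
    this.apiKey = apiKey;
  }
}

class UploadServiceSketch {
  // Not built at construction time, so a missing key cannot break bootstrap.
  private client: FakeLlmClient | undefined;
  private readonly readKey: () => string | undefined;

  constructor(readKey: () => string | undefined) {
    this.readKey = readKey;
  }

  /** Build the client on first use and reuse the same instance afterwards. */
  private getClient(): FakeLlmClient {
    if (!this.client) {
      this.client = new FakeLlmClient(this.readKey());
    }
    return this.client;
  }

  /** Only request handling touches the client, so only requests can fail on a missing key. */
  upload(): string {
    this.getClient();
    return "ok";
  }
}
```

With an eager class field, the constructor error would surface while the service is being instantiated; here it surfaces only when upload() is called, which the framework can turn into an error response.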

eregine added 10 commits March 13, 2026 19:06

Add the database table for user_cv by creating a new entity

6a7ad05

Modify some files to respect prettier style

3edd889

Add the route to upload the user's CV from the frontend

aa5925d

Add the extraction of the user's information on the cv and send it to…

1a7c960

… the claude AI

Merge the staging branch

3f660e1

Start changing the AI used to sort the user's CV

049c630

Add the package-lock for the whole project

9f4639a

Finish the user CV upload feature

7bdb061

Merge the staging branch

66ccf9e

Add some change for the coding style

5d51a34

eregine requested review from Andriamanampisoa and BhuvanArn May 15, 2026 09:27

eregine self-assigned this May 15, 2026

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 May 15, 2026 09:27 Destroyed

eregine added 6 commits May 15, 2026 20:03

Add the extraction of the job offer from LinkedIn, using axios and puppeteer

b797ea0

Add more structure to the code for the job offer scraping

921da14

Clean the code with Prettier

b56d8ae

Merge branch 'staging' of github.com:Tugduoff/TalkUp.AI into 124-extr…

585f60c

…act-usefull-information-from-the-opening-job

Add new unit tests in the server for the job offer

b76af56

Add some unit tests for the CV upload part

618669a

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 May 31, 2026 08:54 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev May 31, 2026 08:54 View deployment

BhuvanArn requested changes Jun 21, 2026

View reviewed changes

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 21, 2026 05:44 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev June 21, 2026 05:44 View deployment

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 21, 2026 06:15 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev June 21, 2026 06:16 View deployment

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 21, 2026 06:38 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev June 21, 2026 06:39 View deployment

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 23, 2026 13:55 Destroyed

fix: close SSRF redirect bypass and drop dead deps on cv/job-offer up…

122c14d

…load

BhuvanArn force-pushed the 122-extract-user-info-from-cv branch from 322e0d4 to 122c14d Compare June 23, 2026 14:01

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 23, 2026 14:01 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev June 23, 2026 14:01 View deployment

refactor: drop puppeteer from job-offer scraping

0efbdc7

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 23, 2026 16:04 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev June 23, 2026 16:04 View deployment

fix: close IPv6 SSRF bypass and enforce one-per-user cv/job-offer rows

1a3f1b3

railway-app Bot temporarily deployed to talk-up-ai / TalkUp.AI-pr-133 June 24, 2026 02:09 Destroyed

vercel Bot deployed to Preview – talk-up-ai-dev June 24, 2026 02:09 View deployment

BhuvanArn self-requested a review June 24, 2026 03:22

BhuvanArn approved these changes Jun 24, 2026

View reviewed changes

BhuvanArn merged commit 051c822 into staging Jun 24, 2026
10 checks passed

BhuvanArn deleted the 122-extract-user-info-from-cv branch June 24, 2026 06:26

BhuvanArn merged 22 commits into staging from 122-extract-user-info-from-cv

2 participants
