This is a project I built to familiarize myself with the basic concepts of machine learning. It predicts how well an applicant fits a job offer based on their resume.
Data set: Each record consists of a job description, a resume, and an associated score. The model predicts the score from the job description and the resume.
To create the data set I used the "Kaggle Resume Dataset". Its records contain the fields "category" and "resume", so I had to adapt them to the schema above by adding job descriptions and scores.
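As a rough illustration of that schema change, the sketch below pairs resumes with job descriptions and attaches a score to each pair. It is not the code in generateDataSet or auxiliary. The job descriptions, the sample resumes, the keyword-overlap scoring rule, the choice to pair every resume with every job description, and the names `tokenize`, `score`, `build_rows` and `write_csv` are all assumptions made for the example.

```python
import csv
import io
import re

# Hypothetical job descriptions; the real ones are defined in the auxiliary file.
JOB_DESCRIPTIONS = [
    "Python developer with machine learning and SQL experience",
    "Accountant with experience in auditing and tax reporting",
]


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score(resume, job_description):
    """Assumed rule: share of job-description words found in the resume, scaled to 0-10."""
    job_words = tokenize(job_description)
    if not job_words:
        return 0.0
    overlap = len(job_words & tokenize(resume))
    return round(10 * overlap / len(job_words), 2)


def build_rows(resumes, job_descriptions):
    """Pair every resume with every job description and attach a score (assumed pairing)."""
    return [
        {"job_description": jd, "resume": r, "score": score(r, jd)}
        for r in resumes
        for jd in job_descriptions
    ]


def write_csv(rows, handle):
    """Write the rows using the job description / resume / score schema."""
    writer = csv.DictWriter(handle, fieldnames=["job_description", "resume", "score"])
    writer.writeheader()
    writer.writerows(rows)


# Toy stand-ins for the "resume" field of the Kaggle data.
resumes = [
    "Experienced Python developer, built machine learning pipelines with SQL",
    "Senior accountant, auditing and tax reporting for small firms",
]
rows = build_rows(resumes, JOB_DESCRIPTIONS)
buffer = io.StringIO()  # in-memory file; the project writes dataSet.csv instead
write_csv(rows, buffer)
```

With a rule like this, a resume scores higher against a job description whose vocabulary it shares, which gives the model a signal to learn from even though the scores are synthetic.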
Explanation of the files:
-generateDataSet: Creates the dataSet.csv file based on the "Kaggle Resume Dataset", a set of job descriptions defined by me, and a function that calculates a score based on a resume and a job description.
-auxiliary: Contains the auxiliary function to calculate scores and the set of job descriptions used to generate the dataSet.
-trainModel: File where the model is created and trained once the dataSet is obtained.
-UI.py: Simply provides a user interface to obtain a job description.
-predictor.py: Main executable file. Asks for a job description and a resume in PDF format, then predicts how well the resume fits the job description, returning a numerical score between 0 and 10.
-resumes: Data set from "Kaggle Resume Dataset".
-dataSet: Data set created by generateDataSet, used to train the model.
Test the model: To perform a test, run predictor.py