Objectives: This study develops an efficient approach for parsing resumes and predicting job domains using natural language processing (NLP) techniques and named entity recognition, with the aim of improving the resume screening process for recruiters. Methods: The proposed approach involves preprocessing steps (cleaning, tokenization, stop-word removal, stemming, and lemmatization), implemented with the PyMuPDF and doc2text Python modules. Regular expressions and the spaCy library are used for entity recognition and name extraction. The model was evaluated on a dataset of 1000 resumes, and an ablation experiment assessed the contributions of the individual factors. Findings: The approach achieved a prediction accuracy of 92.08% and an F1-score of 0.92 for job domain prediction on the 1000-resume dataset. Evaluations on individual job domains showed high precision and recall, supporting its applicability. Preprocessing significantly improved accuracy, and the integration of regular expressions with spaCy further improved the model's performance. By automating resume screening, the approach reduces recruiters' workload and supports candidate selection and the hiring process. Novelty: This study combines NLP techniques, regular expressions, and entity recognition for resume parsing and job domain prediction, an integration that improves both the accuracy and the efficiency of resume screening.
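The preprocessing and regular-expression steps named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library in place of PyMuPDF, doc2text, and spaCy, it skips stemming and lemmatization, and the stop-word list, the two patterns, the function names, and the sample resume text are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "in", "of", "with", "at", "to", "for", "is"}

# Illustrative patterns, not taken from the paper.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose phone pattern: optional "+", then digits separated by spaces, dots, or hyphens.
PHONE_RE = re.compile(r"\+?\d[\d .-]{8,16}\d")


def preprocess(text):
    """Lowercase the text, tokenize on alphabetic runs, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def extract_contacts(text):
    """Pull e-mail addresses and phone-like strings out of raw resume text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


# Made-up resume snippet used only to exercise the functions.
sample = (
    "Jane Doe\n"
    "Email: jane.doe@example.com\n"
    "Phone: +91 98765 43210\n"
    "Experienced in Python and machine learning."
)

contacts = extract_contacts(sample)
tokens = preprocess(sample)
print(contacts)
print(tokens)
```

In a fuller pipeline, the cleaned tokens would feed a job-domain classifier, and a trained named entity recognizer would typically handle fields such as candidate names that fixed patterns cannot capture reliably.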
Sinha, A. K., Akhtar, M. A. K., & Kumar, M. (2023). Automated Resume Parsing and Job Domain Prediction using Machine Learning. Indian Journal of Science and Technology, 16(26), 1967–1974. https://doi.org/10.17485/ijst/v16i26.880