Background: Self-report multiple choice questionnaires have been widely utilized to quantitatively measure one’s personality and psychological constructs. Despite several strengths (e.g., brevity and utility), self-report multiple choice questionnaires have considerable limitations in nature. With the rise of machine learning (ML) and Natural language processing (NLP), researchers in the field of psychology are widely adopting NLP to assess psychological construct to predict human behaviors. However, there is a lack of connections between the work being performed in computer science and that of psychology due to small data sets and unvalidated modeling practices. Aims: The current article introduces the study method and procedure of phase II which includes the interview questions for the five-factor model (FFM) of personality developed in phase I. This study aims to develop the interview (semi-structured) and open-ended questions for the FFM-based personality assessments, specifically designed with experts in the field of clinical and personality psychology (phase 1), and to collect the personality-related text data using the interview questions and self-report measures on personality and psychological distress (phase 2). The purpose of the study includes examining the relationship between natural language data obtained from the interview questions, measuring the FFM personality constructs, and psychological distress to demonstrate the validity of the natural language-based personality prediction. Methods: Phase I (pilot) study was conducted to fifty-nine native Korean adults to acquire the personality-related text data from the interview (semi-structured) and open-ended questions based on the FFM of personality. The interview questions were revised and finalized with the feedback from the external expert committee, consisting of personality and clinical psychologists. Based on the established interview questions, a total of 300 Korean adults will be recruited using a convenience sampling method via online survey. The text data collected from interviews will be analyzed using the natural language processing. The results of the online survey including demographic data, depression, anxiety, and personality inventories will be analyzed together in the model to predict individuals’ FFM of personality and the level of psychological distress (phase 2).
CITATION STYLE
Jang, J., Yoon, S., Son, G., Kang, M., Choeh, J. Y., & Choi, K. H. (2022). Predicting Personality and Psychological Distress Using Natural Language Processing: A Study Protocol. Frontiers in Psychology, 13. https://doi.org/10.3389/fpsyg.2022.865541
Mendeley helps you to discover research relevant for your work.