Using automated classifiers to code discourse data enables researchers to analyze large datasets. This paper presents a detailed example of applying the training, validation, and test sets commonly used in machine learning to develop automated classifiers for quantitative ethnography research. The method was applied to two dispositional constructs. Within one cycle of the process, reliable and valid automated classifiers were developed for Social Disposition. However, the automated coding scheme for Inclusive Disposition was rejected during the validation stage because of overfitting. Nonetheless, the results demonstrate the potential of preclassified datasets to make the automation process more efficient and effective.
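The three-set workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword-matching classifier, the 60/20/20 split, the use of Cohen's kappa as the agreement measure, and every function name are assumptions chosen for illustration. The idea is that a classifier is refined on the training set, checked once on the validation set (where a sharp drop in agreement would suggest overfitting), and only then confirmed on the held-out test set.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and partition them into training, validation, and test sets.

    The fractions are illustrative assumptions; the remainder goes to the test set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of binary codes (0 or 1)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n
    p_b = sum(b) / n
    # Chance agreement: both raters say 1, or both say 0.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both raters used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def keyword_classifier(keywords):
    """Return a hypothetical classifier that codes 1 if any keyword appears."""
    lowered = [k.lower() for k in keywords]

    def classify(text):
        t = text.lower()
        return int(any(k in t for k in lowered))

    return classify


def agreement(classifier, labelled):
    """Kappa between human codes and classifier codes on (text, code) pairs."""
    human = [code for _, code in labelled]
    machine = [classifier(text) for text, _ in labelled]
    return cohen_kappa(human, machine)
```

In use, one would call `agreement(classifier, train)` repeatedly while revising the keyword list, call `agreement(classifier, val)` once to decide whether to accept or reject the scheme, and report `agreement(classifier, test)` only for an accepted scheme. The acceptance threshold is a study-specific choice and is left out of the sketch.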
Lee, S. B., Gui, X., Manquen, M., & Hamilton, E. R. (2019). Use of Training, Validation, and Test Sets for Developing Automated Classifiers in Quantitative Ethnography. In Communications in Computer and Information Science (Vol. 1112, pp. 117–127). Springer. https://doi.org/10.1007/978-3-030-33232-7_10