Use of Training, Validation, and Test Sets for Developing Automated Classifiers in Quantitative Ethnography

Seung B. Lee; Xiaofan Gui; Megan Manquen; Eric R. Hamilton

Conference Proceedings

Communications in Computer and Information Science (2019), Vol. 1112, pp. 117–127

DOI: 10.1007/978-3-030-33232-7_10

Abstract

Using automated classifiers to code discourse data enables researchers to carry out analyses on large datasets. This paper presents a detailed example of applying training, validation and test sets frequently utilized in machine learning to develop automated classifiers for use in quantitative ethnography research. The method was applied to two dispositional constructs. Within one cycle of the process, reliable and valid automated classifiers were developed for Social Disposition. However, the automated coding scheme for Inclusive Disposition was rejected during the validation stage due to issues of overfitting. Nonetheless, the results demonstrate the beneficial potential of using preclassified datasets in enhancing the efficiency and effectiveness of the automation process.
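The three-stage workflow the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier type, the agreement statistic, or the acceptance threshold, so the sketch assumes keyword (regular expression) classifiers, Cohen's kappa as the agreement measure, a hypothetical threshold of 0.65, and a 60/20/20 split. All function names are illustrative.

```python
import random
import re


def split_dataset(items, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Shuffle human-coded items and split them into training,
    validation, and test sets (the proportions are an assumption)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of binary codes (0 or 1)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both code lists are constant and identical
    return (observed - expected) / (1 - expected)


def classify(text, patterns):
    """Return 1 if any regular expression matches the text, else 0."""
    return int(any(re.search(p, text, re.IGNORECASE) for p in patterns))


def agreement(dataset, patterns):
    """Kappa between human codes and automated codes on one dataset."""
    human = [code for _, code in dataset]
    auto = [classify(text, patterns) for text, _ in dataset]
    return cohen_kappa(human, auto)


def develop(train, val, test, patterns, threshold=0.65):
    """Run one cycle: the patterns are assumed to have been written while
    looking only at `train`. A classifier that fits the training set but
    fails on the validation set is rejected as overfitted; the test set
    is consulted only if validation passes. The threshold is hypothetical."""
    result = {"train_kappa": agreement(train, patterns)}
    result["val_kappa"] = agreement(val, patterns)
    if result["val_kappa"] < threshold:
        result["status"] = "rejected at validation"
        return result
    result["test_kappa"] = agreement(test, patterns)
    result["status"] = ("accepted" if result["test_kappa"] >= threshold
                        else "rejected at test")
    return result
```

Each dataset is a list of (text, human_code) pairs. Keeping the test set untouched until validation has passed is what lets it serve as an unbiased final check, which is the design choice the sketch is meant to show.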

Citation (APA)

Lee, S. B., Gui, X., Manquen, M., & Hamilton, E. R. (2019). Use of Training, Validation, and Test Sets for Developing Automated Classifiers in Quantitative Ethnography. In Communications in Computer and Information Science (Vol. 1112, pp. 117–127). Springer. https://doi.org/10.1007/978-3-030-33232-7_10
