In this paper, we present our experiments with BERT models on the task of Large-scale Multi-label Text Classification (LMTC). In the LMTC task, each text document can carry multiple class labels, and the total number of classes is on the order of thousands. We propose a pooling layer architecture on top of BERT models which improves classification quality by combining information from the standard [CLS] token with a pooled sequence output. We demonstrate the improvements on Wikipedia datasets in three different languages using publicly available pre-trained BERT models.
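The core idea can be sketched as follows. This is a hypothetical NumPy illustration, not the paper's exact architecture: it assumes the [CLS] vector is concatenated with a mean-pooled sequence vector (the paper may use a different combination, e.g. max pooling or summation), and the multi-label head applies an independent sigmoid per label.

```python
import numpy as np

def combined_pooling(hidden_states):
    """Combine the [CLS] vector with a mean-pooled sequence vector.

    hidden_states: (seq_len, hidden_dim) array of final-layer BERT
    outputs; position 0 corresponds to the [CLS] token. The choice of
    concatenation + mean pooling is an illustrative assumption.
    """
    cls_vec = hidden_states[0]                  # standard [CLS] output
    mean_vec = hidden_states.mean(axis=0)       # pooled sequence output
    return np.concatenate([cls_vec, mean_vec])  # shape (2 * hidden_dim,)

def multilabel_probs(pooled, weights, bias):
    """Multi-label head: one independent sigmoid per label
    (rather than a softmax, since labels are not mutually exclusive)."""
    logits = pooled @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))
```

In a real setting, `hidden_states` would come from a pre-trained BERT encoder and `weights`/`bias` would be trained with a binary cross-entropy loss over the thousands of labels.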
CITATION STYLE
Lehečka, J., Švec, J., Ircing, P., & Šmídl, L. (2020). Adjusting BERT's pooling layer for large-scale multi-label text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12284 LNAI, pp. 214–221). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58323-1_23