On Geodesic Distances and Contextual Embedding Compression for Text Classification

Abstract

In some memory-constrained settings, such as IoT devices and over-the-network data pipelines, it can be advantageous to have smaller contextual embeddings. We investigate the efficacy of projecting contextual embedding data (BERT) onto a manifold and using nonlinear dimensionality reduction techniques to compress these embeddings. In particular, we propose a novel post-processing approach that applies a combination of Isomap and PCA. We find that the geodesic distance estimates (estimates of the shortest path on a Riemannian manifold) from Isomap's k-nearest-neighbors graph bolstered the performance of the compressed embeddings to be comparable to the original BERT embeddings. On one dataset, we find that despite a 12-fold dimensionality reduction, the compressed embeddings performed within 0.1% of the original BERT embeddings on a downstream classification task. In addition, we find that this approach works particularly well on tasks reliant on syntactic data, when compared with linear dimensionality reduction. These results show promise for a novel geometric approach to achieving lower-dimensional text embeddings from existing transformers and pave the way for data-specific and application-specific embedding compressions.
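The geodesic distance estimate mentioned in the abstract can be illustrated with a small toy sketch. This is not the authors' pipeline: it uses 2-D points on a half-circle rather than BERT embeddings, and the helper names `knn_graph` and `geodesic_distances` are hypothetical. It only shows the general idea behind Isomap's first stage, where shortest paths through a k-nearest-neighbors graph approximate distances along the manifold rather than straight through the ambient space.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph weighted by Euclidean distance.

    Hypothetical helper for illustration; brute-force O(n^2 log n).
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` over the k-NN graph.

    These path lengths serve as the geodesic distance estimates.
    """
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy manifold (an assumption for illustration): 50 points on a unit half-circle.
n = 50
points = [
    (math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
    for t in range(n)
]

graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, 0)[n - 1]      # distance along the arc
euclid = math.dist(points[0], points[n - 1])   # straight-line distance

print(f"geodesic estimate: {geo:.3f}, Euclidean: {euclid:.3f}")
```

In this toy case the graph-based estimate comes out close to the true arc length (pi), while the Euclidean distance between the endpoints is only 2. Isomap then embeds the points so that these graph distances are preserved as well as possible, which is what distinguishes it from a purely linear method such as PCA.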

Cite

Jha, R., & Mihata, K. (2021). On Geodesic Distances and Contextual Embedding Compression for Text Classification. In TextGraphs 2021 - Graph-Based Methods for Natural Language Processing, Proceedings of the 15th Workshop - in conjunction with the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2021 (pp. 144–149). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.textgraphs-1.15
