Patterns Versus Characters in Subword-Aware Neural Language Modeling

Abstract

Words in some natural languages can have a composite structure. Elements of this structure include the root (which may itself be composite) and the prefixes and suffixes that express various nuances and relations to other words. Thus, in order to build a proper word representation one must take the word's internal structure into account. From a corpus of texts we extract a set of frequent subwords, and from this set we select patterns, i.e. subwords that encapsulate information about character n-gram regularities. The selection is made using the pattern-based Conditional Random Field model [19, 23] with ℓ1 regularization. We then rewrite every word as a sequence over the alphabet of patterns. The symbols of the new alphabet capture a stronger local statistical context than individual characters, so they admit better representations in R^n and serve as better building blocks for word representations. In the task of subword-aware language modeling, pattern-based models outperform character-based analogues by 2–20 perplexity points. Moreover, a recurrent neural network in which a word is represented as the sum of the embeddings of its patterns is on par with a competitive and significantly more sophisticated character-based convolutional architecture.
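The paper itself contains no code, but the sum-of-pattern-embeddings idea can be sketched as follows. This is a minimal Python/NumPy illustration: the hard-coded pattern list and the greedy longest-match segmentation are assumptions made only for this example, not the authors' CRF-based pattern selection or their actual word-to-pattern rewriting.

import numpy as np

# Toy pattern alphabet. In the paper, patterns are selected from frequent
# subwords with a pattern-based CRF and l1 regularization; here we simply
# hard-code a few patterns and fall back to single characters.
patterns = ["un", "break", "able", "ing", "re"]
alphabet = patterns + list("abcdefghijklmnopqrstuvwxyz")

emb_dim = 8
rng = np.random.default_rng(0)
embeddings = {p: rng.normal(size=emb_dim) for p in alphabet}

def segment(word, alphabet):
    """Greedy longest-match segmentation of a word into pattern symbols.
    (Illustrative assumption only; the paper derives the segmentation from
    the learned pattern alphabet, not from greedy matching.)"""
    i, out = 0, []
    while i < len(word):
        for p in sorted(alphabet, key=len, reverse=True):
            if word.startswith(p, i):
                out.append(p)
                i += len(p)
                break
    return out

def word_embedding(word):
    """Represent a word as the sum of the embeddings of its patterns."""
    return sum(embeddings[p] for p in segment(word, alphabet))

print(segment("unbreakable", alphabet))     # ['un', 'break', 'able']
print(word_embedding("unbreakable").shape)  # (8,)

In the full model, such word vectors would replace character-derived word representations as input to the word-level recurrent language model described in the abstract.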

Citation (APA)

Takhanov, R., & Assylbekov, Z. (2017). Patterns Versus Characters in Subword-Aware Neural Language Modeling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10635 LNCS, pp. 157–166). Springer Verlag. https://doi.org/10.1007/978-3-319-70096-0_17
