Study of various methods for tokenization

18 citations · 42 Mendeley readers

Abstract

Tokenization is the process of splitting or fragmenting sentences and words into their smallest possible units, called tokens. A morpheme is the smallest meaningful unit of a word, one that cannot be broken down further. Tokenization is the initial, and a crucial, phase of Part-Of-Speech (POS) tagging in Natural Language Processing (NLP). It can be performed at the sentence level or at the word level. This paper analyzes the possible tokenization methods that can be applied to tokenize words efficiently.
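As a brief illustration (not taken from the paper itself, which surveys tokenization methods rather than prescribing one), the sketch below shows sentence-level and word-level tokenization with simple regular expressions in Python; the function names and the patterns are assumptions chosen for the example.

import re

def sentence_tokenize(text):
    # Split at sentence-ending punctuation (., !, ?) followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def word_tokenize(sentence):
    # Keep runs of word characters as tokens and split off punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)

text = "Tokenization splits text into tokens. It is the first phase of POS tagging!"
for sent in sentence_tokenize(text):
    print(word_tokenize(sent))
# ['Tokenization', 'splits', 'text', 'into', 'tokens', '.']
# ['It', 'is', 'the', 'first', 'phase', 'of', 'POS', 'tagging', '!']

A regex-only approach like this mishandles cases such as abbreviations ("Dr.") and contractions ("don't"), which is one reason dedicated tokenization methods exist.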

Citation (APA)

Rai, A., & Borah, S. (2021). Study of various methods for tokenization. In Lecture Notes in Networks and Systems (Vol. 137, pp. 193–200). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-6198-6_18
