Tokenization in the Theory of Knowledge

  • Friedman, R.

Abstract

Tokenization is a procedure for recovering the elements of interest in a sequence of data. The term commonly describes an initial step in the processing of programming languages, and also the preparation of input data for artificial neural networks; however, it is a generalizable concept that applies to reducing any complex form to its basic elements, whether in computer science or in natural processes. In this entry, the general concept of a token and its attributes are defined, along with its role in different contexts, such as deep learning methods. Included here are suggestions for further theoretical and empirical analysis of tokenization, particularly regarding its use in deep learning, where it is a rate-limiting step and a possible bottleneck when results do not meet expectations.
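
As a simple illustration of the general idea described above (not the article's own method), the following Python sketch splits a text sequence into word and punctuation tokens. The function name, the regular expression, and the example sentence are illustrative assumptions; real tokenizers in compilers or deep learning pipelines use richer schemes such as lexer grammars or learned subword vocabularies.

```python
import re


def tokenize(text: str) -> list[str]:
    """Recover the elements of interest (tokens) from a raw character sequence.

    Minimal sketch: words are runs of word characters, and each
    punctuation mark is kept as its own token. This is only one of many
    possible tokenization schemes.
    """
    # \w+ matches runs of word characters; [^\w\s] matches single
    # non-space, non-word characters (punctuation and symbols).
    return re.findall(r"\w+|[^\w\s]", text)


if __name__ == "__main__":
    print(tokenize("Tokenization reduces a complex form to its basic elements."))
    # ['Tokenization', 'reduces', 'a', 'complex', 'form', 'to',
    #  'its', 'basic', 'elements', '.']
```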

Citation (APA)

Friedman, R. (2023). Tokenization in the Theory of Knowledge. Encyclopedia, 3(1), 380–386. https://doi.org/10.3390/encyclopedia3010024
