Text mining and data mining are contrasted with respect to automated prediction. Models are constructed by training on samples of unstructured documents, and results are projected to new text. A standard data format for input to prediction methods is described. The key objective of data preparation is to transform text into a numerical format, so that it eventually shares a common representation with numerical data mining. Different text-mining tasks are introduced that fit within a predictive framework for machine learning. These include document classification, information retrieval, clustering of documents, information extraction, and performance evaluation.
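The transformation of text into a numerical format can be illustrated with a small sketch. It assumes a simple word-count (bag-of-words) representation and whitespace tokenization, neither of which is specified in the abstract. The function names `build_vocabulary` and `to_count_matrix` and the sample documents are hypothetical and chosen only for illustration.

```python
from collections import Counter


def build_vocabulary(docs):
    """Return a sorted list of the distinct lowercase tokens in all documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def to_count_matrix(docs, vocab):
    """Map each document to a row of token counts, one column per vocabulary word."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        # Counter returns 0 for words that do not occur in this document.
        rows.append([counts[word] for word in vocab])
    return rows


# Hypothetical sample documents (illustration only).
docs = ["the market rose", "the market fell sharply", "rain fell"]
vocab = build_vocabulary(docs)
matrix = to_count_matrix(docs, vocab)

for doc, row in zip(docs, matrix):
    print(row, "<-", doc)
```

Each row of `matrix` has one entry per vocabulary word, so the documents take the same tabular form as a numerical data-mining dataset and could be passed to a standard prediction method.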
Weiss, S. M., Indurkhya, N., & Zhang, T. (2015). Overview of Text Mining (pp. 1–12). https://doi.org/10.1007/978-1-4471-6750-1_1