Text Mining 101

Sarah Sutton; Kelly Swickard

Journal Article, Open Access

Serials Librarian (2020) 78(1-4) 3-8

DOI: 10.1080/0361526X.2020.1715775

Abstract

Dr. Sarah Sutton, an instructor of library and information science, walked attendees of this NASIG preconference through the history of text mining and the larger implications of its use. Sutton used Google NGrams (Google N-Grams Tool) to ease into the interactive portion of the workshop. The tools and sources used were HathiTrust (bibliographic data and API), HathiTrust Analytics (Research Center), and PythonAnywhere. With these tools, attendees learned and practiced some basic Python commands. The workshop also discussed screen scraping and ethical (even polite) ways of doing it. Further discussion focused on preparing the data through methods such as chunking, grouping, and tokenization, and then on analyzing the text. Lastly, the workshop examined data visualization, and Sutton gave many examples of the visualization types one can employ.
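
As a rough illustration of the preparation steps named in the abstract (tokenization, chunking, and n-gram counting), the following minimal Python sketch uses only the standard library. It is not the workshop's own code: the function names, the sample sentence, and the chunk size are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical helper names; none of these come from the workshop materials.

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def chunk(tokens, size):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Invented sample text, used only to exercise the helpers.
sample = "Text mining turns text into data. Text mining starts with tokens."
tokens = tokenize(sample)
chunks = chunk(tokens, 4)
bigram_counts = Counter(ngrams(tokens, 2))

print(tokens[:4])
print(len(chunks))
print(bigram_counts.most_common(1))
```

Counting n-grams per chunk rather than over the whole text would give the kind of per-segment frequencies that can then be grouped and visualized.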

Citation (APA)

Sutton, S., & Swickard, K. (2020). Text Mining 101. Serials Librarian, 78(1–4), 3–8. https://doi.org/10.1080/0361526X.2020.1715775