Abstract
Dr. Sarah Sutton, an instructor of library and information science, walked attendees of this NASIG preconference through the history of text mining and the larger implications of its use. Sutton used Google NGrams (the Google N-Grams tool) to ease into the interactive portion of the workshop. The tools and sources used were HathiTrust (bibliographic data and API), HathiTrust Analytics (Research Center), and PythonAnywhere. With these tools, attendees learned and practiced some basic Python commands. A discussion of screen scraping covered ethical (even polite) methods. Further discussion focused on preparing the data through methods such as chunking, grouping, and tokenization, and then on analyzing the text. Lastly, the workshop examined data visualization, and Sutton gave many examples of the visualization types one can employ.
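As a rough illustration of the preparation steps named above (tokenization, chunking, and a simple analysis), here is a minimal Python sketch. It is not taken from the workshop materials: the function names `tokenize` and `chunk`, the sample sentence, and the chunk size are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def chunk(tokens, size):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


# Made-up sample text, standing in for a document retrieved from a corpus.
sample = "Text mining turns text into data. Mining text means counting text."

tokens = tokenize(sample)   # tokenization
chunks = chunk(tokens, 4)   # chunking into fixed-size groups
counts = Counter(tokens)    # a simple analysis: word frequencies

print(counts.most_common(2))
```

Fixed-size chunks are only one possible choice; chunking by page, paragraph, or chapter may suit a given corpus better.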
Sutton, S., & Swickard, K. (2020). Text Mining 101. Serials Librarian, 78(1–4), 3–8. https://doi.org/10.1080/0361526X.2020.1715775