Linguistic measures of chemical diversity and the “keywords” of molecular collections

  • Woźniak M
  • Wołos A
  • Modrzyk U
  • et al.

Abstract

Computerized linguistic analyses have proven immensely valuable for comparing and searching large text collections (“corpora”), including those deposited on the Internet. Indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms that extract the most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic “chemical words” that span more than traditional functional groups and instead capture the common structural fragments that molecules share. Using these words, it is possible to quantify the diversity of chemical collections and databases in new ways and to define molecular “keywords” by which such collections are best characterized and annotated.
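To make the analogy concrete, the following is a minimal sketch of how corpus-style statistics might be applied once each molecule has been reduced to a list of fragment "words". It is not the authors' method: the fragment names, the toy collections, and the helper functions `type_token_ratio` and `keywords` are all hypothetical. The type-token ratio and the smoothed log-ratio score are generic corpus-linguistic measures chosen here only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: each molecule is a list of fragment "words".
# A real pipeline would derive these from structures; here they are made up.
steroid_like = [
    ["cyclohexane", "cyclopentane", "ketone"],
    ["cyclohexane", "cyclopentane", "hydroxyl"],
    ["cyclohexane", "ketone", "hydroxyl"],
]
reference = [
    ["benzene", "amide"],
    ["benzene", "hydroxyl"],
    ["pyridine", "amide"],
    ["benzene", "ketone"],
]


def type_token_ratio(collection):
    """Distinct words divided by total word occurrences (a simple diversity proxy)."""
    tokens = [word for molecule in collection for word in molecule]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def keywords(target, reference, top_n=3):
    """Rank words by how over-represented they are in `target` versus `reference`.

    The score is the log-ratio of add-one-smoothed document frequencies,
    an assumed stand-in for whatever keyness measure a real study would use.
    """
    def doc_freq(collection):
        # Count each word once per molecule.
        return Counter(word for molecule in collection for word in set(molecule))

    df_target, df_reference = doc_freq(target), doc_freq(reference)
    scores = {}
    for word in df_target:
        p_target = (df_target[word] + 1) / (len(target) + 2)
        p_reference = (df_reference[word] + 1) / (len(reference) + 2)
        scores[word] = math.log(p_target / p_reference)
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


if __name__ == "__main__":
    print("diversity (target):   ", type_token_ratio(steroid_like))
    print("diversity (reference):", type_token_ratio(reference))
    print("keywords of target:   ", keywords(steroid_like, reference, top_n=2))
```

In this toy setting, a lower type-token ratio indicates a collection that reuses the same fragments more often, and the top-ranked words are those that distinguish the target collection from the reference.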

Cite

Woźniak, M., Wołos, A., Modrzyk, U., Górski, R. L., Winkowski, J., Bajczyk, M., … Eder, M. (2018). Linguistic measures of chemical diversity and the “keywords” of molecular collections. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-018-25440-6
