The rapid growth of biomedical literature is evident in the increasing size of the MEDLINE research database. Medical Subject Headings (MeSH), a controlled set of keywords, are used to index all the citations contained in the database to facilitate search and retrieval. This volume of citations calls for efficient tools to assist indexers at the US National Library of Medicine (NLM). Currently, the Medical Text Indexer (MTI) system provides assistance by recommending MeSH terms based on the title and abstract of an article using a combination of distributional and vocabulary-based methods. In this paper, we evaluate a novel approach toward indexer assistance by using nearest neighbor classification in combination with Reflective Random Indexing (RRI), a scalable alternative to the established methods of distributional semantics. On a test set provided by the NLM, our approach significantly outperforms the MTI system, suggesting that the RRI approach would make a useful addition to the current methodologies. © 2010 Elsevier Inc.
Vasuki, V., & Cohen, T. (2010). Reflective random indexing for semi-automatic indexing of the biomedical literature. Journal of Biomedical Informatics, 43(5), 694–700. https://doi.org/10.1016/j.jbi.2010.04.001
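
The abstract above names the two ingredients of the approach, Reflective Random Indexing and nearest neighbor classification, without showing how they fit together. The following is a minimal sketch of one common formulation, not the authors' implementation: terms start with sparse random vectors, document vectors are built from term vectors, and term vectors are then rebuilt from document vectors (one "reflection"). A new citation is indexed by ranking the MeSH terms of its nearest training documents. The toy corpus, the labels attached to it, the function names, and the parameters `DIM`, `SEEDS`, `reflections`, `k`, and `top` are all assumptions made for illustration. The starting point, weighting, and number of cycles used in the paper may differ.

```python
import math
import random
from collections import defaultdict

DIM = 200    # vector dimensionality (assumed; chosen small for the toy example)
SEEDS = 10   # nonzero entries per random index vector (assumed)

def random_index_vector(rng):
    """Sparse ternary vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * DIM
    for i, pos in enumerate(rng.sample(range(DIM), SEEDS)):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec

def add_into(target, source):
    for i, value in enumerate(source):
        target[i] += value

def normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def cosine(a, b):
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def doc_vector(tokens, term_vectors):
    """Normalized sum of the vectors of the known terms in a document."""
    vec = [0.0] * DIM
    for tok in tokens:
        if tok in term_vectors:
            add_into(vec, term_vectors[tok])
    return normalize(vec)

def train_rri(docs, reflections=1, seed=0):
    """Start from random term vectors, then alternate term -> doc -> term."""
    rng = random.Random(seed)
    vocab = sorted({tok for tokens in docs for tok in tokens})
    term_vectors = {t: random_index_vector(rng) for t in vocab}
    for _ in range(reflections):
        doc_vectors = [doc_vector(tokens, term_vectors) for tokens in docs]
        rebuilt = {t: [0.0] * DIM for t in vocab}
        for tokens, dvec in zip(docs, doc_vectors):
            for tok in set(tokens):
                add_into(rebuilt[tok], dvec)
        term_vectors = {t: normalize(v) for t, v in rebuilt.items()}
    return term_vectors

def recommend(tokens, docs, labels, term_vectors, k=2, top=3):
    """Rank labels of the k nearest training documents by summed similarity."""
    query = doc_vector(tokens, term_vectors)
    sims = [(cosine(query, doc_vector(d, term_vectors)), i)
            for i, d in enumerate(docs)]
    scores = defaultdict(float)
    for sim, i in sorted(sims, reverse=True)[:k]:
        for label in labels[i]:
            scores[label] += sim
    return sorted(scores, key=scores.get, reverse=True)[:top]

# Hypothetical toy corpus: tokenized titles/abstracts with made-up index terms.
train_docs = [
    ["insulin", "glucose", "diabetes"],
    ["glucose", "metabolism", "diabetes"],
    ["tumor", "chemotherapy", "cancer"],
    ["cancer", "tumor", "radiation"],
]
train_labels = [
    ["Diabetes Mellitus", "Insulin"],
    ["Diabetes Mellitus", "Glucose"],
    ["Neoplasms", "Antineoplastic Agents"],
    ["Neoplasms", "Radiotherapy"],
]

term_vectors = train_rri(train_docs, reflections=1, seed=0)
suggestions = recommend(["glucose", "diabetes"], train_docs, train_labels,
                        term_vectors, k=2, top=3)
print(suggestions)
```

Summing neighbor similarities per label is only one plausible way to turn neighbors into a ranked list. A production system would also need term weighting, stop-word handling, and far larger vectors and corpora than this sketch uses.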