Chemlistem: Chemical named entity recognition using recurrent neural networks

Abstract

Chemical named entity recognition (NER) has traditionally been dominated by approaches based on conditional random fields (CRFs), but given the success of the artificial neural network techniques known as "deep learning", we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short-term memory (LSTM) networks, a type of recurrent neural net. The second system eschews the rich feature set, and even tokenisation, in favour of character labelling using neural character embeddings and multiple LSTM layers. The third system is an ensemble that combines the results of the first two systems. Our original BioCreative V.5 competition entry was placed in the top group with the highest F scores, and subsequent experiments using transfer learning have achieved a final F score of 90.33% on the test data (precision 91.47%, recall 89.21%).
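
The reported figures can be checked for consistency. The sketch below assumes that the "F score" is the balanced F1 measure, the harmonic mean of precision and recall; the abstract does not state this explicitly. The function name `f1_score` is illustrative and not taken from the Chemlistem code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F1, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_precision = 91.47
reported_recall = 89.21

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 90.33%"
```

Under that assumption, the computed value agrees with the 90.33% stated in the abstract.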

Corbett, P., & Boyle, J. (2018). Chemlistem: Chemical named entity recognition using recurrent neural networks. Journal of Cheminformatics, 10(1). https://doi.org/10.1186/s13321-018-0313-8
