Artificial Intelligence (AI) has emerged in recent years as a transformative innovation that is changing conventional problem-solving strategies. Machine learning (ML) and deep learning (DL) approaches are having a major impact on the life sciences and health care. The large volume of available biochemical data has driven leading-edge research in health care and drug discovery. Molecular machine learning applies ML techniques to uncover new insights from this data. Biochemical datasets typically hold text-based sequential information about molecules in several forms. The Simplified Molecular Input Line Entry System (SMILES) is a compact and efficient format for representing molecules as text and is suitable for a wide range of applications. This work briefly introduces the SMILES molecular representation and focuses on the major applications of ML and DL in health care, especially in the drug discovery process using SMILES. It uses a sequence-to-sequence architecture built on Recurrent Neural Networks (RNNs) to generate small drug-like molecules from benchmark datasets. The experimental results show that Long Short-Term Memory (LSTM) based RNNs can be trained to encode raw SMILES strings with nearly perfect accuracy and to generate similar molecular structures with minimal or no feature engineering. A gradient-based optimization strategy is applied to the network and is found to be well suited to building the most stable and proficient sequence model. RNNs can thus be employed in drug discovery activities such as similarity-based virtual screening, lead compound finding, and hit-to-lead optimization.
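The abstract does not describe how raw SMILES strings are prepared for the sequence model. As a minimal sketch, assuming a simple token-level encoding (not necessarily the authors' preprocessing), the following shows how SMILES strings might be tokenized, mapped to integer indices, padded to a fixed length, and one-hot encoded before being fed to an LSTM encoder. The function names, the special tokens, and the example molecules are all hypothetical choices made for illustration.

```python
import re

# Assumed tokenization rule: keep bracket atoms, the two-letter halogens
# Br and Cl, and two-digit ring closures (%NN) together; everything else
# is treated as a single-character token.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

# Hypothetical special tokens for padding and sequence boundaries.
PAD, START, END = "<pad>", "<s>", "</s>"


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Build token-to-index and index-to-token mappings from a corpus."""
    tokens = sorted({t for s in smiles_list for t in tokenize(s)})
    itos = [PAD, START, END] + tokens
    stoi = {tok: i for i, tok in enumerate(itos)}
    return stoi, itos


def encode(smiles, stoi, max_len):
    """Convert a SMILES string to a fixed-length list of token indices."""
    ids = [stoi[START]] + [stoi[t] for t in tokenize(smiles)] + [stoi[END]]
    if len(ids) > max_len:
        raise ValueError("sequence is longer than max_len")
    return ids + [stoi[PAD]] * (max_len - len(ids))


def decode(ids, itos):
    """Convert token indices back to a SMILES string, dropping special tokens."""
    out = []
    for i in ids:
        tok = itos[i]
        if tok == END:
            break
        if tok in (PAD, START):
            continue
        out.append(tok)
    return "".join(out)


def one_hot(ids, vocab_size):
    """Expand a list of indices into a list of one-hot rows."""
    return [[1 if j == i else 0 for j in range(vocab_size)] for i in ids]


# Illustrative molecules: ethanol, benzene, aspirin, dichloromethane.
molecules = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O", "ClCCl"]
stoi, itos = build_vocab(molecules)
max_len = max(len(tokenize(s)) for s in molecules) + 2  # room for START and END
encoded = [encode(s, stoi, max_len) for s in molecules]
decoded = [decode(ids, itos) for ids in encoded]
matrix = one_hot(encoded[0], len(itos))
```

In a sequence-to-sequence setting of the kind the abstract describes, the encoded sequences would serve as both the encoder input and the reconstruction target, and the round trip through `encode` and `decode` gives a quick check that no information is lost in preprocessing.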
Kotkondawar, R. R., Sutar, S. R., Kiwelekar, A. W., & Wankhede, H. S. (2023). Performability of Deep Recurrent Neural Networks for Molecular Sequence data. International Journal of Computing and Digital Systems, 13(1), 1317–1327. https://doi.org/10.12785/ijcds/1301107