The prediction of protein secondary structure remains an active area of research in bioinformatics. In this paper, a Bi-LSTM based ensemble model is developed for the prediction of protein secondary structure. The ensemble model, which uses a dual loss function, consists of five sub-models whose outputs are joined by a final Bi-LSTM layer. In contrast to existing ensemble methods, which generally train each sub-model separately and then combine them, the ensemble model and its sub-models are trained simultaneously, so the performance of each model can be observed and compared during training. Three independent test sets (data1199, the 513-protein Cuff & Barton set (CB513), and 203 proteins from the Critical Assessment of protein Structure Prediction (CASP203)) are used to test the method. On average, the ensemble model achieved a Q3 accuracy of 84.3% and a segment overlap measure (SOV) score of 81.9% under 10-fold cross-validation. This is an improvement of up to 1% over some state-of-the-art methods for protein secondary structure prediction.
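For readers unfamiliar with the Q3 metric reported above, the following is a minimal sketch of how it is commonly computed: the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the reference label. The function name `q3_accuracy` and the example sequences are hypothetical and are not taken from the paper; the paper may aggregate per protein or per residue, which this sketch does not settle.

```python
def q3_accuracy(true_ss: str, pred_ss: str) -> float:
    """Return the percentage of positions where the predicted three-state
    label (H, E or C) equals the reference label.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(true_ss) != len(pred_ss):
        raise ValueError("reference and prediction must have the same length")
    if not true_ss:
        raise ValueError("sequences must not be empty")
    # Count residues whose predicted state matches the reference state.
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return 100.0 * correct / len(true_ss)


# Made-up 12-residue example: 10 of 12 labels agree.
reference = "HHHHEEEECCCC"
predicted = "HHHCEEEECCCH"
score = q3_accuracy(reference, predicted)
print(f"Q3 = {score:.1f}%")
```

The SOV score is a separate, segment-level measure and is not covered by this sketch.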
CITATION
Hu, H., Li, Z., Elofsson, A., & Xie, S. (2019). A Bi-LSTM based ensemble algorithm for prediction of protein secondary structure. Applied Sciences (Switzerland), 9(17). https://doi.org/10.3390/app9173538