Abstract
Motivation: The aim is to identify, through machine learning techniques, specific patterns in the amino acid residues of HIV and HCV viral polyproteins at the sites where viral protease cleaves the polyprotein as it leaves the ribosome. An understanding of viral protease specificity may help the development of future anti-viral drugs involving protease inhibitors by identifying specific features of protease activity for further experimental investigation. While viral sequence information is growing at a fast rate, there is still comparatively little understanding of how viral polyproteins are cut into their functional unit lengths. The work reported here investigates whether it is possible to generalise from known cleavage sites to unknown cleavage sites for two specific viruses, HIV and HCV. An understanding of proteolytic activity for specific viruses will contribute to our understanding of viral protease function in general, thereby leading to a greater understanding of protease families and their substrate characteristics.
Results: Our results show that artificial neural networks and symbolic learning techniques (See5) capture some fundamental and new substrate attributes, but neural networks outperform their symbolic counterpart.
Availability: Publicly available software was used (Stuttgart Neural Network Simulator, http://www-ra.informatik.uni-tuebingen.de/SNNS/, and See5, http://www.rulequest.com). The datasets used (HIV, HCV) for See5 are available at: http://www.dcs.ex.ac.uk/ anarayan/bioinf/ismbdatasets/. © Oxford University Press 2002.
Narayanan, A., Wu, X., & Yang, Z. R. (2002). Mining viral protease data to extract cleavage knowledge. Bioinformatics, 18, S5–S13. https://doi.org/10.1093/bioinformatics/18.suppl_1.S5