Topological metrics in blast data mining: Plasmid and nitrogen-fixing proteins case studies

Abstract

In recent years, a number of metrics have been introduced to characterize the topology of complex networks. We use these methodologies to analyze networks obtained through BLAST data mining. The algorithm we present consists of the following steps: (1) encode the results of BLAST searches as a distance matrix of e-values; (2) perform entropy-controlled clustering analysis to identify the communities; (3) statistically analyze the resulting network; (4) use gene ontology and data mining of sequence databases to infer the function of the identified clusters. We report on the analysis of two data sets: the first is formed by over 3300 plasmid-encoded proteins and the second comprises over 4200 sequences related to nitrogen fixation proteins. In the first case we observed strong selective pressures for horizontal transfer and maintenance of genes encoding proteins for resistance to antibiotics, plasmid stability and conjugal transfer. On the basis of our results, nitrogen fixation proteins can be divided into three groups: proteins with no paralogs in any of the genomes considered, proteins with paralogs belonging to different metabolic processes (O-paralogs), and proteins with paralogs in both the same and other metabolic processes (I/O-paralogs). © Springer-Verlag Berlin Heidelberg 2008.
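
The first two steps of the algorithm can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes BLAST hits are already available as (query, subject, e-value) tuples, uses the capped e-value directly as the distance, and swaps in plain single-linkage grouping under an e-value cutoff in place of the entropy-controlled clustering named in the abstract. The function names, the cutoff of 1e-5, and the protein labels in the toy input are all hypothetical.

```python
def evalue_distance_matrix(hits, max_distance=1.0):
    """Build a symmetric distance matrix from (query, subject, evalue) hits.

    Assumption: the distance is the e-value itself, capped at max_distance;
    pairs with no reported hit keep max_distance.
    """
    hits = list(hits)
    ids = sorted({q for q, _, _ in hits} | {s for _, s, _ in hits})
    index = {name: i for i, name in enumerate(ids)}
    n = len(ids)
    dist = [[max_distance] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for query, subject, evalue in hits:
        if query == subject:
            continue
        i, j = index[query], index[subject]
        d = min(evalue, max_distance)
        # Keep the best (smallest) e-value seen in either direction.
        if d < dist[i][j]:
            dist[i][j] = dist[j][i] = d
    return ids, dist


def threshold_clusters(ids, dist, cutoff=1e-5):
    """Group sequences linked by a chain of hits with distance <= cutoff.

    This is single-linkage grouping (connected components), a stand-in
    for the entropy-controlled clustering described in the abstract.
    """
    n = len(ids)
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:
            i = stack.pop()
            members.append(ids[i])
            for j in range(n):
                if j not in seen and dist[i][j] <= cutoff:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


# Toy input with made-up labels and e-values, for illustration only.
toy_hits = [
    ("repA", "repB", 1e-50),
    ("repB", "repA", 1e-48),
    ("nifH", "nifH2", 1e-30),
    ("repA", "nifH", 0.5),
]
ids, dist = evalue_distance_matrix(toy_hits)
clusters = threshold_clusters(ids, dist)
print(clusters)
```

With this toy input the weak hit between repA and nifH (e-value 0.5) falls above the cutoff, so the two pairs stay in separate groups. A real analysis would replace the grouping step with the clustering method the paper specifies.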

Lió, P., Brilli, M., & Fani, R. (2008). Topological metrics in blast data mining: Plasmid and nitrogen-fixing proteins case studies. Communications in Computer and Information Science, 13, 207–220. https://doi.org/10.1007/978-3-540-70600-7_16
