Gene Ontology density estimation and discourse analysis for automatic GeneRiF extraction

Julien Gobeill; Imad Tbahriti; Frédéric Ehrler; Anaïs Mottaz; Anne Lise Veuthey; Patrick Ruch

Conference Proceedings, Open Access


BMC Bioinformatics (2008) 9(SUPPL. 3)

DOI: 10.1186/1471-2105-9-S3-S9


Abstract

Background: This paper describes and evaluates a sentence selection engine that extracts a GeneRiF (Gene Reference into Functions), as defined in ENTREZ-Gene, from a MEDLINE record. The inputs for this task are a gene and a pointer to a MEDLINE reference. The suggested approach merges two independent sentence extraction strategies. The first strategy (LASt) uses argumentative features inspired by discourse-analysis models. The second (GOEx) uses an automatic text categorizer to estimate the density of Gene Ontology categories in every sentence, thus providing a full ranking of all possible candidate GeneRiFs. A combination of the two approaches is proposed, which also aims to reduce the size of the selected segment by filtering out non-content-bearing rhetorical phrases.

Results: On the TREC-2003 Genomics collection for GeneRiF identification, the LASt extraction strategy alone is already competitive (52.78%). The combined approach clearly improves the extraction task, achieving a Dice score of over 57% (+10%).

Conclusions: Argumentative representation levels and conceptual density estimation using Gene Ontology contents appear complementary for functional annotation in proteomics.

© 2008 Gobeill et al.; licensee BioMed Central Ltd.
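The Dice score reported in the results measures the overlap between an extracted sentence and the reference GeneRiF. The following is a minimal sketch of the classic set-based Dice coefficient, 2|X ∩ Y| / (|X| + |Y|). The tokenization (lowercased alphanumeric runs) and the function names `tokens` and `dice` are assumptions made for illustration; the abstract does not specify which Dice variant or preprocessing was used in the evaluation.

```python
import re


def tokens(text):
    """Return the set of lowercased alphanumeric tokens in text.

    This simple tokenizer is an illustrative assumption, not the
    preprocessing used in the paper.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def dice(candidate, reference):
    """Classic Dice coefficient between the token sets of two strings."""
    x, y = tokens(candidate), tokens(reference)
    if not x and not y:
        # Two empty token sets: define the overlap as zero.
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


if __name__ == "__main__":
    # Hypothetical candidate sentence and reference annotation.
    candidate = "p53 regulates apoptosis"
    reference = "p53 regulates cell apoptosis"
    print(round(dice(candidate, reference), 4))
```

A score of 1.0 means the candidate and the reference share exactly the same tokens, and 0.0 means they share none.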

Cite


Gobeill, J., Tbahriti, I., Ehrler, F., Mottaz, A., Veuthey, A. L., & Ruch, P. (2008). Gene Ontology density estimation and discourse analysis for automatic GeneRiF extraction. In BMC Bioinformatics (Vol. 9). https://doi.org/10.1186/1471-2105-9-S3-S9
