Abstract
The Gene Ontology (GO) knowledge base provides a standardized vocabulary of GO terms for describing gene functions and attributes. It consists of three directed acyclic graphs that represent the hierarchical relationships between GO terms. Annotating genes to specific GO terms makes it possible to organize genes by their functional attributes. We propose an information-retrieval-derived distance between genes that is based on their annotations. Four gene sets with causal associations were examined with the proposed methodology. The homogeneous subsets discovered in these gene sets are semantically related, in contrast to comparable works. The relevance of the resulting clusters can be described with the help of ChatGPT by asking it for their biological meaning. The R package BIDistances, available on CRAN, lets researchers calculate this distance for any given gene set.
Citation
Stier, Q., & Thrun, M. C. (2023). Deriving Homogeneous Subsets from Gene Sets by Exploiting the Gene Ontology. Informatica (Netherlands), 34(2), 357–386. https://doi.org/10.15388/23-INFOR517