The development of high-throughput methods for genome interrogation, such as microarrays and next-generation sequencing (NGS), has led to significant progress in the field of cancer genomics. However, the major obstacle to knowledge extraction is the fragmentation of oncogenomics datasets, which are drawn from various genomic data repositories, into a multitude of different file formats and data types. An important advance in cancer genomics is The Cancer Genome Atlas (TCGA) project, a coordinated effort to provide a comprehensive catalog of biomedical data about cancer. Although several tools for querying and analyzing TCGA data have been developed and are publicly available, they are often hard to use, and integration across datasets and data types remains limited. Moreover, researchers who want to combine these heterogeneous data are often forced to use several complementary tools that lack interoperability. The contribution of this survey is twofold: first, to give researchers an overview of the main technical and functional features of the most popular and innovative tools for querying and analyzing TCGA data; and second, to provide an easy-to-use guideline that helps researchers choose the tools best suited to their needs, so that they can focus their efforts on research goals rather than on technical issues.
Settino, M., & Cannataro, M. (2019). Survey of main tools for querying and analyzing TCGA Data. In Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018 (pp. 1711–1718). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BIBM.2018.8621270