Network-based feature screening with applications to genome data

Mengyun Wu; Liping Zhu; Xingdong Feng

Journal Article, Open Access

Annals of Applied Statistics (2018) 12(2) 1250-1270

DOI: 10.1214/17-AOAS1097

Abstract

Modern biological techniques have produced various types of data, which are often analyzed with statistical methods such as feature screening to identify important biomarkers for certain diseases. Model-free feature screening has been studied extensively in the literature and is effective at selecting useful predictors from ultra-high dimensional data. Existing screening procedures, however, are based on marginal correlations between the predictors and a response variable, so the network structures connecting the predictors are usually ignored. Inspired by the remarkable success of Google’s PageRank algorithm, we adopt its spirit to adjust the original screening approaches by incorporating network information. This adjustment significantly improves the performance of those screening methods in choosing useful biomarkers, as demonstrated in an intensive simulation study. A couple of real genome datasets, together with a biological network, are further analyzed, and the results are compared on both the accuracy of predicting responses and the stability of identifying biomarkers.
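
The abstract does not state the adjustment formula, so the following is only a minimal sketch of the general idea, not the authors' estimator. It assumes that the marginal utility is the absolute Pearson correlation, that the network is given as a symmetric 0/1 adjacency matrix, and that the adjustment is a PageRank-style fixed point s = (1 - d) * omega + d * P s with a column-normalized adjacency matrix P. The function names and the damping parameter `d` are hypothetical.

```python
import numpy as np


def marginal_utility(X, y):
    """Absolute Pearson correlation between each column of X and y.

    Assumption: this stands in for whichever marginal screening
    statistic is being adjusted. Columns must not be constant.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom


def network_adjusted_utility(omega, A, d=0.5, n_iter=200, tol=1e-10):
    """PageRank-style smoothing of marginal utilities over a network.

    Iterates s <- (1 - d) * omega + d * P @ s, where P is the adjacency
    matrix A with each column divided by that node's degree.
    Hypothetical sketch: d plays the role of PageRank's damping factor.
    """
    omega = np.asarray(omega, dtype=float)
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=0)
    # Isolated nodes have degree 0; their columns are all zero already,
    # so dividing by 1 instead avoids a division-by-zero warning.
    P = A / np.where(deg > 0, deg, 1.0)
    s = omega.copy()
    for _ in range(n_iter):
        s_new = (1 - d) * omega + d * (P @ s)
        converged = np.abs(s_new - s).max() < tol
        s = s_new
        if converged:
            break
    return s


# Toy example: three features on a path network 0 - 1 - 2, where only
# feature 0 has a nonzero marginal utility.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
omega = np.array([1.0, 0.0, 0.0])
scores = network_adjusted_utility(omega, A, d=0.5)

# Screening step: keep the k features with the largest adjusted scores.
k = 2
selected = np.argsort(-scores)[:k]
```

In this toy example, feature 1 has zero marginal utility but is connected to feature 0, so its adjusted score becomes positive and it is retained at k = 2. This illustrates how network information can rescue a predictor that a purely marginal ranking would rank last.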

Citation (APA)

Wu, M., Zhu, L., & Feng, X. (2018). Network-based feature screening with applications to genome data. Annals of Applied Statistics, 12(2), 1250–1270. https://doi.org/10.1214/17-AOAS1097

Cited by (Scopus)

Network-adjusted Kendall’s Tau Measure for Feature Screening with Application to High-dimensional Survival Genomic Data

Graph-based sparse linear discriminant analysis for high-dimensional classification