Powerful new methods, such as expression profiling with cDNA arrays, have been used to monitor changes in gene expression levels in response to a variety of metabolic, xenobiotic or pathogenic challenges. This potentially vast quantity of data enables, in principle, the dissection of the complex genetic networks that control the patterns and rhythms of gene expression in the cell. Here we present a general approach to developing dynamic models for analyzing time series of whole-genome expression. In this approach, a self-consistent calculation is performed that involves both linear and non-linear response terms for interrelating gene expression levels. This calculation uses singular value decomposition (SVD) not as a statistical tool but as a means of inverting noisy and near-singular matrices. The linear transition matrix determined from this calculation can be used to calculate the underlying network reflected in the data. This suggests a direct method of classifying genes according to their place in the resulting network. In addition to providing a means to model such a large multivariate system, this approach can be used to reduce the dimensionality of the problem in a rational and consistent way, and to suppress the strong noise amplification effects often encountered with expression profile data. Non-linear and higher-order Markov behavior of the network is also determined by this self-consistent method. For data sets from yeast, we calculate the Markov matrix and the gene classes based on the linear-Markov network. These results compare favorably with previously used methods such as cluster analysis. Our dynamic method appears to give a broad and general framework for data analysis and modeling of gene expression arrays.
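As an illustration of the linear step only, the following is a minimal sketch of how a transition matrix might be estimated from a time series by inverting the data matrix with a truncated SVD. It is not the self-consistent calculation described above, which also includes non-linear and higher-order terms. The function name `estimate_transition_matrix`, the `rank` cutoff, and the synthetic data are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def estimate_transition_matrix(X, rank):
    """Estimate M such that X[:, t+1] ~ M @ X[:, t].

    X has shape (n_genes, n_times). The inverse of the data matrix is built
    from the leading `rank` singular modes only, which discards the small
    singular values responsible for noise amplification.
    (Hypothetical helper, assumed for illustration.)
    """
    past, future = X[:, :-1], X[:, 1:]
    U, s, Vt = np.linalg.svd(past, full_matrices=False)
    # Keep at most `rank` modes, and never a numerically zero one.
    k = min(rank, int(np.sum(s > 1e-12 * s[0])))
    past_pinv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T
    return future @ past_pinv

# Synthetic, noise-free linear-Markov data (assumed, not from the paper).
rng = np.random.default_rng(0)
n_genes, n_times = 5, 40
M_true = 0.9 * np.linalg.qr(rng.normal(size=(n_genes, n_genes)))[0]
X = np.empty((n_genes, n_times))
X[:, 0] = rng.normal(size=n_genes)
for t in range(n_times - 1):
    X[:, t + 1] = M_true @ X[:, t]

M_est = estimate_transition_matrix(X, rank=n_genes)   # full-rank estimate
M_low = estimate_transition_matrix(X, rank=2)         # reduced-dimension estimate
```

Choosing `rank` smaller than the number of genes is one simple way to trade fidelity for noise suppression; in real array data, where genes far outnumber time points, the cutoff would have to be chosen from the singular value spectrum.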