A multivariate representation and analysis of DNA sequence data

Jörgen Jonsson; Lennart Eriksson; Sven Hellberg; Fredrik Lindgren; Michael Sjöström; Svante Wold

Journal Article (Open Access)

Acta Chemica Scandinavica (1991) 45 186-192

DOI: 10.3891/acta.chem.scand.45-0186

17 Citations

8 Readers

Abstract

A new way to represent and analyze DNA sequence data is described. The approach complements currently used methods in that it allows the systematic part of the variation between sequences to be modeled. Such variation can prove as informative as the absence of variation (homology), which is the most widely used criterion for comparing sequence data. A multivariate sequence-activity model (SAM) for DNA promoter sequences is presented, in which relative promoter strength is modeled in terms of the primary DNA sequence. The model is shown to have good predictive capability. The model coefficients are interpreted and used to design new structures predicted to be strong promoters in the system investigated. The approach is also applicable to other kinds of sequence data, e.g. RNAs, proteins or peptides.
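The general idea of a sequence-activity model can be illustrated with a minimal sketch. The abstract does not state which sequence descriptors or which regression method the authors used, so the one-hot encoding and the ridge-penalized linear fit below are stand-in assumptions, not the paper's method. The sequences, activity values and all function names are hypothetical and exist only to make the example runnable.

```python
# Toy sequence-activity sketch. ASSUMPTIONS: one-hot encoding and ridge
# regression are stand-ins; the sequences and activities are made up.

NUCLEOTIDES = "ACGT"

def encode(seq):
    """One-hot encode a DNA string: four indicator variables per position."""
    row = []
    for base in seq.upper():
        row.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)
    return row

def fit_ridge(X, y, lam=0.1, lr=0.05, steps=5000):
    """Fit y ~ b0 + X.w by gradient descent with an L2 penalty on w."""
    n, p = len(X), len(X[0])
    w, b0 = [0.0] * p, 0.0
    for _ in range(steps):
        residuals = [b0 + sum(wj * xj for wj, xj in zip(w, xi)) - yi
                     for xi, yi in zip(X, y)]
        grad_w = [sum(r * xi[j] for r, xi in zip(residuals, X)) / n + lam * w[j]
                  for j in range(p)]
        b0 -= lr * sum(residuals) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return b0, w

def predict(model, seq):
    """Predicted activity of a sequence under the fitted linear model."""
    b0, w = model
    return b0 + sum(wj * xj for wj, xj in zip(w, encode(seq)))

def design_strongest(model, training_seqs):
    """Pick, per position, the observed base with the largest coefficient.

    Only bases seen in the training set are considered, because the model
    has no information about bases it never observed at a position.
    """
    _, w = model
    best = []
    for pos in range(len(training_seqs[0])):
        seen = sorted({s[pos] for s in training_seqs})
        best.append(max(seen, key=lambda b: w[4 * pos + NUCLEOTIDES.index(b)]))
    return "".join(best)

# Hypothetical equal-length sequences with made-up relative strengths.
SEQS = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATACT", "CATGCT"]
ACTIVITY = [1.0, 0.8, 0.6, 0.5, 0.4, 0.1]

model = fit_ridge([encode(s) for s in SEQS], ACTIVITY)
designed = design_strongest(model, SEQS)
print("designed sequence:", designed)
print("predicted activity:", round(predict(model, designed), 3))
```

The coefficient vector plays the role the abstract describes: each coefficient indicates how a given base at a given position shifts the predicted activity, and reading off the most favorable base per position yields a candidate sequence. With only six toy sequences the result has no biological meaning.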

Cite

Jonsson, J., Eriksson, L., Hellberg, S., Lindgren, F., Sjöström, M., & Wold, S. (1991). A multivariate representation and analysis of DNA sequence data. Acta Chemica Scandinavica, 45, 186–192. https://doi.org/10.3891/acta.chem.scand.45-0186
