Information enhancement methods for large scale sequence analysis

  • Claverie J
  • States D

The improved efficiency of similarity search programs and the affordability of ever faster computers allow studies in which whole sequence databases are the target of comparisons with increasingly large or numerous query sequences. However, the usefulness of these "brute force" methods is now limited by the time it takes an experienced scientist to sift the biologically relevant matches from overwhelming, albeit "statistically significant", outputs. The discrepancy between statistical and biological significance has several causes: erroneous database entries, repetitive sequence elements, and the ubiquity of low complexity segments with biased composition. We present two masking methods (programs XNU and XBLAST) capable of eliminating most of the irrelevant outputs in a variety of large scale sequence analysis situations: global "all against all" database comparisons, massive partial cDNA sequencing (EST), positional cloning and genomic data analysis. © 1993.
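The general idea of masking low complexity segments before a similarity search can be illustrated with a minimal sketch. This is not the XNU or XBLAST algorithm, which the abstract does not describe; it is a simple sliding-window Shannon entropy filter used here only as a stand-in. The function names, the window length of 12, the entropy threshold of 2.2 bits, and the mask character "X" are all illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, min_entropy=2.2, mask_char="X"):
    """Replace residues in low-entropy windows with mask_char.

    Hypothetical illustration: window and min_entropy are assumed values,
    not parameters taken from XNU or XBLAST.
    """
    masked = list(seq)
    # Slide a fixed-size window along the sequence; sequences shorter
    # than the window are returned unchanged.
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < min_entropy:
            # Mask every position covered by the low-entropy window.
            for i in range(start, start + window):
                masked[i] = mask_char
    return "".join(masked)


if __name__ == "__main__":
    flank = "ACDEFGHIKLMNPQRSTVWY"
    query = flank + "Q" * 15 + flank
    print(mask_low_complexity(query))
```

A query masked this way keeps its original length, so match coordinates reported by a downstream search still refer to positions in the unmasked sequence. Windows that only partly overlap a biased run can also fall below the threshold, so some flanking residues may be masked as well.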

  • Jean Michel Claverie

  • David J. States
