Stretch coding and block coding: Two new strategies to represent questionably aligned DNA sequences

  • Daniel L. Geiger

Most coding strategies that address the problem of questionable alignment (elision, case sensitive, missing, polymorphic, gaps as presence/absence matrix) conflict with phylogenetic principles, particularly those relating to the concept of homology (shared similarity explained by common ancestry). In some cases, the test of conjunction is failed. In other cases, characters that are coded ambiguously can lead to character-state optimization in the terminal taxa that conflicts with the original observations. Only data exclusion and contraction avoid these pitfalls. In highly dissimilar sequences, additional character states can represent the available information. Two new methods that accomplish this, block coding and stretch coding, are introduced here. These two new coding strategies do not conflict with the test of conjunction and do not contradict the original observations. They are comparable to coding practices with morphological data once the intrinsic differences due to character-state identity and topographical identity have been taken into account. It is suggested that, of the three recoding methods, the one selected should be the one that preserves the maximum potential phylogenetic information, as measured by the minimum number of steps required for the particular part of the data matrix.
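
The selection criterion in the last sentence can be illustrated with a minimal sketch. It assumes the standard parsimony result that a character with k distinct observed states needs at least k - 1 steps on any tree, and that missing entries ("?") contribute no state. The three candidate matrices and their labels (recoding_a, recoding_b, recoding_c) are invented for illustration only; they do not reproduce block coding, stretch coding, or any data from the article.

```python
def min_steps(column, missing=frozenset("?")):
    """Minimum steps for one character: distinct observed states minus one."""
    states = {s for s in column if s not in missing}
    return max(len(states) - 1, 0)


def matrix_min_steps(rows, missing=frozenset("?")):
    """Sum the per-character minimum steps over a taxon-by-character matrix.

    Each row is one taxon, written as a string with one symbol per character.
    """
    return sum(min_steps(col, missing) for col in zip(*rows))


# Hypothetical recodings of the same questionably aligned region, four taxa each.
candidates = {
    "recoding_a": ["0", "1", "2", "3"],      # one character, four states
    "recoding_b": ["00", "01", "10", "1?"],  # two binary characters, one missing entry
    "recoding_c": ["0", "0", "1", "1"],      # one binary character
}

# Score every candidate, then keep the one with the largest minimum step count.
scores = {name: matrix_min_steps(rows) for name, rows in candidates.items()}
best = max(scores, key=scores.get)
```

Under these assumptions the candidate with the highest score retains the most potential phylogenetic information for that part of the matrix. How ties should be broken is not stated in the abstract, so the sketch simply keeps the first maximum.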

Author-supplied keywords

  • Block coding
  • Character coding
  • DNA sequence alignment
  • Homology
  • Stretch coding
  • Test of conjunction
