Strategies for large-scale genomic DNA sequencing currently require physical mapping, followed by detailed mapping, and finally sequencing. The level of mapping detail determines the amount of effort, or sequence redundancy, required to finish a project. Current strategies attempt to find a balance between mapping and sequencing efforts. One such approach is to employ strategies that use sequence data to build physical maps. Such maps alleviate the need for prior mapping and reduce the final required sequence redundancy. To this end, the utility of correlating pairs of sequence data derived from both ends of subcloned templates is well recognized. However, optimal strategies employing such pairwise data have not been established. In the present work, we simulate and analyze the parameters of pairwise sequencing projects including template length, sequence read length, and total sequence redundancy. One pairwise strategy based on sequencing both ends of plasmid subclones is recommended and illustrated with raw data simulations. We find that pairwise strategies are effective with both small (cosmid) and large (megaYAC) targets and produce ordered sequence data with a high level of mapping completeness. They are ideal for fine-scale mapping and gene finding and as initial steps for either a high- or a low-redundancy sequencing effort. Such strategies are highly automatable. © 1995 Academic Press, Inc.
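The three parameters named above (template length, sequence read length, and total sequence redundancy) can be related with a small Monte Carlo sketch. This is not the simulation used in the paper. It is a minimal illustration that assumes templates are placed uniformly at random on the target and that each template is read once from each end. The function name `simulate_pairwise` and all numeric values in the usage lines are hypothetical.

```python
import random


def simulate_pairwise(target_len, template_len, read_len, redundancy, seed=0):
    """Toy model (an assumption, not the paper's method): place templates
    uniformly at random and read both ends of each one."""
    assert read_len <= template_len <= target_len
    rng = random.Random(seed)
    # Two reads per template, so redundancy = 2 * n * read_len / target_len.
    n_templates = round(redundancy * target_len / (2 * read_len))
    read_cov = bytearray(target_len)      # 1 where a base is in an end read
    template_cov = bytearray(target_len)  # 1 where a base lies inside a template
    for _ in range(n_templates):
        start = rng.randrange(target_len - template_len + 1)
        end = start + template_len
        template_cov[start:end] = b"\x01" * template_len
        read_cov[start:start + read_len] = b"\x01" * read_len
        read_cov[end - read_len:end] = b"\x01" * read_len
    return {
        "templates": n_templates,
        "sequenced_fraction": sum(read_cov) / target_len,
        "spanned_fraction": sum(template_cov) / target_len,
    }


# Hypothetical values: a 40 kb target, 3 kb templates, 400-base reads, 1x redundancy.
result = simulate_pairwise(40_000, 3_000, 400, 1.0)
print(result)
```

In this toy model the fraction of the target spanned by templates can never be lower than the fraction actually sequenced, because every read lies inside its template. That is the sense in which paired end reads carry mapping information beyond the bases they cover.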