Background: Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%) are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified the mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA).

Results: The analysis revealed up to a 25-fold variation in the lengths of the longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. In several regions of the mitochondrial genome, the longest subsequence shared between nDNA and mtDNA was found to be as short as 11 bases, which not only allows these regions to be used to design new, very specific PCR primers, but also supports the hypothesis of non-random introduction of mtDNA into human nuclear DNA.

Conclusion: Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer reads (76 and 150 bases), the proportion of the mitochondrial genome where the presence of nDNA will not interfere was found to be 44.5% and 67.9%, respectively, whereas low-abundance mutations can be identified at 100% of locations using single reads 417 bases long.
This observation suggests that the analysis of low-abundance variation in mitochondrial populations can be extended to a variety of large data collections such as the NCBI Sequence Read Archive, the European Nucleotide Archive, The Cancer Genome Atlas, and the International Cancer Genome Consortium.
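The relationship between read length and the share of the genome free of nDNA interference can be illustrated with a small sketch. It is not the authors' pipeline: it assumes exact matches only (no one-mismatch case), a single strand, a linear rather than circular mtDNA, and toy sequences. The function names `longest_shared_lengths` and `fraction_unambiguous` are hypothetical, and the criterion used (a read is unambiguous if the longest shared subsequence starting at its first base is shorter than the read) is a simplifying assumption.

```python
def longest_shared_lengths(mt: str, nuc: str) -> list[int]:
    """For each start position in mt, return the length of the longest
    substring starting there that also occurs (exactly) in nuc."""
    lengths = []
    for i in range(len(mt)):
        k = 0
        # Extend the match one base at a time while it still occurs in nuc.
        # If a substring occurs in nuc, so do all of its prefixes, so this
        # greedy extension finds the longest match from position i.
        while i + k < len(mt) and mt[i:i + k + 1] in nuc:
            k += 1
        lengths.append(k)
    return lengths


def fraction_unambiguous(lengths: list[int], read_len: int) -> float:
    """Fraction of read start positions where a read of read_len bases
    cannot be matched exactly by the nuclear sequence (assumed criterion)."""
    starts = len(lengths) - read_len + 1  # reads must fit inside mt
    if starts <= 0:
        return 0.0
    ok = sum(1 for shared in lengths[:starts] if shared < read_len)
    return ok / starts


# Toy sequences, purely illustrative (not real mtDNA or nDNA).
mt_toy = "ACGTTGCA"
nuc_toy = "GGACGTCC"
shared = longest_shared_lengths(mt_toy, nuc_toy)
for read_len in (2, 4, 5):
    print(read_len, fraction_unambiguous(shared, read_len))
```

In this toy case the unambiguous fraction grows with read length, which mirrors the trend reported above for 36-, 76-, 150-, and 417-base reads. A genome-scale analysis would need an indexed search (for example, a suffix array or k-mer index) rather than the naive substring test used here.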
Albayrak, L., Khanipov, K., Pimenova, M., Golovko, G., Rojas, M., Pavlidis, I., … Fofanov, Y. (2016). The ability of human nuclear DNA to cause false positive low-abundance heteroplasmy calls varies across the mitochondrial genome. BMC Genomics, 17(1). https://doi.org/10.1186/s12864-016-3375-x