Modeling the Amplification of Immunoglobulins through Machine Learning on Sequence-Specific Features

This article is free to access.

Abstract

Successful primer design for polymerase chain reaction (PCR) hinges on the ability to identify primers that efficiently amplify template sequences. Here, we generated a novel Taq PCR data set that reports the amplification status for pairs of primers and templates from a reference set of 47 immunoglobulin heavy chain variable sequences and 20 primers. Using logistic regression, we developed TMM, a model for predicting whether a primer amplifies a template given their nucleotide sequences. The model suggests that the free energy of annealing, ΔG, is the key driver of amplification (p = 7.35e-12) and that 3′ mismatches should be evaluated in relation to ΔG and to the mismatch closest to the 3′ terminus (p = 1.67e-05). We validated TMM by comparing its estimates with those from the thermodynamic model of DECIPHER (DE) and a model based solely on the free energy of annealing (FE). TMM outperformed the other approaches in terms of the area under the receiver operating characteristic curve (TMM: 0.953, FE: 0.941, DE: 0.896). TMM can improve primer design and is freely available via openPrimeR (http://openPrimeR.mpi-inf.mpg.de).
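
To make the modeling idea in the abstract concrete, the following is a minimal sketch of a logistic regression on ΔG, a 3′ mismatch term, and their interaction, scored by the area under the ROC curve. It is not the authors' TMM implementation and does not use their data: the feature encoding (`encode`), the scaling constants, the synthetic labels, and all function names are assumptions made purely for illustration.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def encode(delta_g, mismatch_pos):
    """Hypothetical feature encoding (an assumption, not the paper's).

    delta_g: free energy of annealing in kcal/mol (more negative = more stable).
    mismatch_pos: distance of the closest mismatch from the 3' terminus
                  (1 = terminal base), or 0 if the 3' region has no mismatch.
    """
    dg = delta_g / 10.0  # rescale so gradient descent is well conditioned
    closeness = 0.0 if mismatch_pos == 0 else (7 - mismatch_pos) / 6.0
    # intercept, dG, 3' mismatch closeness, and their interaction
    return [1.0, dg, closeness, dg * closeness]

def predict(weights, x):
    """Predicted probability of amplification for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def fit(X, y, lr=0.3, steps=1000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, label in zip(X, y):
            err = predict(weights, x) - label
            for j, xj in enumerate(x):
                grad[j] += err * xj
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic primer-template pairs (made up; NOT the published data set).
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    delta_g = rng.uniform(-20.0, 0.0)
    mismatch_pos = rng.randint(0, 6)
    x = encode(delta_g, mismatch_pos)
    true_logit = -0.5 * (delta_g + 8.0) - 3.0 * x[2]  # assumed ground truth
    X.append(x)
    y.append(1 if rng.random() < sigmoid(true_logit) else 0)

weights = fit(X, y)
# AUC on the training data, so these values are optimistic.
model_auc = auc([predict(weights, x) for x in X], y)
fe_auc = auc([-x[1] for x in X], y)  # baseline: rank by dG alone
print("weights:", [round(w, 2) for w in weights])
print("AUC (dG + 3' mismatch):", round(model_auc, 3))
print("AUC (dG only):", round(fe_auc, 3))
```

The ΔG-only score plays the role of a free-energy baseline in this toy setting; a real comparison would use held-out primer-template pairs rather than the training data.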

Cite

Döring, M., Kreer, C., Lehnen, N., Klein, F., & Pfeifer, N. (2019). Modeling the Amplification of Immunoglobulins through Machine Learning on Sequence-Specific Features. Scientific Reports, 9(1). https://doi.org/10.1038/s41598-019-47173-w
