Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae

Abstract

Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, although the predicted MIC must still be interpreted. Nevertheless, precisely how MIC data should be handled in predictive models remains unclear, since MICs are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. We then assessed how model prediction accuracy was affected when MIC prediction was framed as regression or as classification. Our results showed that treating the MICs differently depending on the number of antibiotic concentration levels available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification.
Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that expanding the population database is pivotal for the future clinical implementation of these models to support routine machine-learning based diagnostics.
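
The recommendation above (regression when many concentration levels are observed, classification when few) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `parse_mic` and `choose_framing`, the string format of the MIC labels, and the `min_levels=5` cut-off are all assumptions made for illustration, since the abstract does not state a specific threshold.

```python
import math


def parse_mic(raw):
    """Parse an MIC label such as '<=0.5', '>32' or '4'.

    Returns (log2 of the concentration, censoring flag), where the flag is
    'left', 'right' or 'none'. The label format is an assumption.
    """
    raw = raw.strip()
    censor = "none"
    # Check two-character prefixes before one-character ones.
    if raw.startswith("<="):
        censor, raw = "left", raw[2:]
    elif raw.startswith("<"):
        censor, raw = "left", raw[1:]
    elif raw.startswith(">="):
        censor, raw = "right", raw[2:]
    elif raw.startswith(">"):
        censor, raw = "right", raw[1:]
    return math.log2(float(raw)), censor


def choose_framing(raw_mics, min_levels=5):
    """Pick a learning framing from the number of distinct observed levels.

    min_levels is a hypothetical cut-off, not a value taken from the paper.
    """
    levels = {parse_mic(m) for m in raw_mics}
    return "regression" if len(levels) >= min_levels else "classification"


if __name__ == "__main__":
    print(choose_framing(["<=0.5", "1", "2", "4", "8", ">8"]))
    print(choose_framing(["<=1", "2", ">2"]))
```

Working on the log2 scale reflects the doubling-dilution series in which MICs are usually measured; keeping the censoring flag separate leaves the choice of how to treat censored values to the downstream model.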

Cite

Biffignandi, G. B., Chindelevitch, L., Corbella, M., Feil, E. J., Sassera, D., & Lees, J. A. (2024). Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae. Microbial Genomics, 10(3). https://doi.org/10.1099/mgen.0.001222
