Hit series selection in noisy HTS data: clustering techniques, statistical tests and data visualisations

Christoph Müller; Daniel Ormsby; Isabella Feierberg; Ola Engkvist; Christian Tyrchan; Michael J Hartshorn

Journal Article (Open Access)

Journal of Cheminformatics (2014) 6(S1)

DOI: 10.1186/1758-2946-6-s1-p27

Abstract

High throughput screening (HTS) is one of the most prominent techniques used in the early stages of a drug discovery programme to identify the few hit compounds that can serve as starting points in subsequent studies [1,2]. However, an HTS experiment often entails a data-intensive and challenging hit prioritization process to arrive at those hit compounds. The workflow described in this study aims to make this decision-making process easier by combining the structural and biological information of the compounds used in an HTS. In particular, the workflow combines various clustering and nearest-neighbour schemes with a non-parametric statistical test in order to prioritize those groupings of compounds that are likely to be relevant to the biological target of interest [3]. The novel workflow was evaluated from several angles in a retrospective study using publicly available quantitative HTS (qHTS) datasets [4]. One of the main benchmarking criteria in this study was the ability to correctly identify as many true active compounds as possible. Therefore, different chemical descriptors and clustering schemes were tested in combination with the statistical test to measure their classification performance. The workflow was integrated into Dotmatics' Vortex, a platform for analysing chemical information using chemoinformatics methods and data visualisation tools [5]. This integration enables researchers to easily extend their current HTS workflow in order to discover new hit series and reveal hidden relationships between compounds, scaffolds and clusters.
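The idea of scoring each compound grouping with a non-parametric test can be illustrated with a minimal sketch. The abstract does not name the test or the clustering scheme, so everything below is an assumption made for illustration: a one-sided Mann-Whitney U test (normal approximation, no tie correction) stands in for the statistic, cluster labels are taken as given, and the activity values are made up. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_greater(inside, outside):
    """One-sided p-value that `inside` tends to exceed `outside`.

    Assumption: Mann-Whitney U with a normal approximation and no tie
    correction; the source only says "non-parametric statistical test".
    """
    n1, n2 = len(inside), len(outside)
    ranks = average_ranks(list(inside) + list(outside))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def rank_clusters(cluster_ids, activities):
    """Score each cluster against all other compounds; smallest p-value first."""
    groups = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        groups[cid].append(idx)
    results = []
    for cid, members in groups.items():
        member_set = set(members)
        inside = [activities[i] for i in members]
        outside = [a for i, a in enumerate(activities) if i not in member_set]
        if not outside:
            continue  # a single all-encompassing cluster cannot be compared
        results.append((cid, mann_whitney_greater(inside, outside)))
    return sorted(results, key=lambda r: r[1])


# Hypothetical data: percent inhibition for 12 compounds in 3 clusters.
cluster_ids = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
activities = [82, 75, 90, 68, 12, 5, 20, 9, 15, 30, 8, 11]

ranked = rank_clusters(cluster_ids, activities)
for cid, p in ranked:
    print(f"cluster {cid}: p = {p:.4f}")
```

In this toy data, cluster A holds the four most active compounds and therefore receives the smallest p-value. Because the test uses ranks rather than raw values, a single noisy outlier shifts a cluster's score less than it would shift a mean-based comparison, which is one reason a rank-based test is a plausible choice for noisy screening data.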

Cite

Müller, C., Ormsby, D., Feierberg, I., Engkvist, O., Tyrchan, C., & Hartshorn, M. J. (2014). Hit series selection in noisy HTS data: clustering techniques, statistical tests and data visualisations. Journal of Cheminformatics, 6(S1). https://doi.org/10.1186/1758-2946-6-s1-p27
