Out-of-sample extrapolation using semi-supervised manifold learning (OSE-SSL): Content-based image retrieval for prostate histology grading

4Citations
Citations of this article
19Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, we present an out-of-sample extrapolation (OSE) scheme in the context of semi-supervised manifold learning (OSESSL). Manifold learning (ML) takes samples with high dimensionality and learns a set of low dimensional embeddings. Embeddings generated by ML preserve nonlinear relationships between samples allowing dataset visualization, classification, or evaluation of object similarity. Semi-supervised ML (SSL), a recent ML extension, exploits known class labels to learn embeddings, which may result in greater separation between samples of different classes compared to unsupervised ML schemes. Most ML schemes utilize the eigenvalue decomposition (EVD) to learn embeddings. For instance, Graph Embedding (GE) learns embeddings by EVD on a similarity matrix that models high dimensional feature vector similarity between samples. In datasets where new samples are acquired, such as a content-based image retrieval (CBIR) system, recalculating EVD is infeasible. OSE schemes obtain new embeddings without recalculating EVD. The Nystrm method (NM) is an OSE algorithm where new embeddings are estimated as a weighted sum of known embeddings. Known embeddings must describe the embedding space for NM to accurately estimate new embeddings. In this paper, NM and semi-supervised GE (SSGE) are combined to learn embeddings which cluster samples by class and rapidly calculate embeddings for new samples without recalculating EVD. OSE-SSL is compared to (i) NM paired with GE (NM-GE), and (ii) SSGE obtained for the full database, where SSGE results represent ground truth embeddings. OSE-SSL, NM-GE, and SSGE are evaluated in their ability to: (1) cluster samples by label, measured by Silhouette Index (SI); (2) CBIR accuracy, measured by area under the precision-recall curve (AUPRC). In a synthetic Swiss roll dataset of 2000 samples, OSE-SSL requires training on 50% of the dataset to achieve SI and AUPRC similar to SSGE while NM-GE requires 70% of dataset to achieve SI and AUPRC similar to GE. For a prostate histology dataset of 888 glands, a CBIR system was evaluated on its ability to retrieve images according to Gleason Grade. OSE-SSL had AUPRC of 0.6 while NM-GE had AUPRC of 0.3. © 2011 IEEE.

Cite

CITATION STYLE

APA

Sparks, R., & Madabhushi, A. (2011). Out-of-sample extrapolation using semi-supervised manifold learning (OSE-SSL): Content-based image retrieval for prostate histology grading. In Proceedings - International Symposium on Biomedical Imaging (pp. 734–737). https://doi.org/10.1109/ISBI.2011.5872510

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free