Nonparametric clustering of functional data

Forrest Miller; James Neill; Haiyan Wang

Journal Article (Open Access)

Statistics and Its Interface (2008) 1(1) 47-62

DOI: 10.4310/sii.2008.v1.n1.a5

Abstract

This paper presents a method for effectively detecting unknown patterns or clusters in high dimensional functional data. Examples of such data include gene expression levels measured over time from microarray experiments, functional magnetic resonance imaging (fMRI), and mass spectrometry data from proteinomics, lipidomics, etc. We define clusters through the unknown high dimensional multivariate distributions of all observations along each curve. Kullback-Leibler information and Mahalanobis generalized squared distance can fail to provide a meaningful measure of distance between distributions in such a high dimensional setting. We propose a new similarity measure and an agglomerative clustering algorithm, called PCLUST, to effectively differentiate among high dimensional populations. The algorithm produces invariant results under monotone transformations of the data and does not require users to specify the number of clusters. Simulations show that PCLUST significantly outperforms 9 other popular algorithms in both clustering accuracy and robustness. An application to identifying biomarkers using time course gene expression data from Arabidopsis in response to environmental stresses is illustrated.
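The abstract names two properties of the algorithm: results that are invariant under monotone transformations of the data, and no user-specified number of clusters. The sketch below illustrates those two properties only. It is not PCLUST, whose similarity measure is not described on this page. As a stand-in, it assumes pooled dense ranks, a mean absolute rank distance, average linkage, and a hypothetical `threshold` stopping rule; all function names are illustrative.

```python
import math

def pooled_ranks(curves):
    """Replace every observation by its dense rank in the pooled sample.

    Ranks are unchanged by any strictly increasing transformation of the data.
    """
    distinct = sorted({x for curve in curves for x in curve})
    rank_of = {value: r for r, value in enumerate(distinct)}
    return [[rank_of[x] for x in curve] for curve in curves]

def rank_distance(a, b):
    """Mean absolute rank difference between two equal-length curves (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def agglomerate(curves, threshold):
    """Average-linkage merging on ranks.

    Merging stops once the closest pair of clusters is farther apart than
    `threshold` (a hypothetical stopping rule), so the number of clusters
    is an output rather than an input.
    """
    ranked = pooled_ranks(curves)
    clusters = [[i] for i in range(len(curves))]

    def linkage(c1, c2):
        total = sum(rank_distance(ranked[i], ranked[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Toy data (made up for illustration): two rising and two falling curves.
curves = [
    [0.1, 0.5, 0.9, 1.4],
    [0.2, 0.6, 1.0, 1.5],
    [3.0, 2.4, 1.9, 1.6],
    [3.1, 2.5, 2.0, 1.7],
]
original = agglomerate(curves, threshold=3.0)
# exp is strictly increasing, so the pooled ranks and the clustering are unchanged.
transformed = agglomerate([[math.exp(x) for x in c] for c in curves], threshold=3.0)
print(original, transformed == original)
```

Because the distance is computed on ranks rather than raw values, any strictly increasing rescaling (log, exp, unit changes) leaves the merge order and the final partition untouched.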

Cite

Miller, F., Neill, J., & Wang, H. (2008). Nonparametric clustering of functional data. Statistics and Its Interface, 1(1), 47–62. https://doi.org/10.4310/sii.2008.v1.n1.a5
