Today, data mining and gene expressions are at the forefront of modern data analysis. Here we introduce a novel probability distribution that is applicable in these fields. This paper develops the proposed spherical-Dirichlet distribution designed to fit vectors located at the positive orthant of the hypersphere, as it is often the case for data in these fields, avoiding unnecessary probability mass. Basic properties of the proposed distribution, including normalizing constants and moments are developed. Relationships with other distributions are also explored. Estimators based on classical inferential statistics, such as method of moments and maximum likelihood estimators are obtained. Two applications are developed: the first one uses simulated data, and the second uses a real text mining example. Both examples are fitted using the proposed spherical-Dirichlet distribution and their results are discussed.
Mendeley helps you to discover research relevant for your work.