Stratified sampling using cluster analysis

Abstract

Stratified sampling is a probability sampling method that divides the population into groups called strata. The main purpose of stratification is to reduce the variance between strata. Two problems arise: how to make the variance between strata as small as possible, and whether the available technique for forming strata actually gives the minimum variance between strata. To address these problems, we used a non-hierarchical approach, known as K-means clustering, to obtain a minimum variance between strata. K-means is the most popular data clustering method. Cluster analysis is a multivariate method that partitions a sample into groups such that the members of each group share similar characteristics. In K-means cluster analysis, the number of groups, k, is determined by the researcher. This paper considers only k = 3 and 5 and various sample sizes, n = 20 (small), 50 (medium) and 100 (large). A simulation study is carried out to generate data whose number of variables, number of clusters and sample sizes are similar to those of the real data. The simulated data are then validated against the real dataset. The results show that K-means cluster analysis gave the smallest variance compared with the available technique for every number of groups, k, and every sample size, n.
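
As a rough illustration of the idea in the abstract, the sketch below forms strata with a plain Lloyd-style K-means loop written in standard-library Python. It is not the author's implementation. The function names (kmeans_strata, within_stratum_variance), the simulated two-variable data, the cluster centres, and the seeds are all assumptions made for this example. The abstract does not give the exact variance criterion, so the sketch simply reports the variance of each variable inside each stratum.

```python
import random
from statistics import pvariance


def kmeans_strata(points, k, iters=100, seed=0):
    """Assign each point (a tuple of floats) to one of k strata (Lloyd's algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # hypothetical choice: random initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty stratum keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


def within_stratum_variance(points, labels, k):
    """Population variance of each variable inside each non-empty stratum."""
    result = {}
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            result[j] = tuple(pvariance(col) for col in zip(*members))
    return result


# Simulated data (assumed for illustration): n = 50 points, 2 variables, 3 groups.
rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = []
for i in range(50):
    cx, cy = true_centers[i % 3]
    points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))

k = 3
labels, centroids = kmeans_strata(points, k)
variances = within_stratum_variance(points, labels, k)
for j, var in sorted(variances.items()):
    size = labels.count(j)
    print(f"stratum {j}: size={size}, variance per variable={var}")
```

K-means can settle in a local optimum that depends on the initial centroids, so in practice one would rerun it with several seeds and keep the partition with the smallest variance.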

Cite

Haron, N. H. B. (2022). Stratified sampling using cluster analysis. In AIP Conference Proceedings (Vol. 2472). American Institute of Physics Inc. https://doi.org/10.1063/5.0092740
