Sampling method and sample size can alter the performance of species distribution models (SDMs). In this study, we identified an effective sampling method and sample size for modeling Korean red pine (Pinus densiflora). We compared three sampling methods (simple random sampling, stratified sampling, and area-weighted sampling), seven sample sizes (30, 50, 100, 200, 500, 1000, and 3000), and eight SDMs (GLM, GAM, CTA, ANN, GBM, RF, FDA, and MAXENT). The performance of each model was evaluated using the area under the receiver operating characteristic curve (AUC), and differences among the models were tested using ANOVA. Area-weighted sampling was the most effective and stable method. Under random and stratified sampling, model performance increased with sample size. Under area-weighted sampling, however, performance saturated once sample size exceeded 200, owing to spatial autocorrelation among samples. Performance also differed among models: RF and GBM performed best (AUC = 0.838 and 0.839, respectively), while ANN performed worst (AUC = 0.658). Sampling method and sample size should therefore be considered carefully, in light of the study objective, when selecting SDMs.
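To make the evaluation metric concrete, the following is a minimal sketch of how an AUC value can be computed from presence/absence labels and predicted suitability scores. It uses the rank-based interpretation of AUC (the probability that a randomly chosen presence site scores higher than a randomly chosen absence site, with ties counted as one half). The function name `auc` and the example scores are hypothetical and are not taken from the study, which does not describe its implementation here.

```python
def auc(scores, labels):
    """Rank-based AUC: fraction of (presence, absence) pairs in which the
    presence site receives the higher score; tied pairs count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # presence sites
    neg = [s for s, y in zip(scores, labels) if y == 0]  # absence sites
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four sites (illustrative values only).
example_scores = [0.9, 0.4, 0.6, 0.2]
example_labels = [1, 1, 0, 0]
example_auc = auc(example_scores, example_labels)  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported values between 0.658 and 0.839.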
Sung, S. Y., Lee, D. K., Park, C., Kim, H. G., Kil, S. H., Chae, H. M., … Ohga, S. (2018). Assessing Effective Sampling Method and Sample Size for Species Distribution Modeling of Korean Red Pine (Pinus densiflora). Journal of the Faculty of Agriculture, Kyushu University, 63(2), 211–221. https://doi.org/10.5109/1955384