Sensitivity analysis to configuration option settings in a selection of species distribution modelling algorithms

Abstract

In pursuit of more robust provenance in the field of species distribution modelling, an extensive literature search was undertaken to find the typical default values, and the range of values, for configuration settings of several of the most commonly used statistical algorithms for constructing species distribution models (SDMs), as implemented in R packages (such as Dismo and Biomod2) or in other species distribution modelling programs such as Maxent. We found that documentation of SDM algorithm configuration option settings in the SDM literature is very uncommon, and that the justifications for these settings, when present, were minimal. Such settings were often the R default values or the result of trial and error. This is potentially concerning for several reasons. It detracts from the robustness of the provenance of such SDM studies, and a lack of documentation of configuration option settings in a paper prevents replication of the experiment, which contravenes one of the main tenets of the scientific method. Inappropriate or uninformed configuration option settings are particularly concerning if they represent a poorly understood ecological variable or process and if the algorithm is sensitive to such settings; this could result in erroneous and/or unrealistic SDMs. We test the sensitivity of two commonly used SDM algorithms, Random Forests and Boosted Regression Trees, to variation in their configuration option settings. A process of expert elicitation was used to derive a range of appropriate values with which to test the sensitivity of the algorithms. We chose species occurrence records for the Koala (Phascolarctos cinereus) for our sensitivity tests, since the species has a well-known distribution. Results were assessed by comparing the geospatial distribution predicted by each sensitivity-test SDM (i.e. altered settings) with that of the control SDM (i.e. default settings), using a geographical information system (QGIS). In addition, two performance measures were used to compare the altered-settings SDMs with the control. The aim of our study was to draw conclusions about how reliable reported SDM results may be in light of the sensitivity of their algorithms to certain settings, given the often arbitrary nature of such settings and the lack of awareness of, and/or attention to, this issue in most of the published SDM literature. Our results indicate that both algorithms tested were sensitive to alternative values for some of their settings. This study has therefore shown that the choice of configuration option settings in Random Forests and Boosted Regression Trees has an impact on the results, and that assigning suitable values for these settings is a relevant consideration; as such, the settings should always be published along with the model.
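
The comparison of altered-settings SDMs against a control SDM can be illustrated with a small one-at-a-time sensitivity loop. This is only a sketch under stated assumptions, not the authors' workflow: the function names, the `toy_model` stand-in, its `gain` and `smoothing` settings, and the 0.5 presence threshold are all hypothetical. A real study would fit Random Forests or Boosted Regression Trees to occurrence records and compare the resulting maps.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Config = Dict[str, float]
# A "model" here is any function that maps settings to one suitability score per grid cell.
Model = Callable[[Config], List[float]]


def one_at_a_time_configs(
    defaults: Config, alternatives: Dict[str, Sequence[float]]
) -> Iterator[Tuple[str, float, Config]]:
    """Yield configurations that differ from the defaults in exactly one setting."""
    for name, values in alternatives.items():
        for value in values:
            if value == defaults[name]:
                continue  # identical to the control run, nothing to compare
            altered = dict(defaults)
            altered[name] = value
            yield name, value, altered


def fraction_changed(
    control: List[float], altered: List[float], threshold: float = 0.5
) -> float:
    """Share of grid cells whose presence/absence call differs from the control."""
    flips = sum((c >= threshold) != (a >= threshold) for c, a in zip(control, altered))
    return flips / len(control)


def sensitivity_table(
    fit_predict: Model, defaults: Config, alternatives: Dict[str, Sequence[float]]
) -> List[Tuple[str, float, float]]:
    """Compare every altered-settings run with the default-settings control."""
    control = fit_predict(defaults)
    return [
        (name, value, fraction_changed(control, fit_predict(config)))
        for name, value, config in one_at_a_time_configs(defaults, alternatives)
    ]


def toy_model(config: Config) -> List[float]:
    """Hypothetical stand-in for an SDM: 10 cells along an environmental gradient.

    'gain' rescales suitability; 'smoothing' is deliberately ignored so that it
    behaves like a setting the model is insensitive to.
    """
    return [min(1.0, (i / 10) * config["gain"]) for i in range(10)]


if __name__ == "__main__":
    defaults = {"gain": 1.0, "smoothing": 1.0}
    alternatives = {"gain": [0.5, 1.0, 2.0], "smoothing": [1.0, 5.0]}
    for name, value, changed in sensitivity_table(toy_model, defaults, alternatives):
        print(f"{name}={value}: {changed:.0%} of cells differ from control")
```

Changing one setting at a time keeps each difference attributable to a single option, at the cost of ignoring interactions between settings.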

Citation

Hallgren, W., Santana, F., Low-Choy, S., Rehn, J. H. K., & Mackey, B. (2017). Sensitivity analysis to configuration option settings in a selection of species distribution modelling algorithms. In Proceedings - 22nd International Congress on Modelling and Simulation, MODSIM 2017 (pp. 50–56). Modelling and Simulation Society of Australia and New Zealand Inc. (MSSANZ). https://doi.org/10.36334/modsim.2017.a1.hallgren
