Machine learning and the problem of prediction and explanation in ecological modelling

by David Stockwell

Abstract

Machine learning is a field of artificial intelligence research where computer programs simulate the acquisition and application of human knowledge. This thesis applies machine learning to model development, where model development is defined as the formation of symbolic representations of reality given data or examples. The prediction/explanation problem is defined as the problem of achieving prediction and explanation simultaneously with a model. The goal of the research was to address the prediction/explanation problem by developing a machine learning system that produced predictive and explanatory models of the response of animals to the environment.

A number of machine learning systems were examined. Decision tree induction algorithms and a Bayesian classifier system gave good predictions, but their explanatory ability was low. Theoretical examination revealed that prediction and explanation are complex tasks, with a number of different components, defined by a formal learning paradigm. A range of distinct forms of prediction and forms of explanation were identified. A system for induction of rule-sets called GARP was developed and used to illustrate the character, cause and methods of controlling the prediction/explanation problem in habitat analysis. Resampling was used to determine predictive accuracy and rules for providing explanations. Finally, GARP was applied to problems in wildlife management such as finding the factors that predict the distribution of wildlife, the environmental factors necessary for wildlife, and the factors controlling wildlife distribution.

The results show that there are a number of fundamental difficulties in developing models for prediction and explanation, some of which contribute to reduction in both prediction and explanation, and some that reduce one or the other. However, no fundamental limitations to prediction and explanation were found given the definitions of prediction and explanation used in this study.
Adoption of machine learning methods and the formal learning paradigm assists in overcoming these difficulties and finding the underlying model reliably. The pragmatic and scientific benefits of this work are the development of machine learning both as a tool for automated modelling and as a practical method of inquiry into the acquisition of knowledge.

Technical Summary

The first manuscript is a review of some existing applications of machine learning to ecology and functions as a tutorial introduction to the field of machine learning (Manuscript 1). The second manuscript introduces the prediction/explanation problem and sets out a research strategy for understanding and finding a solution to the problem in ecological modelling (Manuscript 2). The next three papers examine two existing methods for predictive and explanatory adequacy: decision tree induction and the Bayesian classifier. Qualitative modelling of biotic response using decision trees is introduced (Manuscript 3). In a comparison of a number of modelling methods, including linear models and expert system development, decision trees were found to predict well but were poor at explaining biotic response (Manuscript 4). A new system was developed based on a Bayesian classifier (Manuscript 5). This matrix-based probabilistic method allows rapid development of expert systems due to incremental and integrated knowledge acquisition. However, it contained a number of limiting assumptions that could make prediction unreliable, and its potential for providing explanations from analysis of data was low.

Following the failure of the previous methods, it was clear that an analysis of the nature of prediction and explanation was needed. Using the Bayesian classifier in an empirical study, a number of forms of prediction were identified, e.g. resubstitution, resampling, forecasting, extrapolation and transmission (Manuscript 6).
Similarly, explanation was found to be a complex and difficult concept to define, although the notion of an underlying structure provides some guidance (Manuscript 7). Rules were identified as explanatory structures that support prediction. Support for rules as explanations in ecology was obtained through a survey (Manuscript 8). A rule-set induction system was developed with the potential for prediction and explanation (Manuscript 9). Issues of system design for achieving prediction and explanation were examined in an empirical evaluation of natural and artificial data sets (Manuscript 10). Finally, the explanatory ability of the system was compared with linear regression for explaining the presence and absence of waterbirds on disused quarry pits (Manuscript 11).

Achieving explanation and prediction using a machine learning approach is discussed in the conclusion (Manuscript 12). The characteristics and causes of the prediction/explanation problem are discussed. While there are a number of fundamental difficulties in achieving combined prediction and explanation, no fundamental limitations to achieving this goal were discovered. These difficulties can be ameliorated through the increased computational resources, better-defined model development methodology, and data-driven approach to model development that machine learning methods provide.

Contents

Section 1. Introduction and review of the problem and methods
  Introduction
  Manuscript 1. Machine learning in ecological modelling
  Manuscript 2. Prediction versus explanation? - a research program for machine learning

Section 2. Evaluation of existing methods
  Manuscript 3. Modelling of qualitative data via machine learning
  Manuscript 4. Using induction of decision trees to predict greater glider density
  Manuscript 5. LBS: Bayesian learning system for rapid expert system development

Section 3. Theoretical development of prediction and explanation
  Manuscript 6. Learning to predict: an empirical evaluation of the Bayesian classifier as a general predictive system
  Manuscript 7. A structural view of scientific explanation by explanatory modelling systems in ecology
  Manuscript 8. A survey of forms of explanation used in ecology

Section 4. Rule set induction for prediction and explanation
  Manuscript 9. Induction of sets of rules from animal distribution data: a robust and informative method of data analysis
  Manuscript 10. The effect of bias on prediction and explanation in induced rule sets
  Manuscript 11. Using rule sets to explain animal response to the environment

Section 5. Conclusions
  Manuscript 12. Nature and solutions to the prediction/explanation problem in ecology

References
Appendix 1. Publications during doctoral research
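
The technical summary names resubstitution and resampling as distinct forms of prediction. The minimal sketch below illustrates why the two can give different accuracy figures. It is not taken from the thesis: the synthetic presence/absence data, the one-variable nearest-neighbour classifier, and every function name are assumptions chosen for illustration, and it uses neither GARP nor the Bayesian classifier described above.

```python
import random


def nearest_neighbour_predict(train, x):
    """Return the label of the training row whose value is closest to x (1-NN)."""
    return min(train, key=lambda row: abs(row[0] - x))[1]


def accuracy(train, test):
    """Fraction of test rows whose label the 1-NN model reproduces."""
    hits = sum(nearest_neighbour_predict(train, x) == y for x, y in test)
    return hits / len(test)


def make_data(n, rng):
    """Hypothetical data: presence (1) becomes likelier as the variable x rises."""
    rows = []
    for _ in range(n):
        x = rng.random()
        y = 1 if rng.random() < x else 0
        rows.append((x, y))
    return rows


def k_fold_accuracy(data, k):
    """Resampling estimate: mean accuracy on folds held out from training."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [row for j, row in enumerate(data) if j % k != i]
        scores.append(accuracy(train, test))
    return sum(scores) / k


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = make_data(200, rng)

# Resubstitution: score the model on the same rows it was built from.
# Each row is its own nearest neighbour, so 1-NN reproduces every label.
resubstitution = accuracy(data, data)

# Resampling: score only on rows that were withheld from the model.
resampled = k_fold_accuracy(data, 5)

print(f"resubstitution accuracy: {resubstitution:.2f}")
print(f"5-fold resampled accuracy: {resampled:.2f}")
```

Because the labels are noisy, the resampled figure falls below the resubstitution figure. The gap shows how scoring a model on its own training data can overstate its predictive accuracy on unseen cases.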
