A Bayesian approach to combining ...
515 Animal Biodiversity and Conservation 27.1 (2004) �� 2004 Museu de Ci��ncies Naturals ISSN: 1578���665X Brooks, S. P., King, R. & Morgan, B. J. T., 2004. A Bayesian approach to combining animal abundance and demographic data. Animal Biodiversity and Conservation, 27.1: 515���529. Abstract A Bayesian approach to combining animal abundance and demographic data.��� In studies of wild animals, one frequently encounters both count and mark���recapture���recovery data. Here, we consider an integrated Bayesian analysis of ring���recovery and count data using a state���space model. We then impose a Leslie��� matrix���based model on the true population counts describing the natural birth���death and age transition processes. We focus upon the analysis of both count and recovery data collected on British lapwings (Vanellus vanellus) combined with records of the number of frost days each winter. We demonstrate how the combined analysis of these data provides a more robust inferential framework and discuss how the Bayesian approach using MCMC allows us to remove the potentially restrictive normality assumptions commonly assumed for analyses of this sort. It is shown how WinBUGS may be used to perform the Bayesian analysis. WinBUGS code is provided and its performance is critically discussed. Key words: Census data, Integrated analysis, Kalman filter, Logistic regression, Ring���recovery data, State��� space model, WinBUGS. Resumen Aproximaci��n bayesiana para combinar abundancia y datos demogr��ficos.��� En estudios de animales salvajes, es frecuente encontrarse tanto con datos de recuento como datos de marcaje���recaptura��� recuperaci��n. En el presente estudio consideramos un an��lisis integrado bayesiano de recuperaci��n de anillas y datos de recuento utilizando un modelo de estado���espacio. Seguidamente aplicamos un modelo basado en las matrices de Leslie en los recuentos de poblaci��n verdadera para describir los procesos naturales de nacimiento���muerte y de transici��n de edades. Nos centramos en el an��lisis de los datos de recuento y de recuperaci��n recopilados en avefr��as europeas (Vanellus vanellus) en combinaci��n con los registros del n��mero de d��as de helada de cada invierno. Demostramos c��mo el an��lisis combinado de estos datos proporciona un marco inferencial m��s s��lido, y discutimos c��mo el enfoque bayesiano usando MCMC nos permite eliminar los supuestos de normalidad potencialmente restrictivos que suelen adoptarse en an��lisis de este tipo. Se demuestra c��mo puede utilizarse WinBUGS para realizar el an��lisis bayesiano. Se facilita el c��digo WinBUGS, y se discute su funcionamiento. Palabras clave: Datos de censo, An��lisis integrado, Filtro de Kalman, Regresi��n log��stica, Datos de recuperaci��n de anillas, Modelo de estado���espacio, WinBUGS. S. P. Brooks, The Statistical Laboratory���CMS, Univ. of Cambridge, Wilberforce Road, Cambridge CB3 0WB, U.K.��� R. King, CREEM, Univ. of St. Andrews, Buchanan Gardens, St. Andrews KY16 9LZ, U.K.��� B. J. T. Morgan, Inst. of Matemathics, Statistics and Actuarial Science, Univ. of Kent, Canterbury CT2 7NF, U.K. A Bayesian approach to combining animal abundance and demographic data S. P. Brooks, R. King & B. J. T. Morgan
516 Brooks et al. in the case of lapwings, the index data do not provide a formal census of a national population, but may be regarded as estimating the population size for the set of sites at which observations take place. We also introduce the number of days each year that a Central England temperature fell below zero, as a covariate to help describe the variation in survival over time. We begin with a description of the population index data and associated model. Population index data The population index data are derived from the Common Birds Census (CBC) which has been the main source of information on population levels for common British birds since it was es- tablished in 1962. More recently it has been replaced by the Breeding Bird Survey. Annual counts are made at a number of sites around the UK and from these an index value is calculated based upon a statistical analysis of the data collected (Ter Braak et al., 1994). The raw data are not used, but the index provides a measure of the population level, taking account of the fact that each year only a small proportion of sites are actually surveyed. We shall consider the analysis of single���site data in Section 4. Here, we analyse the index values collected for adult females from 1965 to 1998 inclusive and we omit data from earlier years of the study during which the index protocol was being standardised. The data are plotted in (Besbeas et al., 2002). We denote the index value for year t by yt and, for consistency with the ring���recovery data described later, we associate the year 1963 with the value t = 1, so that we actually observe the values y3,...,y36. Since these yt are only estimates, we first try to estimate the true underlying population levels that we will subsequently use as input into our system model. Here we shall assume that yt - N ( Na,t, "y2) (1) where Na,t represents the true underlying numbers of adult females aged $ $ $ $ $ 2 years at time t. Here "y2 is taken as a constant variance, although other assumptions could also be made. For an index, a constant variance seems reasonable we do not have access to the estimated standard errors re- sulting from the separate statistical analysis that has resulted in the population index. Note that we estimate "y2 from the index data, and not from the raw survey data. This then describes the observa- tion process by which the estimates yt are derived from the true underlying process Na,t. We next need to describe the underlying system process which provides a model for the evolution of the true underlying population size over time. We follow the notation of Besbeas et al. (2002), rather than Durbin & Koopman (1997). A natural model would be to assume that Na,t - Bin (N1, t���1 + Na,t���1, &a,t���1) Introduction Studies of wildlife populations often result in different forms of data being collected from different sources. Useful data comprise capture���recapture data (of live animals), ring���recovery data (of dead animals), ra- dio���tagging (where the state of each animal is known at all times), data on productivity (as in nest���record data, for example), location data and/or count data (estimates of total population size). By combining data from different sources, we obtain more robust (and self���consistent) parameter estimates that fully reflect the information available. Previous studies of combined data of this sort include the analysis of joint capture���recapture and ring���recovery data (Catchpole et al., 1998 King & Brooks, 2002b), multi���site data (King & Brooks, 2002a King & Brooks, 2003) and joint ring���recovery and either census data or population indices (Besbeas et al., 2002). In this paper, we shall consider a Bayesian analy- sis of joint ring���recovery and population index data, revisiting the analysis of Besbeas et al., 2002. We demonstrate how the state space model used to describe the index data can be easily fitted using Markov chain Monte Carlo (MCMC Gamerman, 1995 Gilks et al., 1996 Brooks, 1998) and implemented via WinBUGS (Spiegelhalter et al., 2002b Gentleman, 1997 Link et al., 2002). An appendix provides code to analyse the dataset described here. MCMC methods provide an alternative to the Kalman filter based approaches typically applied to problems of this sort. They also permit more general modelling frameworks for cases where the usual normality and linearity assumptions are not appropriate. We begin in "Data and modelling" section with an introduction to the data and of the models we will use here. In "Analysis and results" section we describe the Bayesian analysis of these data using WinBUGS and provide estimates for key param- eters of interest. In "Non���mortality" section we provide an example where the Bayesian analysis is more appropriate due to the small count values. Finally, in "Discussion" section we discuss the use of WinBUGS, both for the application of this paper, and more generally. Data and modelling The British lapwing (Vanellus vanellus) population has been declining over recent years and has been placed on the "amber" list of species of conserva- tion concern in Britain. As such, it has received a great deal of attention over recent years (Tucker et al., 1994) not least because it can be regarded as an "indicator" species in that by understanding the reasons for its decline, we might gain insight into the dynamics of similar farmland birds. We have two distinct sources of data, both of which are provided by the British Trust for Ornithology (BTO): index data providing annual population size esti- mates and recovery data from birds ringed as chicks and subsequently reported dead. Note that
Animal Biodiversity and Conservation 27.1 (2004) 517 where N1,t denotes the number of females of age 1 in year t and &a,t denotes the adult survival rate in year t. We note here that lapwings are considered adult after year 1 of life. Thus, the number of adults aged $ 2 years in year t is derived directly from the number of adults and birds in their first year of life in the previous year which survive from t ��� 1 to t. In a similar manner, we might model the number of 1��� year old females in year t by N1,t - Po (Na,t���1 !t���1 &1,t���1) where &1,t denotes the first���year survival rate in year t and !t denotes the productivity rate in year t i.e., the average number of female offspring per adult fe- male. We therefore assume that breeding begins at age 2. Thus, the number of birds aged 1 in year t stems directly from the number of chicks produced the year before which then survive from t ��� 1 to t. Traditionally, this model is difficult to fit classi- cally as it falls beyond the standard normal frame- work (Durbin & Koopman, 1997). Thus, we adopt instead the common normal approximation in which we take (2) where the 1,t and a,t are assumed to be independ- ent and Normally distributed, each with mean zero and variance "21,t and "2a,t, respectively. To approxi- mate the Poisson/Binomial model above, we take "21,t = Na,t���1 !t���1 &1,t���1 "2a,t = (N1,t���1 + Na,t���1) &a,t���1(1 ��� &a,t���1) See Sullivan (1992), Newman (1998) and Besbeas et al. (2002) for example. It is worth noting here that though the model depends upon the survival rates, there is typically very little information in the data with which to estimate them. In order to provide additional infor- mation, we can combine these data with those from a recovery study which provides far greater infor- mation on the survival rates. Recovery data To augment the index data, we also have recovery data from lapwings ringed as chicks between 1963 and 1997 and later found dead and reported be- tween 1964 and 1998. Adult birds were also ringed as part of the study, but they make up a very small proportion of the total dataset and are ignored. The data are reproduced in Besbeas et al. (2001). Here, we denote the observed recovery data by , t1 = 1,...,35, t2 = t1 + 1,...,37, where denotes the number of animals released at the beginning of year t1 and subsequently recovered (dead) in the year up to the end of year t2 for t2 [ 36 and denotes the number of animals ringed in year t1 and never subsequently returned. We then assume that for each t1, the values , t2 = t1 + 1,...,37 follow a multinomial distribution with proportions which denote the probability that a chick ringed in year t1 is subsequently returned in year t2. Here we shall assume, as for the index data, that adults and first years have different time��� varying survival rates, but common time���varying recovery rate t denoting the probability that a bird which dies in year t is recovered. See Besbeas et al. (2002) for further details and assumptions underpinning this model. Under this model, we have that: t2 = t1 + 2,...,36 (3) and . Throughout this paper, we follow the convention that a null sequence has sum 0 and product 1. Thus, in the formula (3) for , the product term is 1 when t2 = t1 + 2. Incorporating covariates As well as the index and recovery data, we also have a variety of weather covariates that we can use to try to explain the variation of our model parameters over time. Of particular relevance are the number of frost days (i.e., the number of days during which a Central England temperature went below freezing) each year. For year t, ft denotes the number of days below freezing between April of year t and March of year (t + 1), inclusive. This covariate was used by Besbeas et al., 2002. The survival probability of wild birds is likely to be more affected by lengthy cold periods rather than by low average temperatures, which might result from rela- tively short cold spells. Thus, we take logit &1,t = (4) and logit &a,t = (5) We expect to encounter a decline over time of the reporting probability of dead animals (see e.g., Baillie & Green, 1987) and in addition we are interested in seeing whether productivity might change over time. It should be noted that we base our models on the model selected by Besbeas et al. (2002), and do not carry out a model compari- son. That will be the subject of further work (King et al., 2004). Thus, here, we set logit t = + t (6) and, since productivity is constrained only to be positive and not simply to values in [0,1], we have log !t = ! + ! t (7) Thus the model components (index, recovery and covariates) can then be combined to provide a single comprehensive analysis.