Abstract
Generalized linear models (GLMs) are used when the variance is not constant and the errors are not normally distributed. Some ecological and entomological response variables invariably violate these two standard assumptions, and GLMs are well suited to dealing with them. Three GLM distribution families, (1) Linear, (2) Poisson and (3) Gamma, were fitted to the null, reduced and full models with the log link function. The data were derived from a study on the cabbage flea beetle (Psylliodes chrysocephala L.). Based on the residual deviance (goodness of fit) and the Akaike information criterion (AIC) as an estimator of model quality, the Gamma GLM was the best fit for the data set. Both the AIC and the deviance were low for the Gamma model, while high values were noted for the Poisson and Linear GLMs. Our study confirms that severe skewness often exists in data sets pertaining to parasitology and entomology. The Gamma distribution provided a better and more robust alternative than the Poisson and Linear models. The Poisson distribution is mostly used to model the count of events occurring within a given time interval. The Poisson and Linear GLMs did not fit the data set well, as was evident from their high scaled deviance ($G^2$).
Introduction
Data that deviate from the normal distribution are frequently encountered by field entomologists, biologists and ecologists. Although Analysis of Variance (ANOVA) has been widely used in data analysis, abundance and incidence data often violate its assumptions [1]. Most data pertaining to insect or weed abundance do not meet the assumptions of normality and homogeneity of variance [1-4]. Appropriate analytical tools are therefore needed to analyse data that are neither normally distributed nor linear.
The Generalized Linear Model (GLM) offers an alternative for handling such skewness, since it provides a unified approach that covers other common statistical procedures [5]. The traditional linear model assumes that errors are normally distributed [6]. GLMs, as a class of statistical models, provide an abstract and simplified representation of real data. They are called generalized because they extend the classical linear models based on the normal distribution [5]. In addition to the linear regression component, a GLM specifies a response distribution from the exponential family and a "link function" that transforms the mean and links the regression part to the mean of that distribution [6, 7]. A GLM combines the systematic and random components of a linear model and has three characteristics: (i) a dependent variable $z$ whose distribution, with parameter $\theta$, belongs to this class; (ii) a set of independent variables $x_1, \ldots, x_m$ and a predicted systematic component $Y = \sum_{i=1}^{m} \beta_i x_i$; and (iii) a linking function $\theta = f(Y)$ connecting the parameter $\theta$ of the distribution of $z$ with the $Y$'s of the linear model [6].
Different statistical criteria are used to assess model quality and fit. The analysis of deviance, based on the residual deviance, is a good test of a model's goodness of fit [6, 8, 9]. The Akaike information criterion (AIC), on the other hand, estimates the quality of both linear models and GLMs [10]. Both AIC and the Bayesian information criterion (BIC) are used for model comparison [11]. Burnham and Anderson [12] emphasised the use of AIC and BIC for comparing statistical models and transformations. Sileshi [1] and Yabeja [13] used the Poisson regression model to analyse psyllid and whitefly populations.
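The GLM components and fit criteria described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the function names, coefficients and observations are hypothetical, chosen only to show how a linear predictor, a log link, the standard Poisson and Gamma residual deviances, and the AIC formula fit together.

```python
import math

def linear_predictor(beta, x):
    """Systematic component: sum of beta_i * x_i over the predictors."""
    return sum(b * xi for b, xi in zip(beta, x))

def inverse_log_link(eta):
    """Log link: log(mu) = eta, so the fitted mean is mu = exp(eta)."""
    return math.exp(eta)

def poisson_deviance(y, mu):
    """Residual deviance for a Poisson model: 2 * sum[y*log(y/mu) - (y - mu)]."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y*log(y/mu) is taken as 0 when y == 0 (its limiting value)
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

def gamma_deviance(y, mu):
    """Unscaled residual deviance for a Gamma model (requires y > 0):
    2 * sum[(y - mu)/mu - log(y/mu)]."""
    return 2.0 * sum((yi - mi) / mi - math.log(yi / mi) for yi, mi in zip(y, mu))

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*logL (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

# Hypothetical illustration: an intercept plus one predictor.
beta = [0.5, 0.2]                      # assumed coefficients, not estimates from the study
rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [2.0, 2.0, 3.0]                    # made-up observations

mu = [inverse_log_link(linear_predictor(beta, x)) for x in rows]
print("fitted means:", mu)
print("Poisson deviance:", poisson_deviance(y, mu))
print("Gamma deviance:", gamma_deviance(y, mu))
```

Both deviances are zero when the fitted means reproduce the observations exactly and grow as the fit worsens, which is why a lower residual deviance, like a lower AIC, points to the better-fitting model.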
Iamba, K. (2022). Data analysis of flea beetle (Psylliodes chrysocephala L.): comparing three (3) distribution families of Generalized Linear Model. Journal of Entomology and Zoology Studies, 10(1), 388–394. https://doi.org/10.22271/j.ento.2022.v10.i1e.8958