One of the main problems of quantitative analytical chemistry is to estimate the concentration of one or more species from the values of certain physicochemical properties of the system of interest. This requires constructing a calibration model, i.e., determining the relationship between the measured properties and the concentrations. Multivariate calibration is one of the most successful applications of statistical methods to chemical data, in both analytical and theoretical chemistry. Commonly used methods include Artificial Neural Networks (ANN), Nonlinear Partial Least Squares (N-PLS), Principal Components Regression (PCR), and Multiple Linear Regression (MLR). In addition to multivariate calibration methods, sample selection algorithms are used. These algorithms choose a subset of samples for the training set that adequately covers the sample space. On the other hand, modern scanning instruments typically measure a wide spectrum for each sample, generating hundreds of variables. Search algorithms have been used to identify the variables that contribute useful information about the dependent variable in the model. This paper proposes a Genetic Algorithm based on Double Chromosome (GADC) that performs both tasks, sample selection and variable selection, simultaneously. The results were compared with those of well-known algorithms for sample and variable selection: Kennard-Stone, Partial Least Squares, and the Successive Projections Algorithm. We show that the proposed algorithm can obtain better calibration models in a case study involving the determination of protein content in wheat samples.
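The idea of selecting samples and variables at the same time can be illustrated with a minimal sketch. The abstract does not specify the encoding, the operators, or the fitness function of GADC, so everything below is an assumption made for illustration: each individual carries two binary chromosomes (one mask over calibration samples, one mask over variables), fitness is the validation RMSE of an MLR model fitted on the selected samples and variables, and the hypothetical function `gadc_sketch` uses binary tournament selection, one-point crossover, bit-flip mutation, and elitism. It should not be read as the authors' implementation.

```python
import numpy as np

def rmse_of_mlr(X_train, y_train, X_val, y_val):
    """Fit MLR (least squares with intercept) and return the validation RMSE."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.column_stack([np.ones(len(X_val)), X_val]) @ coef
    return float(np.sqrt(np.mean((pred - y_val) ** 2)))

def fitness(sample_mask, var_mask, X_cal, y_cal, X_val, y_val):
    """Assumed fitness: validation RMSE of MLR on the selected subset."""
    n_vars = int(var_mask.sum())
    # Reject individuals with no variables or too few samples to fit MLR.
    if n_vars == 0 or int(sample_mask.sum()) <= n_vars + 1:
        return float("inf")
    X_sel = X_cal[sample_mask][:, var_mask]
    return rmse_of_mlr(X_sel, y_cal[sample_mask], X_val[:, var_mask], y_val)

def crossover(a, b, rng):
    """One-point crossover of two boolean chromosomes (length >= 2)."""
    point = rng.integers(1, len(a))
    return np.concatenate([a[:point], b[point:]])

def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return np.logical_xor(mask, rng.random(len(mask)) < rate)

def gadc_sketch(X_cal, y_cal, X_val, y_val, pop_size=30, generations=40,
                mutation_rate=0.05, seed=0):
    """Illustrative double-chromosome GA; returns (sample_mask, var_mask, rmse)."""
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X_cal.shape

    def score(ind):
        return fitness(ind[0], ind[1], X_cal, y_cal, X_val, y_val)

    # Each individual is a pair: (sample chromosome, variable chromosome).
    pop = [(rng.random(n_samples) < 0.7, rng.random(n_vars) < 0.3)
           for _ in range(pop_size)]

    for _ in range(generations):
        scores = [score(ind) for ind in pop]

        def tournament():
            i, j = rng.integers(0, pop_size, size=2)
            return pop[i] if scores[i] <= scores[j] else pop[j]

        new_pop = [pop[int(np.argmin(scores))]]  # elitism: keep the best
        while len(new_pop) < pop_size:
            (s1, v1), (s2, v2) = tournament(), tournament()
            # The two chromosomes are recombined and mutated independently.
            child = (mutate(crossover(s1, s2, rng), mutation_rate, rng),
                     mutate(crossover(v1, v2, rng), mutation_rate, rng))
            new_pop.append(child)
        pop = new_pop

    best = min(pop, key=score)
    return best[0], best[1], score(best)
```

In this sketch the two chromosomes evolve independently under a shared fitness value, which is one simple way to couple the two selection problems. Using a separate validation set for the fitness is also an assumption; the paper may evaluate candidates differently.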
Santiago, K. de S., Soares, A. S., De Lima, T. W., Coelho, C. J., & Gabriel, P. H. R. (2015). Genetic algorithm for variable and samples selection in multivariate calibration problems. Journal of Computer Science, 11(4), 621–626. https://doi.org/10.3844/jcssp.2015.621.626