Using a low-cost sensor array and machine learning techniques to detect complex pollutant mixtures and identify likely sources

Abstract

An array of low-cost sensors was assembled and tested in a chamber environment in which several pollutant mixtures were generated. Four classes of sources were simulated: mobile emissions, biomass burning, natural gas emissions, and gasoline vapors. A two-step regression and classification method was developed and applied to the sensor data from this array. Regression models were first applied to estimate the concentrations of several compounds; classification models were then trained to use those estimates to identify the presence of each source. The regression models included forms of multiple linear regression, random forests, Gaussian process regression, and neural networks. Those regression models with human-interpretable outputs were examined to understand the utility of each sensor signal. The classification models included logistic regression, random forests, support vector machines, and neural networks. The best combination of models was determined by maximizing the F1 score on ten-fold cross-validation data. The highest F1 score, as calculated on testing data, was 0.72 and was produced by the combination of a multiple linear regression model utilizing the full array of sensors and a random forest classification model.
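
The abstract selects models by F1 score, which is the harmonic mean of precision and recall. The sketch below shows how that metric might be computed for a single source class. It is not the authors' code: the function name `f1_score` and the label vectors are hypothetical, chosen only to illustrate the arithmetic.

```python
def f1_score(y_true, y_pred):
    """Return F1 = 2 * precision * recall / (precision + recall).

    Labels are binary: 1 means the source is present, 0 means absent.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct detections: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for one source class (not data from the paper).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75. In a model-selection setting, a score like this would be averaged across cross-validation folds for each candidate pairing of regression and classification models.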

Thorson, J., Collier-Oxandale, A., & Hannigan, M. (2019). Using a low-cost sensor array and machine learning techniques to detect complex pollutant mixtures and identify likely sources. Sensors (Switzerland), 19(17). https://doi.org/10.3390/s19173723
