What Influences Low-cost Sensor Data Calibration? - A Systematic Assessment of Algorithms, Duration, and Predictor Selection

Lu Liang; Jacob Daniels

Journal Article, Open Access

Aerosol and Air Quality Research (2022) 22(9)

DOI: 10.4209/aaqr.220076

Abstract

Low-cost sensors have changed the air quality monitoring paradigm by enabling efficient network expansion and community engagement. The surge in their use has sparked new research interest in their data quality. Many studies have employed field calibration to improve sensor agreement with co-located reference monitors. Yet studies that systematically examine the performance of different calibration techniques are limited in scope and depth. This study comprehensively assessed ten widely used data-driven calibration techniques: AdaBoost, Bayesian ridge, gradient tree boosting, K-nearest neighbors, Lasso, multivariable linear regression, neural network, random forest, ridge regression, and support vector machine. We compared their performance on a standardized baseline dataset and their responses to various parameter combinations. We further assessed the effect of training sample size to understand the optimal duration of field calibration for achieving good accuracy. Finally, we tested different predictor combinations to address whether including more predictors leads to better performance. On the baseline data, the neural network achieved the best performance, followed by the four regression-based methods, which showed very consistent and stable performance. While this confirms that the latest research trend is deep learning, regression is still a viable option for studies with limited effort available for parameter tuning and method selection, especially considering its computational efficiency and simplicity. The sample size effect is most evident when the sample size drops below 30%, which is equivalent to six weeks of continuously collected hourly data. Although algorithms react differently to the number of predictors, their performance was typically boosted by adding more predictors, especially particle count and humidity. Our study not only describes an approach to sophisticated data-driven calibration for practical applications, but also provides insights into the compounding impacts of parameters, samples, and predictors on algorithm performance.
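The three comparisons described in the abstract (algorithm, training sample size, and predictor set) can be illustrated with a minimal sketch. It is not the authors' code or data: it assumes scikit-learn as the toolkit, uses purely synthetic readings, and covers only three of the ten algorithms. All variable names, coefficients, and the 75/25 hold-out split are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000  # hypothetical number of hourly co-located records

# Synthetic stand-in data (assumption): the low-cost sensor over-reads
# at high humidity and drifts slightly with temperature.
reference = rng.gamma(shape=2.0, scale=6.0, size=n)  # "true" concentration
humidity = rng.uniform(20, 95, size=n)
temperature = rng.normal(20, 8, size=n)
sensor = (reference * (1 + 0.008 * (humidity - 50))
          + 0.05 * (temperature - 20)
          + rng.normal(0, 1.5, size=n))

# Predictor combinations to compare.
predictor_sets = {
    "sensor only": np.column_stack([sensor]),
    "sensor + humidity": np.column_stack([sensor, humidity]),
    "sensor + humidity + temperature": np.column_stack(
        [sensor, humidity, temperature]),
}

# A small subset of calibration algorithms.
models = {
    "linear": LinearRegression(),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fractions of the training pool, mimicking shorter calibration periods.
train_fractions = [0.1, 0.3, 1.0]

n_pool = int(0.75 * n)  # first 75% is the training pool, last 25% is the test set


def evaluate(model, X, y, fraction):
    """Fit on the first `fraction` of the training pool; score on the hold-out."""
    n_train = max(10, int(fraction * n_pool))
    fitted = clone(model).fit(X[:n_train], y[:n_train])
    return r2_score(y[n_pool:], fitted.predict(X[n_pool:]))


results = {}
for model_name, model in models.items():
    for set_name, X in predictor_sets.items():
        for fraction in train_fractions:
            r2 = evaluate(model, X, reference, fraction)
            results[(model_name, set_name, fraction)] = r2
            print(f"{model_name:14s} {set_name:32s} "
                  f"train={fraction:4.0%}  R2={r2:.3f}")
```

The printed grid shows how each algorithm responds as the training fraction shrinks and as predictors are added. Because the data are synthetic, the numbers say nothing about the study's actual findings.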

Cite

Liang, L., & Daniels, J. (2022). What Influences Low-cost Sensor Data Calibration? - A Systematic Assessment of Algorithms, Duration, and Predictor Selection. Aerosol and Air Quality Research, 22(9). https://doi.org/10.4209/aaqr.220076
