Abstract
Two data mining methods – a random forest and boosted regression trees – were used to model values of roadside air pollution depending on meteorological conditions and traffic flow, using the example of data obtained in the city of Wrocław in the years 2015–2016. Eight explanatory variables – five continuous and three categorical – were considered in the models. A comparison was made of the quality of the fit of the models to empirical data. Commonly used goodness-of-fit measures did not imply a significant preference for either of the methods. Residual analysis was also performed; this showed boosted regression trees to be a more effective method for predicting typical values in the modelling of NO2, NOx and PM2.5, while the random forest method leads to smaller errors when predicting peaks.
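The distinction the abstract draws between errors on typical values and errors on peaks can be illustrated with a minimal sketch. It is not the paper's procedure: the function name `residual_summary`, the `peak_quantile` parameter and the rule that an observation counts as a peak when it lies at or above that quantile of the observed values are all assumptions made for illustration, and the numbers in the usage section are synthetic rather than measurements from Wrocław.

```python
from statistics import mean


def residual_summary(observed, predicted, peak_quantile=0.9):
    """Mean absolute residual, split into typical and peak observations.

    Assumption: an observation is a "peak" when it is at or above the
    `peak_quantile` position of the sorted observed values. The abstract
    does not state how peaks were defined.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")

    # Threshold taken from the sorted observations (simple order statistic).
    ordered = sorted(observed)
    index = min(len(ordered) - 1, int(peak_quantile * len(ordered)))
    threshold = ordered[index]

    typical, peaks = [], []
    for obs, pred in zip(observed, predicted):
        # Route each absolute residual to the group its observation belongs to.
        (peaks if obs >= threshold else typical).append(abs(obs - pred))

    return {
        "threshold": threshold,
        "typical_mae": mean(typical) if typical else None,
        "peak_mae": mean(peaks) if peaks else None,
    }


if __name__ == "__main__":
    # Synthetic concentrations and two hypothetical sets of model predictions.
    observed = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]
    model_a = [10, 12, 11, 13, 12, 11, 10, 12, 11, 30]
    model_b = [12, 14, 13, 15, 14, 13, 12, 14, 13, 45]
    print("model A:", residual_summary(observed, model_a))
    print("model B:", residual_summary(observed, model_b))
```

Reporting the two groups separately is what lets one model look better on typical values while another looks better on peaks, even when a single overall goodness-of-fit measure shows no clear preference between them.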
Cite
Kamińska, J. A. (2018). Residuals in the modelling of pollution concentration depending on meteorological conditions and traffic flow, employing decision trees. ITM Web of Conferences, 23, 00016. https://doi.org/10.1051/itmconf/20182300016