The Architecture, Engineering and Construction (AEC) sector has a lower adoption rate of machine learning (ML) tools than other industries with similar characteristics. A significant contributing factor is the limited availability of data, as ML techniques rely on large datasets to train algorithms effectively. However, the construction process generates substantial data that characterise a project in detail. This paper therefore presents a data-scraping algorithm that systematically searches construction procurement repositories to build an ML-ready dataset for training ML and natural language processing (NLP) algorithms focused on the procurement phase of construction. The tool automatically scrapes procurement repositories and assembles a dataset of procurement files comprising bills of quantities (BoQs) and project specifications.
CITATION
Jacques de Sousa, L., Poças Martins, J., & Sanhudo, L. (2023). Tackling the Data Sourcing Problem in Construction Procurement Using File-Scraping Algorithms. Engineering Proceedings, 53(1). https://doi.org/10.3390/IOCBD2023-15190