The United States Environmental Protection Agency (EPA) periodically releases Integrated Science Assessments (ISAs) that synthesize the latest research on each of six air pollutants to inform environmental policymaking. To guarantee the best possible coverage of relevant literature, EPA scientists spend months manually screening hundreds of thousands of references to identify a small proportion to be cited in an ISA. The challenge of extreme scale and the pursuit of maximum recall call for effective machine-assisted approaches that reduce the time and effort required by the screening process. This work introduces the ISA literature screening dataset and the associated research challenges to the information and knowledge management community. Our pilot experiments show that combining multiple approaches to tackle this challenge is both promising and necessary. The dataset is available at https://catalog.data.gov/dataset/isa-literature-screening-dataset-v-1.
Hou, J., Wang, X., Dubois, J. J., Rice, R. B., Haddock, A., & Wang, Y. (2022). Extreme Systematic Reviews: A Large Literature Screening Dataset to Support Environmental Policymaking. In International Conference on Information and Knowledge Management, Proceedings (pp. 4029–4033). Association for Computing Machinery. https://doi.org/10.1145/3511808.3557600