Automatically extracted data is rarely “clean” with respect to pragmatic (real-world) constraints, which hinders applications that depend on quality data. We propose a solution for detecting pragmatic constraint violations: a declarative, semantically enabled constraint-violation checker. Working in conjunction with an ensemble of automated information extractors, the implemented prototype checks both hard constraints, which are either satisfied or not, and soft constraints, which are satisfied probabilistically with respect to a threshold. An experimental evaluation shows that the constraint checker identifies semantic errors with high precision and recall and that pragmatic error identification can improve results.
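The distinction between hard and soft constraints can be illustrated with a minimal sketch. It is not the authors' checker, which is declarative and works with an ensemble of extractors. The `Person` record, the two example constraints, the probability values, and the 0.05 threshold are all hypothetical, chosen only to show a constraint that is simply satisfied or not next to one that is judged against a probability threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical record produced by an information extractor."""
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def death_not_before_birth(p: Person) -> bool:
    """Hard constraint (assumed example): either satisfied or not."""
    if p.birth_year is None or p.death_year is None:
        return True  # nothing to check when a date is missing
    return p.death_year >= p.birth_year


def mother_age_probability(age: int) -> float:
    """Made-up plausibility of a mother's age at a child's birth."""
    if 15 <= age <= 45:
        return 0.99
    if 12 <= age <= 55:
        return 0.01
    return 0.0


def mother_age_plausible(mother: Person, child: Person,
                         threshold: float = 0.05) -> bool:
    """Soft constraint (assumed example): satisfied when the
    probability of the observed value meets the threshold."""
    if mother.birth_year is None or child.birth_year is None:
        return True
    age = child.birth_year - mother.birth_year
    return mother_age_probability(age) >= threshold


def check(mother: Person, child: Person) -> List[str]:
    """Collect violations of both kinds for one mother-child pair."""
    violations = []
    for p in (mother, child):
        if not death_not_before_birth(p):
            violations.append(f"hard: {p.name} died before birth")
    if not mother_age_plausible(mother, child):
        violations.append(
            f"soft: {mother.name} has an implausible age "
            f"at the birth of {child.name}"
        )
    return violations
```

In this sketch, lowering `threshold` makes the soft constraint more permissive, whereas the hard constraint has no such setting.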
CITATION
Woodfield, S. N., Lonsdale, D. W., Liddle, S. W., Kim, T. W., Embley, D. W., & Almquist, C. (2016). Pragmatic quality assessment for automatically extracted data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9974 LNCS, pp. 212–220). Springer Verlag. https://doi.org/10.1007/978-3-319-46397-1_16