Abstract
Question answering (QA) software uses information retrieval and natural language processing techniques to automatically answer questions posed by humans in natural language. Like other AI-based software, QA software may contain bugs. To automatically test QA software without human labeling, previous work extracts facts from question-answer pairs and generates new questions to detect QA software bugs. However, the generated questions can be ambiguous, confusing, or syntactically chaotic, making them unanswerable for QA software. As a result, a relatively large proportion of the reported bugs are false positives. In this work, we propose QAQA, a sentence-level mutation-based metamorphic testing technique for QA software. To eliminate false positives and achieve precise automatic testing, QAQA leverages five Metamorphic Relations (MRs) together with semantics-guided search and enhanced test oracles. Our evaluation on three QA datasets demonstrates that QAQA outperforms the state-of-the-art in both the quantity (8,133 vs. 6,601 bugs) and quality (97.67% vs. 49% true positive rate) of the reported bugs. Moreover, the test inputs generated by QAQA reduce the MR violation rate from 44.29% to 20.51% when adopted for fine-tuning the QA software under test.
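The abstract's core idea, a sentence-level metamorphic relation whose oracle compares answers before and after a semantics-preserving mutation, can be sketched as follows. The `qa_model` here is a toy keyword-lookup stand-in (not QAQA's actual subject systems), and the single MR shown is only illustrative of the paper's five.

```python
# Minimal sketch of a sentence-level metamorphic test for QA software.
# qa_model is a hypothetical stand-in for a real QA system; QAQA's actual
# MRs, semantics-guided search, and enhanced oracles are more involved.

def qa_model(context: str, question: str) -> str:
    """Toy QA 'model': answers by keyword lookup against the context."""
    facts = {"capital of france": "Paris", "author of hamlet": "Shakespeare"}
    for key, answer in facts.items():
        if key in question.lower() and answer.lower() in context.lower():
            return answer
    return ""

def mr_equivalent_question(context: str, question: str, mutated: str):
    """MR: a semantics-preserving rewording of the question must not
    change the answer; a mismatch is reported as a suspected bug."""
    a1 = qa_model(context, question)
    a2 = qa_model(context, mutated)
    return a1 == a2, (a1, a2)

context = "Paris is the capital of France."
ok, answers = mr_equivalent_question(
    context,
    "What is the capital of France?",
    "Could you tell me what the capital of France is?",
)
print(ok, answers)
```

A violation of the relation (`ok == False`) flags a candidate bug; QAQA's contribution is generating mutated questions that remain natural and answerable, so such violations are rarely false positives.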
Shen, Q., Chen, J., Zhang, J. M., Wang, H., Liu, S., & Tian, M. (2022). Natural Test Generation for Precise Testing of Question Answering Software. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3551349.3556953