Semantic similarity information supports requirements tracing and helps to reveal important requirements quality defects such as redundancies and inconsistencies. Previous work has applied semantic similarity algorithms to requirements; however, little is known about the performance of machine learning and deep learning models in this context. Therefore, in this work we create the largest dataset to date for analyzing the similarity of requirements, using Amazon Mechanical Turk, a crowd-sourcing marketplace for micro-tasks. Based on this dataset, we investigate and compare different types of algorithms for estimating the semantic similarity of requirements, covering both relatively simple bag-of-words models and machine learning models. In our experiments, a model that averages trained word and character embeddings and an approach based on character sequence occurrences and overlaps achieve the best performance on our requirements dataset.
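The character-sequence approach mentioned above can be illustrated with a minimal sketch. The abstract does not specify the exact features or scoring used in the paper; this example assumes character trigram sets compared with Jaccard overlap, which is one common way to measure character sequence occurrences and overlaps:

```python
def char_ngrams(text, n=3):
    """Collect the set of overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Jaccard overlap of character n-gram sets: 1.0 for identical strings,
    0.0 when the strings share no n-grams."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

# Hypothetical requirements pair for illustration (not from the paper's dataset).
r1 = "The system shall log all failed login attempts."
r2 = "The system must record every failed login attempt."
print(round(ngram_similarity(r1, r2), 2))
```

A character-level measure like this is robust to inflections and minor rewordings ("attempt" vs. "attempts"), which may explain why such approaches compete well with learned models on short requirement texts.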
CITATION STYLE
Femmer, H., Müller, A., & Eder, S. (2020). Semantic Similarities in Natural Language Requirements. In Lecture Notes in Business Information Processing (Vol. 371 LNBIP, pp. 87–105). Springer. https://doi.org/10.1007/978-3-030-35510-4_6