Featured Application: Hierarchical clause annotation can be applied to many downstream natural-language-processing tasks, including abstract meaning representation parsing, semantic dependency parsing, text summarization, argument mining, information extraction, question answering, and machine translation.
Most natural-language-processing (NLP) tasks, such as semantic parsing, syntactic parsing, machine translation, and text summarization, suffer performance degradation when encountering long, complex sentences. Previous work addressed the issue by decomposing complex sentences and linking the resulting simple ones, as in rhetorical-structure-theory (RST)-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), and simple sentence decomposition (SSD). However, these approaches are not applicable to semantic parsing tasks such as abstract meaning representation (AMR) parsing and semantic dependency parsing, because their outputs misalign with semantic relations and cannot preserve the original semantics. Following the same intuition while avoiding these deficiencies, we propose a novel framework, hierarchical clause annotation (HCA), for capturing the clausal structures of complex sentences, based on linguistic research on clause hierarchy. With the HCA framework, we annotated a large HCA corpus to explore the potential of integrating HCA structural features into semantic parsing of complex sentences. Moreover, we decomposed HCA into two subtasks, clause segmentation and clause parsing, and provided neural baseline models for producing additional silver annotations. Evaluated on our manually annotated HCA dataset, the proposed models achieved a 91.3% F1-score on clause segmentation and an 88.5% Parseval score on clause parsing.
Because the same model architectures were employed, the performance differences between the clause and discourse segmentation and parsing subtasks reflect differences between our HCA corpus and the compared discourse corpora: our sentences contained more segment units and fewer interrelations than those in the compared corpora.
Fan, Y., Li, B., Sataer, Y., Gao, M., Shi, C., Cao, S., & Gao, Z. (2023). Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences. Applied Sciences (Switzerland), 13(16). https://doi.org/10.3390/app13169412