A novel approach of augmenting training data for legal text segmentation by leveraging domain knowledge

Rupali Sunil Wagh; Deepa Anand

Conference Proceedings

Advances in Intelligent Systems and Computing (2020) 910 53-63

DOI: 10.1007/978-981-13-6095-4_4

Abstract

In this era of information overload, text segmentation can be used effectively to locate and extract information specific to users' needs within a huge collection of documents. Text segmentation refers to the task of dividing a document into smaller labeled text fragments according to the semantic commonality of their contents. Because legal text is rich in semantic information, text segmentation is crucial for information retrieval in the legal domain. However, such supervised classification requires a large amount of training data to build an efficient classifier, and collecting and manually annotating gold standards in NLP is very expensive. In the recent past, the question of whether gold standards can be satisfactorily replaced with automatically annotated data has attracted growing interest. This work presents two approaches, based entirely on domain knowledge, for the automatic generation of training data that can then be used for segmentation of court judgments.

Cite

Wagh, R. S., & Anand, D. (2020). A novel approach of augmenting training data for legal text segmentation by leveraging domain knowledge. In Advances in Intelligent Systems and Computing (Vol. 910, pp. 53–63). Springer Verlag. https://doi.org/10.1007/978-981-13-6095-4_4
