A novel approach of augmenting training data for legal text segmentation by leveraging domain knowledge

Abstract

In this era of information overload, text segmentation can be used to locate and extract the information a user needs from a large collection of documents. Text segmentation is the task of dividing a document into smaller, labeled text fragments according to the semantic commonality of their contents. Because legal text is rich in semantic information, text segmentation is crucial for information retrieval in the legal domain. However, treating segmentation as supervised classification requires a large amount of training data to build an efficient classifier, and collecting and manually annotating gold standards in NLP is very expensive. In recent years, the question of whether gold standards can be satisfactorily replaced with automatically annotated data has attracted growing interest. This work presents two approaches, based entirely on domain knowledge, for automatically generating training data that can then be used for segmenting court judgments.

Cite

Wagh, R. S., & Anand, D. (2020). A novel approach of augmenting training data for legal text segmentation by leveraging domain knowledge. In Advances in Intelligent Systems and Computing (Vol. 910, pp. 53–63). Springer Verlag. https://doi.org/10.1007/978-981-13-6095-4_4
