Visual saliency and terminology extraction for document classification

Benjamin Duthil; Mickael Coustaty; Vincent Courboulay; Jean-Marc Ogier

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8746 (2014), pp. 96-108

DOI: 10.1007/978-3-662-44854-0_8

Abstract

Document digitization has become a crucial economic issue in our society, and it is therefore necessary to be able to organize the resulting huge volume of documents. This paper proposes a new method to automatically classify documents using a saliency-based segmentation process on the one hand, and terminology extraction and annotation on the other. The saliency-based segmentation is used to extract salient regions, and thereby logos, while the terminology approach is used to annotate them and to automatically classify the document. The approach does not require human expertise and uses Google Images as a knowledge database. The results obtained on a real database of 1766 documents show the relevance of the approach.

Cite

Duthil, B., Coustaty, M., Courboulay, V., & Ogier, J. M. (2014). Visual saliency and terminology extraction for document classification. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8746, 96–108. https://doi.org/10.1007/978-3-662-44854-0_8
