The huge number of documents in digital formats has raised the need for effective content-based retrieval techniques. Since manual indexing is infeasible and subjective, automatic techniques are the obvious solution. In particular, the ability to properly identify and understand a document's structure is crucial in order to focus only on its most significant components. Thus, the quality of the layout analysis outcome affects the subsequent understanding steps. Unfortunately, due to the variety of document styles and formats, the automatically found structure often needs to be manually adjusted. In this work we present a tool based on Markov Logic Networks to infer correction rules to be applied to forthcoming documents. The proposed tool, embedded in a prototypical version of the document processing system DOMINUS, revealed good performance in real-world experiments. © 2011 Springer-Verlag.
CITATION STYLE
Ferilli, S., Basile, T. M. A., & Di Mauro, N. (2011). Markov logic networks for document layout correction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6703 LNAI, pp. 275–284). https://doi.org/10.1007/978-3-642-21822-4_28