This paper presents an approach to normalizing documents in constrained domains. The approach reuses resources developed for controlled document authoring and proceeds in three phases. First, candidate content representations for an input document are built automatically. Then, the content representation that, in the judgment of an expert in the class of documents, best corresponds to the document is identified. Finally, this content representation is used to generate the normalized version of the document. The current version of our prototype system is presented, and its limitations are discussed.
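The three phases can be illustrated with a minimal sketch. It is not the prototype system described in the paper: the toy domain model `DOMAIN`, the template `TEMPLATE`, the keyword-matching candidate builder, and all function names are hypothetical and chosen only to make the pipeline concrete. The expert is modelled as a callback that picks one candidate.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

Representation = Dict[str, str]

# Hypothetical toy domain model: each field maps to its admissible values.
DOMAIN: Dict[str, List[str]] = {
    "route": ["oral", "topical"],
    "frequency": ["daily", "weekly"],
}

# Hypothetical template used to render the normalized text.
TEMPLATE = "Route of administration: {route}. Frequency: {frequency}."


def build_candidates(document: str, domain: Dict[str, List[str]]) -> List[Representation]:
    """Phase 1: build candidate content representations automatically.

    Assumption: a value is a candidate for a field if it occurs in the text.
    """
    text = document.lower()
    options = []
    for values in domain.values():
        found = [v for v in values if v in text]
        # With no textual evidence for a field, keep every admissible value.
        options.append(found or values)
    return [dict(zip(domain, combo)) for combo in product(*options)]


def select(
    candidates: Sequence[Representation],
    choose: Callable[[Sequence[Representation]], int],
) -> Representation:
    """Phase 2: the expert (here a callback) identifies the best candidate."""
    return candidates[choose(candidates)]


def generate(representation: Representation, template: str) -> str:
    """Phase 3: generate the normalized document from the chosen representation."""
    return template.format(**representation)


def normalize(
    document: str,
    choose: Callable[[Sequence[Representation]], int],
    domain: Dict[str, List[str]] = DOMAIN,
    template: str = TEMPLATE,
) -> str:
    """Run the three phases in sequence."""
    candidates = build_candidates(document, domain)
    return generate(select(candidates, choose), template)
```

For example, `normalize("Oral use, daily or weekly.", lambda cands: 1)` builds two candidates that differ only in frequency and renders the second one. In an interactive setting, the callback would present the candidates to a human expert instead of returning a fixed index.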
Max, A. (2004). From controlled document authoring to interactive document normalization. In COLING 2004 - Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220355.1220521