Abstract
This paper discusses information mediation in the context of automatic text summarization. It examines natural language processing (NLP) techniques and analyzes the use of statistical methods for processing Brazilian Portuguese text. It situates text summarization within the field of Information Science, and it proposes and explains a new method of automatic text summarization based on both NLP and statistical methods. Each technique is analyzed and illustrated with examples and, where appropriate, the corresponding mathematical equations are presented. Among the results of the research, we highlight a previously unpublished annotated corpus of approximately half a million words of Brazilian Portuguese, together with the average results of empirical tests of the summarization tool, which indicate a dimensionality reduction on the order of 53% for texts of up to 500 words. Overall, the findings indicate that the results are promising in terms of the ability to reduce texts while preserving their semantic value.
De Souza, O., Tabosa, H. R., De Oliveira, D. M., & De Souza Oliveira, M. H. (2017). Um método de sumarização automática de textos através de dados estatísticos e processamento de linguagem natural. Informacao e Sociedade, 27(3), 307–320. https://doi.org/10.22478/ufpb.1809-4783.2017v27n3.32571