Multilingual summarization task aims to develop summarization systems that are fully or partly language free. Extractive techniques are at the center of such systems. They use statistical features to score and extract most relevant sentences to form a summary within a size limit. In this paper, we investigate recently released multilingual distributed word representations combined with mRMR discriminant analysis to score terms then sentences. We also propose a novel sentence extraction algorithm to deal with redundancy issue. We present experimental results of our system applied to three languages: English, Arabic and French using the TAC MultiLing 2011 Dataset. Our results demonstrate that word representations enhance the summarization system, MeMoG and ROUGE results are comparable to recent state-of-theart systems.
CITATION STYLE
Oufaida, H., Blache, P., & Nouali, O. (2015). Using distributed word representations and mRMR discriminant analysis for multilingual text summarization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9103, pp. 51–63). Springer Verlag. https://doi.org/10.1007/978-3-319-19581-0_4
Mendeley helps you to discover research relevant for your work.