© 2018, Universiti Kebangsaan Malaysia Press. All rights reserved. Headline summarization is an automated text summarization technique that can reduce information overload in retrieval systems and ease the user's cognitive burden when searching for and selecting relevant documents in large collections. This study discusses the process of determining the features of Malay-language headlines in news-genre documents. The methodology starts with an analysis of a corpus of Malay news documents. The corpus contains 140 core news items selected from two mainstream news databases in Malaysia, Berita Harian and Utusan Malaysia. The selection criteria were: items from the core news category, 50 to 250 words in length, published between 2007 and 2012, and drawn from the economic, crime, education and sports genres. Three Malay linguistic experts manually produced a headline summary for each news document. The experts had to comply with three conditions: extractive summarization, the select-words-in-order word selection technique, and morphological changes to words. The experimental results identify three characteristics. First, the first two sentences are the important sentences. Second, the sentence that contains a potential acronym definition is chosen as the most important sentence. Third, the ideal headline summary is six words long. Taking these features into account allows a headline summary to be generated automatically, much like the process carried out by a human.
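The three reported features can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the scoring weights, the naive sentence splitter, and the regular expression used to spot a "potential acronym definition" (a parenthesized all-caps token) are all assumptions made for illustration, and taking the first six words of the chosen sentence is only a crude stand-in for the select-words-in-order technique.

```python
import re

# Assumption: a "potential acronym definition" looks like a parenthesized
# all-caps token, e.g. "Kementerian Pendidikan Malaysia (KPM)".
ACRONYM_DEF = re.compile(r"\(([A-Z]{2,})\)")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def score_sentence(index, sentence):
    """Hypothetical scoring: one point per feature the sentence satisfies."""
    score = 0
    if index < 2:                      # feature 1: first two sentences matter
        score += 1
    if ACRONYM_DEF.search(sentence):   # feature 2: potential acronym definition
        score += 1
    return score


def pick_sentence(text):
    """Return the highest-scoring sentence; ties go to the earlier one."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    best = max(range(len(sentences)),
               key=lambda i: score_sentence(i, sentences[i]))
    return sentences[best]


def headline(text, max_words=6):
    """Feature 3: keep at most six words, preserving their original order."""
    words = re.findall(r"\w+(?:-\w+)*", pick_sentence(text))
    return " ".join(words[:max_words])


if __name__ == "__main__":
    news = ("Hujan lebat melanda ibu kota semalam. "
            "Kementerian Pendidikan Malaysia (KPM) menutup lima sekolah "
            "di Kuala Lumpur hari ini. "
            "Sekolah akan dibuka semula minggu depan.")
    print(headline(news))
```

In the sample text, the second sentence falls within the first two sentences and contains a parenthesized acronym, so it outranks the others and its first six words become the candidate headline. A fuller system would select salient words rather than a prefix and would handle the morphological changes the experts were allowed to apply.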
CITATION STYLE
Mohd Noah, S. A., Mohamad Ali, N., & Hasan, M. S. (2018). Penentuan Fitur bagi Pengekstrakan Tajuk Berita Akhbar Bahasa Melayu (Determining Features of News Headline in Malay News Document). GEMA Online® Journal of Language Studies, 18(2), 154–167. https://doi.org/10.17576/gema-2018-1802-11