Construction of linguistic resources for information extraction of news reports on corporate merger and acquisition

Abstract

Detecting real-time corporate merger and acquisition (M&A) information in publicly available text and feeding it to a decision-making module is essential for a practical e-business management system. Many machine learning algorithms for text categorization and information extraction address this problem, and they require different feature selection methods. Among these, linguistic features are key to accomplishing the task. The acquisition of IBM's PC division by Lenovo in 2004 was chosen as a case study, and news reports of this event were gathered from the Internet to build a corporate M&A mini-corpus. By comparing the M&A corpus with larger general corpora, we constructed a feature word list using Term Frequency and Inverse Document Frequency strategies and augmented it with word groups from thesauri. Typical patterns that highlight an M&A event were then collected by regular expression matching on the words acquired in the previous step. Using the accumulated language resources, the precision and recall for predicting the amount of the M&A are 61.76% and 65.22% for Chinese texts and 84% and 71.43% for English texts, respectively. © 2013 Springer Science+Business Media New York.
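
The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the tokenizer, the exact TF-IDF weighting, the amount pattern, and the names `feature_scores` and `extract_amounts` are all assumptions made for illustration. It handles English only and omits the thesaurus step, and the sample sentences are invented toy data rather than items from the M&A corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens (English only)."""
    return re.findall(r"[a-z]+", text.lower())


def feature_scores(domain_docs, general_docs):
    """Score each domain word by its term frequency in the domain corpus
    times its inverse document frequency over both corpora combined.
    This is one common TF-IDF variant, assumed here for illustration."""
    doc_sets = [set(tokenize(d)) for d in domain_docs + general_docs]
    n_docs = len(doc_sets)
    tf = Counter(w for d in domain_docs for w in tokenize(d))
    scores = {}
    for word, count in tf.items():
        df = sum(1 for s in doc_sets if word in s)  # documents containing word
        scores[word] = count * math.log(n_docs / df)
    return scores


# Hypothetical pattern for a deal amount such as "$2.5 billion".
AMOUNT = re.compile(r"(?:US)?\$\s?(\d+(?:\.\d+)?)\s?(billion|million)",
                    re.IGNORECASE)


def extract_amounts(sentence, trigger_words):
    """Return (value, unit) pairs, but only when the sentence contains
    at least one feature word that signals an M&A event."""
    if not set(tokenize(sentence)) & set(trigger_words):
        return []
    return [(float(m.group(1)), m.group(2).lower())
            for m in AMOUNT.finditer(sentence)]


# Invented toy data, for illustration only.
domain_docs = [
    "Alpha agreed to acquire the Beta unit for $2.5 billion.",
    "The acquisition of Beta by Alpha was announced today.",
]
general_docs = [
    "The weather was sunny today.",
    "The team won the match.",
]

scores = feature_scores(domain_docs, general_docs)
triggers = {"acquire", "acquisition"}
amounts = extract_amounts(domain_docs[0], triggers)
```

In this sketch a word that occurs in every document, such as "the", gets a score of zero, while words confined to the domain documents score higher. Gating the amount pattern on trigger words is one simple way to keep unrelated dollar figures out of the results.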

Cite

Xiong, W. (2013). Construction of linguistic resources for information extraction of news reports on corporate merger and acquisition. In Lecture Notes in Electrical Engineering (Vol. 236 LNEE, pp. 1231–1238). Springer Verlag. https://doi.org/10.1007/978-1-4614-7010-6_137
