Extracting Information from Informal Communication

Jason D M Rennie; Arthur C Smith

Journal Article


Electrical Engineering (2007) I(1999) 93


Abstract

This thesis focuses on the problem of extracting information from informal communication. Textual informal communication, such as e-mail, bulletin boards and blogs, has become a vast information resource. However, such information is poorly organized and difficult for a computer to understand because it lacks editing and structure. Thus, techniques that work well for formal text, such as newspaper articles, may be insufficient for informal text. One focus of ours is to advance the state of the art for sub-problems of the information extraction task. We make contributions to the problems of named entity extraction, co-reference resolution and context tracking. We channel our efforts toward methods that are particularly applicable to informal communication. We also consider a type of information that is somewhat unique to informal communication: preferences and opinions. Individuals often express their opinions on products and services in such communication. Others may read these “reviews” to try to predict their own experiences. However, humans do a poor job of aggregating and generalizing large sets of data. We develop techniques that can perform the job of predicting unobserved opinions. We address both the single-user case, where information about the items is known, and the multi-user case, where we can generalize opinions without external information. Experiments on large-scale rating data sets validate our approach.

Cite

Rennie, J. D. M., & Smith, A. C. (2007). Extracting Information from Informal Communication. Electrical Engineering, I(1999), 93.
