Automatic data extraction from web discussion forums

Abstract

This paper presents an approach for automatically extracting information from web discussion forums. HTML tag paths built from an HTML DOM tree are used to generate the post extraction template. Visual text features and HTML structure information from the same page are then combined to extract the author profile, posting date, and post content automatically. Experimental results show that the approach is effective. © 2009 IEEE.
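
To make the tag-path idea concrete, here is a minimal sketch in Python. It is not the authors' implementation, which the abstract does not describe in detail. It assumes that a tag path is the slash-joined sequence of tag names from the document root to a text-bearing node, and that paths repeated several times on one page are candidates for post fields. The names `TagPathCollector`, `tag_paths`, `repeated_paths`, and `SAMPLE` are hypothetical, and only the standard library is used.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagPathCollector(HTMLParser):
    """Record the root-to-node tag path of every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, root first
        self.paths = []   # (tag path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.paths.append(("/".join(self.stack), text))


def tag_paths(html):
    """Return (tag path, text) pairs for all text nodes in the page."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def repeated_paths(html, min_count=2):
    """Return tag paths that occur at least min_count times (assumed heuristic)."""
    counts = Counter(path for path, _ in tag_paths(html))
    return {path: n for path, n in counts.items() if n >= min_count}


# Hypothetical forum page with two posts.
SAMPLE = """
<html><body>
<h1>Thread title</h1>
<div class="post"><span>alice</span><p>First post</p></div>
<div class="post"><span>bob</span><p>Second post</p></div>
</body></html>
"""

if __name__ == "__main__":
    for path, count in repeated_paths(SAMPLE).items():
        print(path, count)
```

On the sample page, the paths for the author name and the post body each occur twice while the thread title path occurs once, so a frequency threshold separates per-post fields from page-level content. A real extractor would also need the visual text features mentioned in the abstract, which this sketch does not attempt.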

Cite

Li, S., Tang, L., Hu, J., & Chen, Z. (2009). Automatic data extraction from web discussion forums. In 4th International Conference on Frontier of Computer Science and Technology, FCST 2009 (pp. 219–225). https://doi.org/10.1109/FCST.2009.20
