Many people share their daily events and opinions on Twitter. Some of these tweets are useful because they comment on aspects of a user's real life, e.g., eating, traffic conditions, weather, and so on. Since some tweets indicate two or more aspects, multi-label classification is required. Typical methods do not perform well on tweets because tweets consist of short and elided sentences. To overcome these problems, we are researching a hierarchical estimation framework (HEF) to estimate several aspects of unknown tweets. HEF combines unsupervised and supervised machine learning. In the first phase, it extracts topics from a sea of tweets using latent Dirichlet allocation (LDA). In the second phase, it calculates the relevance between topics and aspects using a small set of labeled tweets to build associations among them. In this paper, we introduce the entropy feedback method in the second phase. We evaluate the Shannon entropy of each association between the aspects and topics and iteratively calculate feedback coefficients from the entropy to achieve optimal associations. Our experimental evaluations with a large number of actual tweets demonstrate the high efficiency of our multi-label method. Our entropy feedback method achieved higher F-measures in all aspects. Especially in the Disaster and Traffic aspects, precision greatly increased without decreasing recall.
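As a rough illustration of the entropy evaluation in the second phase, the following Python sketch computes the Shannon entropy of each topic's association with a set of aspects. It is a minimal sketch under stated assumptions, not the authors' implementation: the aspect list, the topic names, the association counts, and the helper `shannon_entropy` are all hypothetical, and the iterative feedback-coefficient update is not reproduced because the abstract does not specify it.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy in bits of non-negative weights, normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for w in weights:
        if w > 0:
            p = w / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical aspects and topic-to-aspect association counts, e.g. how many
# labeled tweets of each aspect were assigned to each LDA topic.
ASPECTS = ["Eating", "Traffic", "Weather", "Disaster"]
associations = {
    "topic_0": [8, 0, 0, 0],  # tied to a single aspect
    "topic_1": [2, 2, 2, 2],  # spread evenly over all aspects
    "topic_2": [0, 3, 0, 3],  # split between two aspects
}

entropies = {
    topic: shannon_entropy(counts) for topic, counts in associations.items()
}

for topic, value in entropies.items():
    print(f"{topic}: entropy = {value:.3f} bits")
```

In this toy setting, a topic tied to one aspect has low entropy and a topic spread evenly across aspects has high entropy, which is the kind of signal a feedback coefficient could plausibly draw on when weighting associations.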
CITATION STYLE
Yamamoto, S., & Satoh, T. (2015). Hierarchical estimation framework of multi-label classifying: A case of tweets classifying into real life aspects. In Proceedings of the 9th International Conference on Web and Social Media, ICWSM 2015 (pp. 523–532). AAAI Press. https://doi.org/10.1609/icwsm.v9i1.14592