Collaborative email-spam filtering with the hashing-trick

Joshua Attenberg; Kilian Weinberger; Anirban Dasgupta; Alex Smola; Martin Zinkevich

Conference Proceedings

6th Conference on Email and Anti-Spam, CEAS 2009 (2009)

38 Citations

80 Readers

Abstract

This paper examines a recently proposed technique for collaborative spam filtering [7] that facilitates personalization with finite-sized memory guarantees. In large-scale, open-membership email systems, most users do not label enough messages for an individual local classifier to be effective, while the data is too noisy to be used for a global filter across all users. Our hybrid global/individual classifier is particularly effective at absorbing the influence of users who label emails very differently from the general public, whether because of strange taste or malicious intent. We can accomplish this while still providing sufficient classifier quality to users with few labeled instances. Our proposed technique can be used with a variety of classifiers and can be implemented in a few lines of code. We verify the efficacy of our proposed technique on a popular web spam benchmark data set.
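The abstract does not spell out the construction, so the following is only a minimal sketch of one common way to build a hybrid global/individual classifier with the hashing trick, not the authors' implementation. It assumes that each token is hashed twice, once as-is (global) and once combined with the user id (personal), into a single fixed-size weight vector. The names `NUM_BUCKETS`, `hashed_features`, and `HashedPerceptron`, the bucket count, the use of MD5, and the perceptron update are all illustrative choices.

```python
import hashlib

NUM_BUCKETS = 2 ** 18  # fixed memory budget; the size is an assumed value


def _bucket_and_sign(feature):
    """Map a feature string to a (bucket index, +1/-1 sign) pair."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "little")
    bucket = value % NUM_BUCKETS                 # low bits choose the bucket
    sign = 1 if (value >> 63) & 1 else -1        # a high bit chooses the sign
    return bucket, sign


def hashed_features(user_id, tokens):
    """Hash every token twice: once globally, once tagged with the user id."""
    x = {}
    for tok in tokens:
        for feat in ("global|" + tok, "user=" + user_id + "|" + tok):
            bucket, sign = _bucket_and_sign(feat)
            x[bucket] = x.get(bucket, 0.0) + sign
    return x


class HashedPerceptron:
    """Perceptron over hashed features; memory stays at NUM_BUCKETS weights."""

    def __init__(self):
        self.w = [0.0] * NUM_BUCKETS

    def score(self, user_id, tokens):
        x = hashed_features(user_id, tokens)
        return sum(self.w[b] * v for b, v in x.items())

    def update(self, user_id, tokens, label):
        """label is +1 for spam and -1 for non-spam; update only on mistakes."""
        x = hashed_features(user_id, tokens)
        margin = label * sum(self.w[b] * v for b, v in x.items())
        if margin <= 0:
            for b, v in x.items():
                self.w[b] += label * v


clf = HashedPerceptron()
clf.update("bob", ["weekly", "newsletter"], +1)    # bob marks it as spam
clf.update("alice", ["weekly", "newsletter"], -1)  # alice marks it as not spam
bob_score = clf.score("bob", ["weekly", "newsletter"])
alice_score = clf.score("alice", ["weekly", "newsletter"])
```

In this sketch the global copies of the tokens carry what all users agree on, while the user-tagged copies let `bob_score` and `alice_score` diverge for the same message. The weight vector never grows with the number of users or tokens, which is the sense in which memory is bounded here.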

Cite

Attenberg, J., Weinberger, K., Dasgupta, A., Smola, A., & Zinkevich, M. (2009). Collaborative email-spam filtering with the hashing-trick. In 6th Conference on Email and Anti-Spam, CEAS 2009. Conference on Email and Anti-Spam, CEAS.