Parabel: Partitioned label trees for extreme classification with application to dynamic search advertising

233Citations
Citations of this article
97Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

This paper develops the Parabel algorithm for extreme multi-label learning where the objective is to learn classifiers that can annotate each data point with the most relevant subset of labels from an extremely large label set. The state-of-the-art 1-vs-All based DiSMEC and PPDSparse algorithms are the most accurate but can take upto months for training and prediction as they learn and apply an independent linear classifier per label. Consequently, they do not scale to large datasets with millions of labels. Parabel addresses both limitations by learning a balanced label hierarchy such that: (a) the 1-vs-All classifiers in the leaf nodes of the label hierarchy can be trained on a small subset of the training set thereby reducing the training time to a few hours on a single core of a standard desktop and (b) novel points can be classified by traversing the learned hierarchy in logarithmic time and applying the 1-vs-All classifiers present in just the leaf thereby reducing the prediction time to a few milliseconds per test point. This allows Parabel to scale to tasks considered infeasible for DiSMEC and PPDSparse such as predicting the subset of 7 million Bing queries that might lead to a click on a given ad-landing page for dynamic search advertising. Experiments on multiple benchmark datasets revealed that Parabel could be almost as accurate as PPDSparse and DiSMEC while being upto 1,000x faster at training and upto 40x-10,000x faster at prediction. Furthermore, Parabel was demonstrated to significantly improve dynamic search advertising on Bing by more than doubling the ad recall and improving the click-through rate by 20%. Source code for Parabel can be downloaded from [1].

Cite

CITATION STYLE

APA

Prabhu, Y., Kag, A., Harsola, S., Agrawal, R., & Varma, M. (2018). Parabel: Partitioned label trees for extreme classification with application to dynamic search advertising. In The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018 (pp. 993–1002). Association for Computing Machinery, Inc. https://doi.org/10.1145/3178876.3185998

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free