Online and stochastic gradient methods for non-decomposable loss functions

Purushottam Kar; Harikrishna Narasimhan; Prateek Jain

Conference Proceedings

Advances in Neural Information Processing Systems (2014) 1(January) 694-702

ISSN: 10495258

48 Citations

75 Readers

Abstract

Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k and F-measure. Compared to point loss functions such as hinge loss, these offer much more fine-grained control over prediction, but at the same time present novel challenges in terms of algorithm design and analysis. In this work we initiate a study of online learning techniques for such non-decomposable loss functions, with the aim of enabling incremental learning as well as designing scalable solvers for batch problems. To this end, we propose an online learning framework for such loss functions. Our model enjoys several nice properties, chief amongst them being the existence of efficient online learning algorithms with sublinear regret and online-to-batch conversion bounds. Our model is a provable extension of existing online learning models for point loss functions. We instantiate two popular losses, Prec@k and pAUC, in our model and prove sublinear regret bounds for both of them. Our proofs require a novel structural lemma over ranked lists which may be of independent interest. We then develop scalable stochastic gradient descent solvers for non-decomposable loss functions. We show that for a large family of loss functions satisfying a certain uniform convergence property (including Prec@k, pAUC, and F-measure), our methods provably converge to the empirical risk minimizer. Such uniform convergence results were not known for these losses and we establish these using novel proof techniques. We then use extensive experimentation on real-life and benchmark datasets to establish that our method can be orders of magnitude faster than a recently proposed cutting plane method.
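To illustrate what "non-decomposable" means here, the following minimal sketch computes the usual precision-at-k measure with NumPy. The function name `prec_at_k` and the toy scores and labels are assumptions made for illustration only; the paper's own loss formulation for Prec@k may differ in detail.

```python
import numpy as np


def prec_at_k(scores, labels, k):
    """Fraction of the k highest-scored points whose label is 1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k largest scores; a stable sort keeps tie-breaking deterministic.
    top_k = np.argsort(-scores, kind="stable")[:k]
    return float(labels[top_k].mean())


labels = [1, 0, 1, 0]

# Top-2 points are indices 0 and 1 (labels 1 and 0).
base = prec_at_k([0.9, 0.8, 0.3, 0.1], labels, k=2)

# Only the score of point 1 changes, yet point 2 now enters the top 2,
# so the contribution of point 2 depends on the scores of other points.
shifted = prec_at_k([0.9, 0.2, 0.3, 0.1], labels, k=2)

print(base, shifted)
```

Because the value depends on the ranking of all points jointly, it cannot be written as a sum of per-point terms the way hinge loss can, which is the difficulty the abstract refers to.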

Cite

Kar, P., Narasimhan, H., & Jain, P. (2014). Online and stochastic gradient methods for non-decomposable loss functions. In Advances in Neural Information Processing Systems (Vol. 1, pp. 694–702). Neural Information Processing Systems Foundation.
