No-regret algorithms for online convex optimization are potent online learning tools and have proven successful in a wide range of applications. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective decisions in each learning iteration into a single, common decision in the form of a convex combination. We show that an agent (or algorithm) that executes this merged decision in each iteration of the online learning process, and each time feeds back a copy of its own reward function to the voters, itself incurs sublinear regret. As a by-product, we obtain a simple method for constructing new no-regret algorithms out of known ones. © 2009 Springer Berlin Heidelberg.
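The merging scheme described in the abstract can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: the voters are projected online gradient ascent learners on the interval [-1, 1], the rewards are the concave quadratics r_t(x) = -(x - c_t)^2 with randomly drawn targets c_t, and only external regret is measured. The names `GradientVoter` and `run` and all parameter values are hypothetical. For concave rewards, Jensen's inequality suggests that the merged decision's regret should not exceed the weighted sum of the voters' regrets, and the sketch computes both quantities so they can be compared.

```python
import random


def clip(x, lo=-1.0, hi=1.0):
    """Project x onto the feasible interval [lo, hi]."""
    return max(lo, min(hi, x))


class GradientVoter:
    """Assumed voter: projected online gradient ascent, step eta / sqrt(t)."""

    def __init__(self, eta, x0=0.0):
        self.eta = eta
        self.x = x0
        self.t = 0

    def decide(self):
        return self.x

    def update(self, reward_grad):
        # The voter evaluates the fed-back reward function at its OWN decision.
        self.t += 1
        step = self.eta / self.t ** 0.5
        self.x = clip(self.x + step * reward_grad(self.x))


def run(T=2000, weights=(0.5, 0.3, 0.2), etas=(0.1, 0.5, 1.0), seed=0):
    """Simulate T rounds; return (master regret, list of voter regrets)."""
    rng = random.Random(seed)
    voters = [GradientVoter(eta) for eta in etas]
    targets = []
    master_total = 0.0
    voter_totals = [0.0] * len(voters)

    for _ in range(T):
        decisions = [v.decide() for v in voters]
        # Merged decision: fixed convex combination of the voters' decisions.
        z = sum(w * x for w, x in zip(weights, decisions))

        c = rng.uniform(-1.0, 1.0)  # assumed reward parameter for this round
        targets.append(c)
        reward = lambda x, c=c: -(x - c) ** 2
        grad = lambda x, c=c: -2.0 * (x - c)

        master_total += reward(z)
        for i, (voter, x) in enumerate(zip(voters, decisions)):
            voter_totals[i] += reward(x)
            voter.update(grad)  # every voter receives the same reward function

    # Best fixed decision in hindsight for these quadratics: the mean target.
    best = clip(sum(targets) / T)
    best_total = sum(-(best - c) ** 2 for c in targets)

    master_regret = best_total - master_total
    voter_regrets = [best_total - total for total in voter_totals]
    return master_regret, voter_regrets


if __name__ == "__main__":
    master_regret, voter_regrets = run()
    print("master regret:", master_regret)
    print("voter regrets:", voter_regrets)
```

The weights stay fixed across all rounds, matching the "fixed convex combinations" in the title of the cited paper. The affine-regret case treated in the paper is not covered by this sketch.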
Calliess, J. P. (2009). On fixed convex combinations of no-regret learners. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5632 LNAI, pp. 494–504). https://doi.org/10.1007/978-3-642-03070-3_37