Scalarized lower upper confidence bound algorithm

Mădălina M. Drugan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 8994 229-235

DOI: 10.1007/978-3-319-19084-6_21

Abstract

Multi-objective evolutionary optimisation algorithms and stochastic multi-armed bandit techniques are combined to design stochastic multi-objective multi-armed bandits (MOMAB) with an efficient trade-off between exploration and exploitation. The lower upper confidence bound (LUCB) algorithm focuses on sampling the arms that are most likely to be misclassified as optimal or suboptimal, in order to identify the set of best arms, also known as the Pareto front. Our scalarized multi-objective LUCB (sMO-LUCB) is an adaptation of LUCB to reward vectors. Preliminary empirical results show good performance of the proposed algorithm on a bi-objective environment.
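As a rough illustration of the idea described in the abstract, the sketch below runs a generic LUCB-style best-arm loop on linearly scalarized reward vectors and collects the winning arms over several weight vectors. It is not the paper's sMO-LUCB: the linear scalarization, the Hoeffding-style confidence radius, the stopping tolerance `eps`, and every function name here are assumptions made for illustration only.

```python
import math
import random


def scalarize(reward, weights):
    """Linear scalarization (assumed): weighted sum of a reward vector."""
    return sum(w * r for w, r in zip(weights, reward))


def radius(n_pulls, t, n_arms, delta):
    """Hoeffding-style confidence radius; one common choice, assumed here."""
    return math.sqrt(math.log(4.0 * n_arms * t * t / delta) / (2.0 * n_pulls))


def lucb_best_arm(pull, n_arms, weights, eps=0.1, delta=0.1, max_rounds=20000):
    """LUCB-style loop on scalarized rewards; returns (best arm, pull counts)."""
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for a in range(n_arms):  # pull every arm once to initialise the estimates
        sums[a] += scalarize(pull(a), weights)
        counts[a] += 1
    best = 0
    for t in range(1, max_rounds + 1):
        means = [s / c for s, c in zip(sums, counts)]
        rad = [radius(c, t, n_arms, delta) for c in counts]
        # Empirically best arm, and the rival with the highest upper bound.
        best = max(range(n_arms), key=lambda a: means[a])
        rival = max((a for a in range(n_arms) if a != best),
                    key=lambda a: means[a] + rad[a])
        # Stop once the rival's upper bound is within eps of best's lower bound.
        if (means[rival] + rad[rival]) - (means[best] - rad[best]) < eps:
            break
        # Otherwise sample the two arms most likely to be misclassified.
        for a in (best, rival):
            sums[a] += scalarize(pull(a), weights)
            counts[a] += 1
    return best, counts


def scalarized_front(pull, n_arms, weight_set, **kwargs):
    """Union of the arms selected under each weight vector (hypothetical helper)."""
    return {lucb_best_arm(pull, n_arms, w, **kwargs)[0] for w in weight_set}


def make_bernoulli_env(mean_vectors, seed=0):
    """Toy bi-objective environment with independent Bernoulli rewards."""
    rng = random.Random(seed)

    def pull(arm):
        return tuple(1.0 if rng.random() < m else 0.0 for m in mean_vectors[arm])

    return pull


if __name__ == "__main__":
    arm_means = [(0.9, 0.1), (0.1, 0.9), (0.2, 0.2)]  # third arm is dominated
    env = make_bernoulli_env(arm_means, seed=1)
    print(scalarized_front(env, len(arm_means), [(1.0, 0.0), (0.0, 1.0)]))
```

In this toy setup the third arm is dominated by neither of the first two in the strict Pareto sense only if its means were higher; with the values chosen it scores below one of the other arms under each weight vector used, so the loop spends its samples on the arm pairs whose confidence intervals still overlap. Note that a finite set of linear weights can only recover arms on the convex part of a Pareto front.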

Cite

Drugan, M. M. (2015). Scalarized lower upper confidence bound algorithm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8994, pp. 229–235). Springer Verlag. https://doi.org/10.1007/978-3-319-19084-6_21
