In evolutionary computation, different reproduction operators have various search dynamics. To strike a well balance between exploration and exploitation, it is attractive to have an adaptive operator selection (AOS) mechanism that automatically chooses the most appropriate operator on the fly according to the current status. This paper proposes a new AOS mechanism for multi-objective evolutionary algorithm based on decomposition (MOEA/D). More specifically, the AOS is formulated as a multi-armed bandit problem where the dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model, originally proposed with an assumption of a fixed award distribution, to a non-stationary setup. In particular, each arm of our bandit learning model represents a reproduction operator and is assigned with a prior reward distribution. The parameters of these reward distributions will be progressively updated according to the performance of its performance collected from the evolutionary process. When generating an offspring, an operator is chosen by sampling from those reward distribution according to the DYTS. Experimental results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with other four state-of-the-art MOEA/D variants.
CITATION STYLE
Sun, L., & Li, K. (2020). Adaptive operator selection based on dynamic thompson sampling for moea/d. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12270 LNCS, pp. 271–284). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58115-2_19
Mendeley helps you to discover research relevant for your work.