Reinforcement learning integration in dynamic power management

Citations: 0
Readers: 5 (Mendeley users who have this article in their library)

Abstract

The policy is the core of system-level Dynamic Power Management (DPM). Traditional policies each have their own shortcomings, and their effectiveness depends heavily on a specific workload model. To address partially observable environments in which the workload model can hardly be predicted, this paper proposes an online-learning policy based on reinforcement learning. The policy builds a workload sequence L from the historical workload and the next-interval workload predicted by a prediction tree. The value of every pair of a sequence and a time-out threshold (L-T) is estimated as a Q-value using an improved Q-learning algorithm. For each sequence, the pair with the smallest Q-value is chosen, and the corresponding threshold is used as the time-out threshold of the DPM system's time-out policy. Experimental results show that the proposed policy reduces power consumption by 3.8% to 31.4% compared with traditional policies while keeping acceptable performance. © 2013 Asian Network for Scientific Information.
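A minimal sketch of how such a Q-learning-driven time-out policy might look, assuming a fixed set of candidate thresholds, a cost signal combining energy and performance penalty, and a discretized workload sequence as the state. The names `choose_threshold`, `update_q`, and the constants below are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict

# Illustrative sketch (not the paper's implementation): Q-learning over
# (workload-sequence, time-out threshold) pairs. Q-values estimate cost,
# so the smallest Q-value is best, matching the abstract's selection rule.

ALPHA = 0.1        # learning rate (assumed)
GAMMA = 0.9        # discount factor (assumed)
EPSILON = 0.1      # exploration rate (assumed)
THRESHOLDS = [10, 50, 100, 500]  # candidate time-out thresholds in ms (assumed)

# Q[(sequence, threshold)] -> estimated cost (energy + performance penalty)
Q = defaultdict(float)

def choose_threshold(sequence):
    """Pick the threshold with the smallest Q-value for this workload
    sequence, with epsilon-greedy exploration."""
    if random.random() < EPSILON:
        return random.choice(THRESHOLDS)
    return min(THRESHOLDS, key=lambda t: Q[(sequence, t)])

def update_q(sequence, threshold, cost, next_sequence):
    """Standard Q-learning update, written for cost minimization."""
    best_next = min(Q[(next_sequence, t)] for t in THRESHOLDS)
    Q[(sequence, threshold)] += ALPHA * (
        cost + GAMMA * best_next - Q[(sequence, threshold)]
    )

# Hypothetical control loop (helper functions are placeholders):
# sequence = observe_recent_workload()          # e.g. discretized idle lengths
# threshold = choose_threshold(sequence)
# cost = run_interval_with_timeout(threshold)   # energy used + latency penalty
# next_sequence = observe_recent_workload()
# update_q(sequence, threshold, cost, next_sequence)
```

The commented loop at the end only indicates where the learned threshold would plug into a conventional time-out DPM controller; the prediction-tree step described in the abstract is not shown here.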

Cite

APA

Liu, F. G., Lin, J. B., Xing, X. Y., Wang, B., & Lin, J. (2013). Reinforcement learning integration in dynamic power management. Journal of Applied Sciences, 13(14), 2682–2687. https://doi.org/10.3923/jas.2013.2682.2687
