Abstract
Policy is the core of system-level Dynamic Power Management (DPM). Traditional policies each have their own shortcomings, and their effectiveness depends heavily on a specific workload model. To address this problem in partially observable environments, where the workload model can hardly be predicted, this paper proposes an online-learning policy based on reinforcement learning. The policy builds a workload sequence L from the historical workload and the next-interval workload predicted by a prediction tree. Each pair of a sequence and a time threshold (L-T) is then assigned a Q-value by an improved Q-learning algorithm. For each sequence, the pair with the smallest Q-value is selected, and its time threshold is used as the timeout threshold of the time-out policy applied to the DPM system. Experimental results show that the proposed policy saves between 3.8% and 31.4% more power than traditional policies while keeping acceptable performance. © 2013 Asian Network for Scientific Information.
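The abstract does not give the exact update rule, but the general scheme it describes can be sketched as follows: keep a Q-value for every (workload-sequence, timeout-threshold) pair and, for the current sequence, select the threshold with the smallest Q-value. In this sketch the sequence encoding, the candidate thresholds, the cost signal, and the learning parameters are illustrative assumptions, not the paper's formulation.

```python
from collections import defaultdict

# Assumed candidate timeout values (seconds) and Q-learning parameters.
THRESHOLDS = [0.5, 1.0, 2.0, 5.0]
ALPHA, GAMMA = 0.1, 0.9

# Q[(sequence, threshold)]: estimated cost of using 'threshold' when the
# observed/predicted workload sequence is 'sequence'; lower is better.
q = defaultdict(float)

def choose_threshold(sequence):
    """Return the timeout threshold with the smallest Q-value for this sequence."""
    return min(THRESHOLDS, key=lambda t: q[(sequence, t)])

def update(sequence, threshold, cost, next_sequence):
    """One Q-learning step; 'cost' is an assumed mix of energy and latency penalties."""
    best_next = min(q[(next_sequence, t)] for t in THRESHOLDS)
    q[(sequence, threshold)] += ALPHA * (cost + GAMMA * best_next
                                         - q[(sequence, threshold)])

# Example: a sequence is a tuple of discretized recent workloads, where the last
# element stands in for the next-interval workload predicted by the prediction tree.
seq = (2, 0, 1)
t = choose_threshold(seq)          # timeout handed to the time-out policy
update(seq, t, cost=0.7, next_sequence=(0, 1, 3))
```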
Liu, F. G., Lin, J. B., Xing, X. Y., Wang, B., & Lin, J. (2013). Reinforcement learning integration in dynamic power management. Journal of Applied Sciences, 13(14), 2682–2687. https://doi.org/10.3923/jas.2013.2682.2687