Dynamic Coordination of Energy and Hops in WSNs Using Reinforcement Learning Routing Algorithm

  • Li J
  • Wei H

Abstract

In wireless sensor networks, existing reinforcement learning routing algorithms usually optimize a single goal, require a complex route-establishment process, and incur control overhead during data forwarding. In this paper, we present a dynamic adaptive routing algorithm with feedback-learning ability that balances the energy of the wireless sensor network, reduces the number of routing hops, and lowers the complexity of route establishment. The algorithm uses local routing information and feedback to learn the state of neighboring nodes; routing reward values are obtained by a weighted calculation over energy and hop-count information, and the optimal routing strategy is obtained by updating the Q-values in the routing table.

Introduction

Wireless sensor networks (WSNs) are self-organized networks composed of large numbers of sensor nodes. They have broad prospects for development in fields such as the military, industry, and agriculture [1, 2]. Because WSNs are often limited in energy supply and communication ability, routing algorithm design is challenging. Efficient use of energy and reduction of the hop count of data transmission are important goals in WSN routing algorithm design.

Common WSN routing protocols can be classified into four categories. Representative of the first is Flooding [3]: its forwarding rules are simple and easy to implement, and it avoids the computational cost of maintaining network topology information or running a complex route-discovery algorithm, but it suffers from problems such as information explosion. A typical example of the second category, hierarchical routing, is Low-Energy Adaptive Clustering Hierarchy (LEACH) [4]. It controls the network topology by electing cluster-head nodes, which are responsible for data fusion to reduce data traffic [5]; however, cluster grouping brings extra overhead and coverage problems.
The third class is query-based, data-centric routing; its classic representative is Directed Diffusion (DD) [6], in which nodes communicate only with their neighbors, without global topological information at each node, reducing data traffic and energy consumption. However, building its gradients is costly, and its naming mechanism limits its range of application. The fourth class is routing based on location information; a typical representative is GEAR [7], which uses geographic information to forward data toward the destination node according to a corresponding strategy, reducing routing overhead. However, node localization requires GPS, which increases cost.

Learning algorithms have distributed, autonomous behavior and adaptability to environmental change, making them well suited to wireless sensor networks [8]. The literature [9] proposed a distributed reinforcement learning algorithm (DIRL) that enables sensor nodes to learn autonomously, reducing the energy consumed in finding routing paths. Egorova-Forster [10] presented a feedback-based learning algorithm that mainly uses feedback information to collect neighbor nodes' routing information when a sink node announces itself; it delivers data to multiple sink nodes effectively and avoids sending extra data copies, but it may yield longer data-transmission paths. To maintain energy balance among network nodes, [11] proposed a routing algorithm based on reinforcement learning (RL) that adjusts the data-transmission path dynamically to avoid depleting the energy of any single node, keeping the energy of the whole network in equilibrium and prolonging the network lifetime.
Combining the Q-learning algorithm [12] with a feedback-based routing strategy, this paper proposes an Energy Consumption Balance and Hops-Less Adaptation routing algorithm (ECBHLA). It keeps the energy of the WSN in equilibrium and reduces the number of data-transmission hops, improving overall network performance and enhancing the robustness of the network. The algorithm has very low routing overhead and is completely distributed.

Routing Algorithm with Feedback Based on Reinforcement Learning

A common WSN routing algorithm collects data from a source node and forwards it toward a sink node. At each step of data transmission, one neighbor node is chosen to forward the data; these choices of neighbor nodes determine the routing path. Fig. 1 shows a routing path from the source node to the sink node. Finding a routing path in a WSN with a reinforcement learning algorithm is similar to a Markov decision process (MDP), which includes:
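The per-node learning described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the class and parameter names (`w_energy`, `w_hops`, `alpha`, `gamma`) and the linear reward form are assumptions chosen to show how a weighted energy/hop reward feeds a Q-table update and next-hop selection.

```python
import random

class NodeRoutingTable:
    """Hypothetical sketch of an ECBHLA-style routing table at one node."""

    def __init__(self, neighbors, alpha=0.5, gamma=0.9,
                 w_energy=0.6, w_hops=0.4):
        self.q = {n: 0.0 for n in neighbors}  # one Q-value per neighbor
        self.alpha = alpha          # learning rate
        self.gamma = gamma          # discount factor
        self.w_energy = w_energy    # weight on residual energy
        self.w_hops = w_hops        # weight on hop count

    def reward(self, energy_ratio, hops_to_sink):
        # Weighted reward: favor neighbors with more residual energy
        # (energy_ratio in [0, 1]) and fewer hops to the sink.
        return self.w_energy * energy_ratio - self.w_hops * hops_to_sink

    def update(self, neighbor, energy_ratio, hops_to_sink, neighbor_best_q):
        # Standard Q-learning update, driven by feedback piggybacked from
        # the chosen neighbor (its residual energy, hop count to the sink,
        # and its own best Q-value).
        r = self.reward(energy_ratio, hops_to_sink)
        old = self.q[neighbor]
        self.q[neighbor] = old + self.alpha * (
            r + self.gamma * neighbor_best_q - old)

    def choose_next_hop(self, epsilon=0.1):
        # Epsilon-greedy selection: usually exploit the best Q-value,
        # occasionally explore another neighbor.
        if random.random() < epsilon:
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)

# Toy usage: neighbor B has more energy but is farther from the sink.
table = NodeRoutingTable(["B", "C"])
table.update("B", energy_ratio=0.9, hops_to_sink=2, neighbor_best_q=0.0)
table.update("C", energy_ratio=0.4, hops_to_sink=1, neighbor_best_q=0.0)
print(table.choose_next_hop(epsilon=0.0))  # -> C (fewer hops wins here)
```

With these illustrative weights, the hop-count term dominates and the shorter path through C is preferred; raising `w_energy` relative to `w_hops` would shift traffic toward the energy-rich neighbor B, which is the trade-off the weighted reward is meant to expose.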

Citation (APA)

Li, J., & Wei, H. (2015). Dynamic Coordination of Energy and Hops in WSNs Using Reinforcement Learning Routing Algorithm. In Proceedings of the First International Conference on Information Sciences, Machinery, Materials and Energy (Vol. 126). Atlantis Press. https://doi.org/10.2991/icismme-15.2015.289
