Abstract: Advances in technology, along with reductions in the size of processors, memory, and wireless antennas, have facilitated the construction of low-cost, low-power, multifunctional sensor nodes, which in turn has led to high demand for the development of Wireless Sensor Networks (WSNs). A great deal of research has been done on the development of routing protocols for WSNs. This paper provides a brief overview of routing protocols for WSNs that use the Reinforcement Learning approach.

Keywords: Reinforcement Learning (RL), Wireless Sensor Network (WSN), Routing Protocols.

I. INTRODUCTION

Micro-Electro-Mechanical Systems (MEMS) technology has drawn worldwide attention to the development of WSNs and the various application areas they can cover. A wireless sensor network [1] consists of a collection of autonomous sensors, densely deployed in a structured or unstructured manner, that monitor a physical environment, gather the relevant information, and cooperatively transfer it through the network to the sink via gateway nodes. Dense deployment helps to obtain accurate data and at the same time achieve speed, flexibility, and reliability in the given environment. Research on sensor networks started at DARPA with the Distributed Sensor Networks (DSN) program around 1980, while the beginning of WSN research as such is marked by the 1998 SmartDust project. These networks have found a major application in the field of Optimal Control Systems. They are now also used in applications related to monitoring and tracking, such as area monitoring, greenhouse monitoring, structural monitoring, and passive localization and tracking. The main features of wireless sensor networks are embedded routers, dense connectivity, resource-constrained nodes, asymmetric links, dynamic topology, a broadcast communication paradigm, heterogeneity of nodes, the ability to withstand harsh environments, autonomous operation, infrastructure-less and self-operable deployment, and multi-hop routing.
The main challenges faced by wireless sensor networks are frequent topology changes, limited battery power, limited capacity, limited memory, proneness to failure, the absence of global IDs, and the need for adaptation. The Reinforcement Learning approach [2] mainly addresses two problems, namely prediction and control. It is now widely used by routing protocols at the network layer to handle resource constraints. The most widely used RL algorithm is Q-Learning. The major advantage of RL-based routing is that a node does not need global information about the network, yet it can still approximate global optimality. RL techniques focus mainly on finding an optimal path and maximizing the residual energy of the network, which helps to prolong the network's lifetime and improve its efficiency. The routing approach used in a network depends on the aspect being given importance or on the QoS required for a particular application area. The remaining sections provide a brief overview of the routing protocols for WSNs based on the Reinforcement Learning approach.

II. REINFORCEMENT LEARNING AND WIRELESS SENSOR NETWORKS

Routing can be defined as efficiently transmitting data over the network while simultaneously considering other factors such as energy consumption, cost, quality of service, and network lifetime. The four important factors to be considered are energy cost, robustness, throughput, and delay. Most RL-based protocols find optimal paths and prolong network lifetime either by being energy-aware (using a balancer or saver approach) or by increasing residual energy uniformly across the network. Routing strategies can be grouped into structured and structure-less. Some greedy approaches are GPSR, CADR, etc., and some search-based approaches are GEAR, Q-Routing [3], etc. Adaptive Tree Protocol (ATP) lies between structured and structure-less.
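To make the Q-Learning-based routing mentioned above more concrete, the following is a minimal sketch in the spirit of Q-Routing [3], not the implementation of any surveyed protocol. Each node keeps only local estimates of the cost to reach the sink through each neighbor and refines them from the estimate reported by the chosen neighbor. The class name `QRoutingNode`, the helper `route_packet`, the four-node topology, the link costs, and the learning rate of 0.5 are all illustrative assumptions.

```python
from collections import defaultdict


class QRoutingNode:
    """A sensor node that stores only local cost estimates (hypothetical class)."""

    def __init__(self, node_id, neighbors, alpha=0.5):
        self.node_id = node_id
        self.neighbors = list(neighbors)
        self.alpha = alpha  # learning rate (assumed value)
        # q[dest][neighbor] = estimated cost of reaching dest via that neighbor.
        # Estimates start at 0, so untried neighbors look attractive at first.
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Lowest estimated cost from this node to dest (0 at dest itself)."""
        if dest == self.node_id:
            return 0.0
        return min(self.q[dest].values())

    def choose_next_hop(self, dest):
        """Greedy choice: the neighbor with the lowest current estimate."""
        table = self.q[dest]
        return min(table, key=table.get)

    def update(self, dest, neighbor, link_cost, neighbor_estimate):
        """Move the estimate toward link cost plus the neighbor's own estimate."""
        old = self.q[dest][neighbor]
        target = link_cost + neighbor_estimate
        self.q[dest][neighbor] = old + self.alpha * (target - old)


def route_packet(nodes, link_cost, source, sink, max_hops=20):
    """Forward one packet hop by hop, learning along the way."""
    path = [source]
    current = source
    while current != sink and len(path) <= max_hops:
        node = nodes[current]
        nxt = node.choose_next_hop(sink)
        node.update(sink, nxt, link_cost[(current, nxt)],
                    nodes[nxt].best_estimate(sink))
        path.append(nxt)
        current = nxt
    return path


# Toy topology (assumed): A reaches sink S via B (cheap) or via C (expensive).
edges = {("A", "B"): 1.0, ("B", "S"): 1.0, ("A", "C"): 1.0, ("C", "S"): 5.0}
link_cost = {}
for (u, v), cost in edges.items():
    link_cost[(u, v)] = cost
    link_cost[(v, u)] = cost

nodes = {
    "A": QRoutingNode("A", ["B", "C"]),
    "B": QRoutingNode("B", ["A", "S"]),
    "C": QRoutingNode("C", ["A", "S"]),
    "S": QRoutingNode("S", ["B", "C"]),
}

for _ in range(50):  # send training packets from A to the sink
    route_packet(nodes, link_cost, "A", "S")

learned_path = route_packet(nodes, link_cost, "A", "S")
print(learned_path)
```

No node in this sketch ever sees the whole topology; each update uses only a link cost and one number reported by a neighbor, which is the property of RL-based routing highlighted in the introduction. The zero initial estimates act as a simple exploration mechanism in this toy setting; a real protocol would need an explicit exploration strategy and a cost signal tied to delay or energy.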
The routing strategies can also be grouped according to the aspect being focused on, such as location, feedback, energy-awareness, fault-tolerance, or cost-effectiveness. Thus, on the basis of adaptiveness, routing mechanism, coordinating agent, and aspect being focused on, the routing protocols or approaches can be categorized as follows:

A. On the basis of Adaptivity

Adaptivity is the most desired feature for a dynamic environment. The dynamic behavior can be due to dynamic topologies, application requirements, power requirements, and traffic patterns. Adaptivity can be obtained through the use of:

1) Adaptive Routing Schemes: These include adaptive routing [4, 5], which characterizes route paths by their sinks or destinations and by the changes they undergo under dynamic network conditions. Adaptive routing (AdaR) [6], Adaptive Tree Protocol (ATP) [7], etc. come under such schemes.

2) Cross-Layer protocols: These protocols help to achieve the competing goals of energy efficiency and flexibility, through specialization in a cross-layer protocol design and through modularity in a layered protocol design, respectively. The main logic behind cross-layer design (CLD) is to use information from multiple layers to jointly optimize the performance of those layers. XLM (unified), MAC-CROSS [8], CLEEP [9], RL-MAC [10] [11], DReL, DIRL [12], etc. are some of the cross-layer protocols that use the reinforcement learning concept.

3) Learning automata schemes: This scheme can be used to model learning systems [13], and it does not require information about the environment in which it operates. It helps sensor nodes equipped with intelligent automata to learn and adapt as the environment changes, becoming more intelligent over time. Some of the LA-based protocols are AEESPAN [14], SARA [15] [16], FEAR [17], etc.

B. On the basis of routing mechanisms

On the basis of routing mechanisms, protocols can be classified as structure-less and structure-based.
Structure-based mechanisms use a data structure such as a routing table to store information, updated periodically or on demand, that can later be used to make routing decisions. These mechanisms are suitable for stable networks. Real-time search approaches (such as ant routing and temporal-difference (TD) methods), flooding-based mechanisms, and greedy approaches are some structure-less mechanisms for dynamic environments. Adaptive Tree Protocol (ATP) [7] comes under both structured and structure-less mechanisms.

C. On the basis of coordinating agent

To get a more accurate model for a large number of sensor nodes, a multi-agent system approach can be adopted. The two kinds of coordination-based RL approaches are:

1) Single-agent reinforcement learning (SARL): It is suitable for centralized networks. Example: COORD.

2) Multi-agent reinforcement learning (MARL): It is suitable for distributed networks. MARL approaches can be further categorized, in terms of the number of hops involved in payoff message propagation, as single-hop coordination-based MARL and multiple-hop coordination-based MARL.

D. On the basis of Aspect being focused

Energy-aware: Energy-aware routing is used to minimize energy waste and maximize network lifetime by utilizing resources uniformly. Packets are routed such that energy consumption is distributed uniformly around a forwarding node. It is used for handling optimization problems. The factors affecting energy consumption, such as routing path length, link reliability, aggregation, and load balance, are discussed in [18]. Energy-aware protocols can be further categorized as energy-saver and energy-balancer. RLGR [19], DRLR [1], etc. are some examples of energy-aware protocols.
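The difference between the energy-saver and energy-balancer categories can be illustrated with a small sketch. The text above does not define the two categories, so the reading used here is an assumption: a saver picks the neighbor that is cheapest to transmit to, while a balancer picks the neighbor with the most residual energy so that the load is spread around the forwarding node. The `Neighbor` type, both policy functions, the energy values, and the per-packet relay cost are hypothetical and are not taken from RLGR, DRLR, or any other surveyed protocol.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """A candidate next hop as seen by the forwarding node (hypothetical type)."""
    name: str
    residual_energy: float  # remaining energy, arbitrary units (assumed)
    tx_cost: float          # energy to send one packet to this neighbor (assumed)


def pick_energy_saver(neighbors):
    """Assumed 'saver' policy: always use the cheapest link."""
    return min(neighbors, key=lambda n: n.tx_cost)


def pick_energy_balancer(neighbors):
    """Assumed 'balancer' policy: use the neighbor with the most energy left."""
    return max(neighbors, key=lambda n: n.residual_energy)


def simulate(policy, packets=6, relay_cost=1.0):
    """Forward `packets` packets and return each neighbor's remaining energy."""
    neighbors = [
        Neighbor("n1", 10.0, 1.0),
        Neighbor("n2", 10.0, 1.2),
        Neighbor("n3", 10.0, 1.5),
    ]
    for _ in range(packets):
        chosen = policy(neighbors)
        chosen.residual_energy -= relay_cost  # relaying drains the chosen node
    return {n.name: n.residual_energy for n in neighbors}


saver_result = simulate(pick_energy_saver)
balancer_result = simulate(pick_energy_balancer)
print("saver:   ", saver_result)
print("balancer:", balancer_result)
```

Under these assumptions the saver drains a single relay while the balancer leaves all three neighbors with equal energy, which is the uniform distribution of consumption around a forwarding node described above. An RL-based protocol would learn such a trade-off from feedback rather than apply a fixed rule.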
Arya, A. (2018). Reinforcement Learning based Routing Protocols in WSNs: A Survey. International Journal for Research in Applied Science and Engineering Technology, 6(4), 3523–3529. https://doi.org/10.22214/ijraset.2018.4584