A Multi-Depot Dynamic Vehicle Routing Problem with Stochastic Road Capacity: An MDP Model and Dynamic Policy for Post-Decision State Rollout Algorithm in Reinforcement Learning

9Citations
Citations of this article
27Readers
Mendeley users who have this article in their library.

Abstract

In the event of a disaster, the road network is often compromised in terms of its capacity and usability conditions. This is a challenge for humanitarian operations in the context of delivering critical medical supplies. To optimise vehicle routing for such a problem, a Multi-Depot Dynamic Vehicle-Routing Problem with Stochastic Road Capacity (MDDVRPSRC) is formulated as a Markov Decision Processes (MDP) model. An Approximate Dynamic Programming (ADP) solution method is adopted where the Post-Decision State Rollout Algorithm (PDS-RA) is applied as the lookahead approach. To perform the rollout effectively for the problem, the PDS-RA is executed for all vehicles assigned for the problem. Then, at the end, a decision is made by the agent. Five types of constructive base heuristics are proposed for the PDS-RA. First, the Teach Base Insertion Heuristic (TBIH-1) is proposed to study the partial random construction approach for the non-obvious decision. The heuristic is extended by proposing TBIH-2 and TBIH-3 to show how Sequential Insertion Heuristic (SIH) (I1) as well as Clarke and Wright (CW) could be executed, respectively, in a dynamic setting as a modification to the TBIH-1. Additionally, another two heuristics: TBIH-4 and TBIH-5 (TBIH-1 with the addition of Dynamic Lookahead SIH (DLASIH) and Dynamic Lookahead CW (DLACW) respectively) are proposed to improve the on-the-go constructed decision rule (dynamic policy on the go) in the lookahead simulations. The results obtained are compared with the matheuristic approach from previous work based on PDS-RA.

Cite

CITATION STYLE

APA

Anuar, W. K., Lee, L. S., Seow, H. V., & Pickl, S. (2022). A Multi-Depot Dynamic Vehicle Routing Problem with Stochastic Road Capacity: An MDP Model and Dynamic Policy for Post-Decision State Rollout Algorithm in Reinforcement Learning. Mathematics, 10(15). https://doi.org/10.3390/math10152699

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free