Customer lifetime value (CLV) is among the most reliable indicators in direct marketing for measuring customer profitability. This has motivated researchers to build models that maximize CLV and, consequently, strengthen the relationship between the firm and its customers. This review paper analyzes the contributions of dynamic programming models applied in direct marketing to maximize CLV. It starts by reviewing the basic models that focus on calculating, measuring, simulating, optimizing, or (rarely) maximizing CLV. It then highlights the dynamic programming models, including the Markov decision process (MDP), approximate dynamic programming (ADP), also called reinforcement learning (RL), deep RL, and double deep RL. Although MDP has contributed significantly to maximizing CLV, it has many limitations that encouraged researchers to adopt ADP (i.e., RL) and, more recently, deep reinforcement learning (i.e., deep Q-networks). These algorithms overcome the limitations of MDP and can solve complex problems without suffering from the curse of dimensionality; however, they still have limitations of their own, notably overestimating action values. This was the main motivation behind proposing double deep Q-networks (DDQN). Meanwhile, neither DDQN nor the algorithms that later outperformed it and overcame its limitations have been applied in direct marketing, which leaves room for future research.
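To make the overestimation point concrete, the following is a minimal sketch (not from the reviewed paper; function names, the Q-value arrays, and the reward/discount values are illustrative assumptions) contrasting the standard deep Q-network target, which both selects and evaluates the next action with the same target network, against the double DQN target, which decouples action selection (online network) from action evaluation (target network):

```python
import numpy as np

def dqn_target(q_online, q_target, reward, gamma):
    # Standard DQN: the target network's own maximum is used,
    # so noise in the estimates biases the target upward
    # (the overestimation problem noted in the review).
    return reward + gamma * np.max(q_target)

def double_dqn_target(q_online, q_target, reward, gamma):
    # Double DQN: the online network selects the greedy action,
    # and the target network evaluates that action, which
    # reduces the upward bias.
    best_action = int(np.argmax(q_online))
    return reward + gamma * q_target[best_action]

# Hypothetical Q-value estimates for three actions in the next state
q_online = np.array([1.0, 2.0, 0.5])
q_target = np.array([0.8, 1.5, 2.5])

print(dqn_target(q_online, q_target, reward=1.0, gamma=0.9))        # 3.25
print(double_dqn_target(q_online, q_target, reward=1.0, gamma=0.9)) # 2.35
```

When the two networks disagree about which action is best, as in the toy values above, the double DQN target is lower, illustrating how the decoupling tempers overestimated action values.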
Citation: AboElHamd, E., Shamma, H. M., & Saleh, M. (2020). Dynamic programming models for maximizing customer lifetime value: An overview. In Advances in Intelligent Systems and Computing (Vol. 1037, pp. 419–445). Springer Verlag. https://doi.org/10.1007/978-3-030-29516-5_34