Request latency is a critical metric in determining the usability of web services. The latency of a request comprises service time, the time during which the request is actively being serviced, and waiting time, the time during which the request waits to be served. Most existing work aims to reduce request latency by reducing the mean service time, that is, by shortening the critical path. In this paper, we explore an alternative approach to reducing latency: using variability as a guiding principle when designing web services. By tracking the service time variability of a request as it traverses the software layers in the user and kernel space of the web server, we identify the most critical stages of request processing. We then determine control knobs in the OS and application, such as thread scheduling and request batching, that regulate the variability in these stages, and demonstrate that tuning these specific knobs can significantly improve end-to-end request latency. Our experimental results with Memcached and the Apache web server under different request rates, including real-world traces, show that this alternative approach can reduce mean and tail latency by 30-50%.
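The abstract's per-stage variability tracking can be illustrated with a minimal sketch. The stage names, timestamps, and the use of the coefficient of variation (stddev/mean) as the variability metric here are all illustrative assumptions, not the paper's actual instrumentation or methodology:

```python
import statistics

# Hypothetical per-request timestamps (microseconds) at stage boundaries;
# the stage names below are illustrative, not the paper's actual
# instrumentation points.
requests = [
    {"recv": 0, "kernel_done": 40, "app_done": 240, "send_done": 260},
    {"recv": 0, "kernel_done": 40, "app_done": 480, "send_done": 500},
    {"recv": 0, "kernel_done": 42, "app_done": 250, "send_done": 300},
    {"recv": 0, "kernel_done": 41, "app_done": 900, "send_done": 930},
]

# Each stage is delimited by a (start, end) pair of timestamp fields.
stages = [
    ("kernel", "recv", "kernel_done"),
    ("application", "kernel_done", "app_done"),
    ("send", "app_done", "send_done"),
]

def stage_cv(durations):
    """Coefficient of variation (stddev / mean) of per-stage durations."""
    return statistics.pstdev(durations) / statistics.mean(durations)

# Rank stages by variability: the stage with the highest CV is the
# candidate target for variability-reducing control knobs.
for name, start, end in stages:
    durations = [r[end] - r[start] for r in requests]
    print(f"{name}: CV = {stage_cv(durations):.2f}")
```

With these sample timestamps, the application stage shows by far the highest coefficient of variation, so it would be the stage where knobs such as thread scheduling or request batching are tuned first.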
Citation
Suresh, A., & Gandhi, A. (2019). Using variability as a guiding principle to reduce latency in web applications via OS profiling. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019 (pp. 1759–1770). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308558.3313406