Abstract
Identifying the groups of clients that are responsible for a significant portion of a Web site's requests can be helpful to both the Web site and the clients. In a Web application, it is beneficial to move content closer to groups of clients that are responsible for large subsets of requests to an origin server. We introduce clusters: groupings of clients that are close together topologically and likely to be under common administrative control. We identify clusters using a "network-aware" method, based on information available from BGP routing table snapshots. Experimental results show that our entirely automated approach is able to identify clusters for 99.9% of the clients in a wide variety of Web server logs. Sampled validation results show that the identified clusters meet the proposed validation tests in over 90% of the cases. An efficient self-corrective mechanism increases the applicability and accuracy of our initial approach and makes it adaptive to network dynamics. In addition to detecting unusual access patterns made by spiders and (suspected) proxies, our proposed method is useful for content distribution and proxy positioning, and is applicable to other problems such as server replication and network management.
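The abstract does not spell out how BGP information is turned into clusters. As a rough illustration only, the sketch below assumes that each client IP address is assigned to the most specific (longest) matching network prefix and that clients sharing a prefix form one cluster. The prefix list, the client addresses, and the function names `longest_prefix` and `cluster_clients` are hypothetical and are not taken from the paper.

```python
import ipaddress
from collections import defaultdict

# Hypothetical prefixes standing in for entries from a BGP routing table snapshot.
PREFIXES = ["12.0.0.0/8", "12.65.128.0/19", "192.0.2.0/24"]


def longest_prefix(ip, networks):
    """Return the most specific network containing ip, or None if none match."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in networks if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)


def cluster_clients(client_ips, prefixes):
    """Group client IPs by their longest matching prefix (assumed clustering rule)."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    clusters = defaultdict(list)
    for ip in client_ips:
        net = longest_prefix(ip, networks)
        # Clients with no matching prefix are kept apart rather than guessed at.
        key = str(net) if net is not None else "unclustered"
        clusters[key].append(ip)
    return dict(clusters)


# Hypothetical client addresses, as might be extracted from a Web server log.
clients = ["12.65.140.7", "12.65.150.9", "12.1.2.3", "192.0.2.44", "203.0.113.5"]
clusters = cluster_clients(clients, PREFIXES)
for prefix, members in clusters.items():
    print(prefix, members)
```

The linear scan over prefixes keeps the sketch short; a real routing table with many thousands of entries would more plausibly use a trie or similar structure for the lookup.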
Cite
Krishnamurthy, B., & Wang, J. (2000). On network-aware clustering of Web clients. Computer Communication Review, 30(4), 97–110. https://doi.org/10.1145/347057.347412