Tracking users within and across websites is the basis for profiling their interests, demographic types, and other information that can be monetised through targeted advertising and big data analytics. The advent of HTTPS was supposed to make profiling harder for anyone beyond the communicating end-points. In this paper we examine to what extent this holds. We first show that by knowing the domain that a user visits, either through the Server Name Indication of the TLS protocol or through DNS, an eavesdropper can already derive basic profiling information, especially for domains whose content is homogeneous. For domains whose content spans a variety of categories depending on the particular page that a user visits, e.g., news portals and e-commerce sites, this basic profiling technique fails. Still, accurate profiling remains possible through traffic fingerprinting, which uses network traffic signatures to infer the exact page that a user is browsing, even under HTTPS. We demonstrate that transport-layer fingerprinting remains robust and scalable despite hurdles such as caching and dynamic content for different device types. Overall, our results indicate that although HTTPS makes profiling more difficult, it does not eradicate it by any means.
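The domain-level profiling described above can be illustrated with a minimal sketch. It assumes the eavesdropper has already collected hostnames (for example from SNI or DNS) and holds a domain-to-category table. The table contents, the `.example` domains, and the function names below are hypothetical and are not taken from the paper. Domains whose content is heterogeneous are mapped to `None`, which is where this basic technique yields no information.

```python
from collections import Counter

# Hypothetical domain-to-category table (an assumption for illustration).
# None marks domains whose content is heterogeneous, such as news portals
# or e-commerce sites, where the domain alone reveals no single category.
DOMAIN_CATEGORIES = {
    "sports.example": "sports",
    "health.example": "health",
    "news.example": None,
    "shop.example": None,
}


def registered_domain(hostname):
    """Naively reduce a hostname to its last two labels.

    Assumption: this ignores multi-label public suffixes such as co.uk.
    """
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def build_profile(observed_hostnames):
    """Count categories for observed hostnames.

    Returns (profile, unresolved), where unresolved counts visits to
    unknown or heterogeneous domains that the domain name cannot classify.
    """
    profile = Counter()
    unresolved = 0
    for host in observed_hostnames:
        category = DOMAIN_CATEGORIES.get(registered_domain(host))
        if category is None:
            unresolved += 1
        else:
            profile[category] += 1
    return profile, unresolved


if __name__ == "__main__":
    hosts = ["www.sports.example", "sports.example", "m.health.example",
             "www.news.example", "unknown.example"]
    profile, unresolved = build_profile(hosts)
    print(dict(profile), unresolved)
```

In this sketch the visits that land in `unresolved` correspond to the cases for which the abstract says page-level traffic fingerprinting is needed.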
Gonzalez, R., Soriente, C., & Laoutaris, N. (2016). User profiling in the time of HTTPS. In Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC (Vol. 14-16-November-2016, pp. 373–379). Association for Computing Machinery. https://doi.org/10.1145/2987443.2987451