On Analyzing Third-party Tracking via Machine Learning

Alfonso Guarino; Delfina Malandrino; Rocco Zaccagnino; Federico Cozza; Antonio Rapuano

Conference Proceedings, Open Access

International Conference on Information Systems Security and Privacy (2020) 532-539

DOI: 10.5220/0008972005320539


Abstract

Nowadays, websites rely on services provided by third-party sites to track users and offer personalized experiences. However, this practice threatens the privacy of individuals, since valuable information is used to create a digital personal profile. The existing client-side countermeasures to protect privacy exhibit performance issues, mainly due to the use of blacklisting mechanisms (lists of resources to be filtered out). In this paper, we study the use of machine learning methods to classify third-party privacy-intrusive resources (trackers). To this end, we first downloaded a dataset of 1000 web resources, split into functional and tracking, by browsing Alexa's Top 10 websites for each category (sports, shopping, etc.), and then identified suitable metrics to distinguish between the two classes. To evaluate the effectiveness of the proposed metrics, we compared the performance of several supervised machine learning models among the most used in the literature. As a result, we found that Random Forest can classify functional and tracking resources with an accuracy of 91%.
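
The evaluation step described in the abstract (fit several supervised models on labelled functional/tracking resources, then compare their accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy dataset, its two numeric features, and the two stand-in models (`MajorityClass`, `NearestCentroid`) are hypothetical and are not the metrics, data, or classifiers used in the paper, which reports Random Forest as the best-performing model.

```python
import random
from statistics import mean


def make_toy_dataset(n=200, seed=0):
    """Build hypothetical labelled resources as (features, label) pairs.

    The two features are placeholders, NOT the metrics proposed in the paper.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice(["functional", "tracking"])
        shift = 2.0 if label == "tracking" else 0.0
        data.append(([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)], label))
    return data


class MajorityClass:
    """Baseline: always predicts the most frequent training label."""

    def fit(self, rows):
        labels = [label for _, label in rows]
        # sorted() makes tie-breaking deterministic
        self.label = max(sorted(set(labels)), key=labels.count)
        return self

    def predict(self, features):
        return self.label


class NearestCentroid:
    """Stand-in model: predicts the label whose mean feature vector is closest."""

    def fit(self, rows):
        self.centroids = {}
        for label in sorted({label for _, label in rows}):
            vectors = [f for f, l in rows if l == label]
            self.centroids[label] = [mean(col) for col in zip(*vectors)]
        return self

    def predict(self, features):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def accuracy(model, rows):
    """Fraction of rows whose predicted label equals the true label."""
    return sum(model.predict(f) == label for f, label in rows) / len(rows)


def compare_models(data, test_fraction=0.25, seed=0):
    """Shuffle, split into train/test, fit each model, return test accuracies."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    train, test = rows[:cut], rows[cut:]
    models = {"majority": MajorityClass(), "nearest_centroid": NearestCentroid()}
    return {name: accuracy(m.fit(train), test) for name, m in models.items()}


if __name__ == "__main__":
    for name, acc in compare_models(make_toy_dataset()).items():
        print(f"{name}: {acc:.2f}")
```

A single held-out split is used here only to keep the sketch short; the abstract does not state which evaluation protocol the authors followed.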

Cite

Guarino, A., Malandrino, D., Zaccagnino, R., Cozza, F., & Rapuano, A. (2020). On Analyzing Third-party Tracking via Machine Learning. In International Conference on Information Systems Security and Privacy (pp. 532–539). Science and Technology Publications, Lda. https://doi.org/10.5220/0008972005320539
