Various website fingerprinting (WF) attacks have been developed to detect anonymous users accessing illegal websites in Tor networks by analyzing Tor traffic. These attacks use several traffic features, such as packet length, number of packets, and timing, to identify users who attempt to access prohibited content. With advances in artificial intelligence (AI), machine learning and deep learning techniques have been widely adopted in WF to build accurate models that break the privacy of such users. These state-of-the-art approaches, however, assume that all data from the various Tor nodes are collected and trained on centrally to generate the model, even though training data sets from Tor nodes may contain sensitive information that the nodes do not want to share. In addition, collecting and training on this data in a centralized manner inevitably creates significant computing and network bottlenecks at the central server. To address these issues, this paper proposes a novel framework that uses federated learning (FL) for WF in the Tor network (denoted FedFingerprinting), enabling Tor nodes to generate the global model collaboratively without exposing their local data sets. Specifically, to reduce the local training burden on the Tor nodes selected in the FL process, the importance of the various handcrafted features used for WF is first evaluated by analyzing feature accuracy under an ensemble of tree-based machine learning methods. Then, to balance accuracy and training time, a combination of the top-ranked features, rather than raw data, is trained using FL. Moreover, an effective Tor node selection scheme for the FL process is designed that takes the local model accuracy of each Tor node into account.
Finally, under closed-world settings with real-world Tor data sets, we empirically compare the proposed FedFingerprinting, with raw data and with feature selection, against various benchmarks in terms of training time and accuracy. We then demonstrate the superior performance of FedFingerprinting with Tor node selection in terms of convergence speed.
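The three steps described above (ranking features, selecting Tor nodes by local accuracy, and aggregating local models) can be illustrated with a minimal sketch. The abstract does not specify the exact selection rule or aggregation algorithm, so this sketch assumes a simple top-k choice for both features and nodes, and standard sample-weighted federated averaging (FedAvg). All function names, feature names, node names, and numbers below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def top_k_features(importance: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return ranked[:k]


def select_nodes(local_accuracy: Dict[str, float], k: int) -> List[str]:
    """Pick the k nodes with the highest local model accuracy (assumed rule)."""
    ranked = sorted(local_accuracy, key=lambda node: local_accuracy[node], reverse=True)
    return ranked[:k]


def fed_avg(weights: Sequence[Sequence[float]], sample_counts: Sequence[int]) -> List[float]:
    """Average model weight vectors, weighting each node by its sample count."""
    total = sum(sample_counts)
    dim = len(weights[0])
    return [
        sum(w[i] * n for w, n in zip(weights, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical values for illustration only; in practice the importance scores
# would come from a tree ensemble and the weights from local training.
importance = {"packet_count": 0.40, "packet_length": 0.35, "timing": 0.20, "burst": 0.05}
local_accuracy = {"node_a": 0.91, "node_b": 0.62, "node_c": 0.85}
local_weights = {"node_a": [1.0, 2.0], "node_b": [9.0, 9.0], "node_c": [3.0, 4.0]}
sample_counts = {"node_a": 100, "node_b": 50, "node_c": 300}

features = top_k_features(importance, k=2)   # features each node would train on
chosen = select_nodes(local_accuracy, k=2)   # nodes taking part in this round
global_weights = fed_avg(
    [local_weights[n] for n in chosen],
    [sample_counts[n] for n in chosen],
)
```

In this toy round the low-accuracy node is left out of aggregation, so its weights do not pull the global model; only model weights, never the local traffic data, are passed to the aggregation step.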
CITATION
Bang, J., Jeong, J., & Lee, J. (2023). FedFingerprinting: A Federated Learning Approach to Website Fingerprinting Attacks in Tor Networks. IEEE Access, 11, 78431–78444. https://doi.org/10.1109/ACCESS.2023.3299174