Classifying Illegal Activities on Tor Network using Hybrid Technique

Abstract

The freedom offered by the Deep Web lets people express themselves openly and discreetly, and this same freedom is one reason illicit activities are carried out there. This work presents crawler-DB, a novel dataset of active Dark Web domains. To build crawler-DB, the Onion Routing (Tor) network was sampled, and a web crawler capable of following links was built. The link addresses gathered by the crawler are then classified automatically into five classes. The algorithm built in this study achieved an accuracy of 85%. A popular text representation method was applied to the proposed crawler-DB and combined with two supervised classifiers to categorize Tor hidden services. The experiments show that the Term Frequency-Inverse Document Frequency (TF-IDF) word representation with a linear support vector classifier (SVC) achieves 91% accuracy under 5-fold cross-validation when classifying a subset of illegal activities from crawler-DB, while Naïve Bayes reaches 80.6%. The good performance of the linear SVC might support tools that help authorities detect these activities. The outcomes are expected to be significant in both practical and theoretical terms, and they may pave the way for further research.
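
The evaluation described in the abstract (TF-IDF features, a linear support vector classifier compared with Naïve Bayes, 5-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the use of scikit-learn, the default hyperparameters, the helper name `cross_validate`, the toy documents, and the two class labels are all assumptions, and the scores it prints say nothing about the results reported for crawler-DB.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus standing in for crawled page text.
# It is NOT taken from crawler-DB, and the class names are invented.
texts = [
    "counterfeit banknotes euro bills shipped worldwide",
    "fake dollars high quality counterfeit notes for sale",
    "counterfeit currency passes pen test order bills now",
    "buy fake euro banknotes discreet shipping",
    "replica bills counterfeit money bulk discount",
    "hire hacker for email password recovery service",
    "ddos attack service rent botnet hourly",
    "hacking tools exploit kit malware download",
    "hacker for hire social account password cracking",
    "ransomware builder exploit malware service",
]
labels = ["counterfeit"] * 5 + ["hacking"] * 5


def cross_validate(classifier, texts, labels, folds=5):
    """Return per-fold accuracy for a TF-IDF + classifier pipeline.

    The vectorizer sits inside the pipeline so that it is re-fitted on the
    training part of every fold and never sees the held-out documents.
    """
    pipeline = make_pipeline(TfidfVectorizer(), classifier)
    return cross_val_score(pipeline, texts, labels, cv=folds, scoring="accuracy")


svc_scores = cross_validate(LinearSVC(), texts, labels)
nb_scores = cross_validate(MultinomialNB(), texts, labels)

print("Linear SVC mean accuracy:  ", svc_scores.mean())
print("Naive Bayes mean accuracy: ", nb_scores.mean())
```

Keeping the vectorizer inside the pipeline is a deliberate choice in this sketch: fitting TF-IDF on the full corpus before splitting would leak vocabulary and document-frequency statistics from the test folds into training.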

Cite

Alshammery, M. K., & Aljuboori, A. F. (2022). Classifying Illegal Activities on Tor Network using Hybrid Technique. Iraqi Journal of Science, 63(9), 3994–4004. https://doi.org/10.24996/ijs.2022.63.9.30
