URL Classification Using Convolutional Neural Network for a New Large Dataset

Abstract

In today’s world, methods for real-time web page classification are needed because of the tremendous growth in the number of web pages and in Internet usage. To address this need, URL-based methods have been proposed in the literature; they offer advantages in classification speed and computational efficiency over content-based approaches. This work proposes a CNN-based method that uses only URLs as input. We extract word-level tokens from the URLs, feed them into a word embedding layer, and then into hyperparameter-tuned CNN layers. Our experiments demonstrate that this method can achieve an F1-score of 0.9759 and outperforms many existing methods on a new large dataset.
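
The word-level tokenization step can be illustrated with a minimal sketch. The abstract does not specify the splitting rule, vocabulary construction, or sequence length, so everything here is an assumption made for illustration: the helper names (`tokenize_url`, `build_vocab`, `encode_url`), the split on non-alphanumeric characters, the reserved ids, and the example URLs. The resulting fixed-length integer sequences are the kind of input a word embedding layer typically expects.

```python
import re
from collections import Counter

# Reserved ids for padding and unknown tokens (an assumed convention,
# not taken from the paper).
PAD, UNK = 0, 1


def tokenize_url(url):
    """Split a URL into lowercase word-level tokens.

    Assumption: tokens are maximal runs of ASCII letters and digits.
    """
    return [tok for tok in re.split(r"[^0-9a-z]+", url.lower()) if tok]


def build_vocab(urls, min_count=1):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for url in urls for tok in tokenize_url(url))
    vocab = {}
    # Sort by descending count, then alphabetically, for a stable ordering.
    for tok, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            vocab[tok] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode_url(url, vocab, max_len=8):
    """Convert a URL to a fixed-length list of token ids (truncate or pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize_url(url)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical example URLs, not drawn from the paper's dataset.
urls = [
    "https://example.com/sports/football-news",
    "http://example.org/tech/ai-news",
]
vocab = build_vocab(urls)
encoded = [encode_url(u, vocab) for u in urls]
```

In a full pipeline, sequences like `encoded` would be passed to an embedding layer followed by convolutional layers; that part is omitted here because the abstract gives no architectural details.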

Cite

Hung, P. D., Hung, N. D., & Diep, V. T. (2022). URL Classification Using Convolutional Neural Network for a New Large Dataset. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13492 LNCS, pp. 103–114). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-16538-2_11
