Anomaly Detection in Text Data Sets using Character-Level Representation


Abstract

This paper proposes a character-level representation of unlabeled text data sets for anomaly detection. The representation was examined empirically to demonstrate that it separates outlying records from normal ones when used with an ensemble of several classic numerical anomaly classifiers. Experimental results on two different data sets confirmed that the resulting unsupervised model can detect outlying instances in various real-world scenarios, making it possible to quickly assess large amounts of textual data for information consistency and conformity without knowledge of the data content itself.
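
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the character-level features or the detectors in the ensemble, so the character-class proportions in `char_features`, the two stand-in detectors (norm of the standardized feature vector and mean distance to the nearest neighbours), and the rank-averaging rule are all assumptions made for illustration.

```python
import math
import statistics
import string


def char_features(text):
    """Describe a record only by its characters (hypothetical feature set)."""
    n = max(len(text), 1)  # avoid division by zero for empty records
    return [
        math.log1p(len(text)),                            # record length (log scale)
        sum(c.isalpha() for c in text) / n,               # share of letters
        sum(c.isdigit() for c in text) / n,               # share of digits
        sum(c.isspace() for c in text) / n,               # share of whitespace
        sum(c in string.punctuation for c in text) / n,   # share of ASCII punctuation
        sum(c.isupper() for c in text) / n,               # share of uppercase letters
    ]


def standardize(vectors):
    """Convert each feature column to z-scores (constant columns keep scale 1)."""
    cols = list(zip(*vectors))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in vectors]


def z_norm_scores(z):
    """Stand-in detector 1: Euclidean norm of the standardized feature vector."""
    return [math.sqrt(sum(v * v for v in row)) for row in z]


def knn_scores(z, k=2):
    """Stand-in detector 2: mean distance to the k nearest other records."""
    scores = []
    for i, a in enumerate(z):
        dists = sorted(math.dist(a, b) for j, b in enumerate(z) if j != i)
        scores.append(statistics.fmean(dists[:k] or [0.0]))
    return scores


def ranks(scores):
    """Rank positions (0 = least anomalous) so detectors become comparable."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    result = [0] * len(scores)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def ensemble_scores(records):
    """Average the per-detector ranks; higher means more anomalous."""
    z = standardize([char_features(r) for r in records])
    rank_lists = [ranks(s) for s in (z_norm_scores(z), knn_scores(z))]
    return [statistics.fmean(rs) for rs in zip(*rank_lists)]


def top_anomalies(records, n=1):
    """Indices of the n records with the highest ensemble score."""
    scores = ensemble_scores(records)
    return sorted(range(len(records)), key=lambda i: -scores[i])[:n]
```

Calling `top_anomalies(records, n)` on a list of strings returns the indices of the `n` most unusual records. Because the features count character classes rather than words, the sketch never interprets the content of a record, which mirrors the abstract's point about assessing data without knowledge of the content itself.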

Cite

Mohaghegh, M., & Abdurakhmanov, A. (2021). Anomaly Detection in Text Data Sets using Character-Level Representation. In Journal of Physics: Conference Series (Vol. 1880). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1880/1/012028
