Learning and mining technologies have been broadly applied to reveal the value of massive data and to inform decision-making. The correctness of the resulting decisions usually depends on the truthfulness of the data fed to these technologies. Data fraud is widespread, and even data that were originally true can be maliciously manipulated by cyber-attackers. Methods for examining data authenticity have long been in use, but they are less effective when only values are manipulated without violating their scopes and definitions. Decisions made from fraudulent or manipulated data are then wrong or hijacked. It has been concluded that data manipulation is the latest technique in “the art of war in cyberspace.” Examining each data instance against its source is exhausting and often impossible; consider, for example, re-collecting the data for a national census. In this paper, through a case study on banknote data, we exploit Topological Data Analysis (TDA) to examine manipulated data. A fraction of the data records is examined as a whole rather than individually. We first test whether TDA can be used for this detection and evaluate whether it can verify data efficiently, and then discuss the limitations of the state of the art. Although TDA is not yet mature, it has been reported to be effective in many applications, and our work provides evidence of its usefulness for detecting data anomalies.
CITATION STYLE
Guo, Y., Sun, D., Li, G., & Chen, S. (2018). Examine manipulated datasets with topology data analysis: A case study. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11149 LNCS, pp. 358–373). Springer Verlag. https://doi.org/10.1007/978-3-030-01950-1_21