Detecting intellectual property violations in multimedia files is a critical problem for today's Internet infrastructure, especially within very large document collections. Two kinds of methods are used to counter this problem: proactive and reactive. Proactive methods prevent infringing files from being uploaded by comparing them against a reference collection, while reactive methods respond to reports made by third parties or artificial intelligence systems in order to delete files deemed illegal. In this article we propose an approach that is both reactive and proactive, with the aim of preventing legal uploads (or modifications of such files, such as remixes, parodies and other edits) from being deleted because of the presence of illegal uploads on a platform. We developed a rule-based obfuscating focused crawler that works with audio files in the Audio Information Retrieval (AIR) domain, but its use can be easily extended to other multimedia file types, such as videos or textual documents. The proposed model automatically scans multimedia files uploaded to the public collection only when a user query is submitted to it. We also report experimental results obtained from tests on a known musical collection. Several combinations of neural network and similarity scorer are presented, and we discuss the strengths and efficiency of each combination.
Montanaro, M., Rinaldi, A. M., Russo, C., & Tommasino, C. (2024). A rule-based obfuscating focused crawler in the audio retrieval domain. Multimedia Tools and Applications, 83(9), 25231–25260. https://doi.org/10.1007/s11042-023-16155-6