Online news platforms and social media make extremely large volumes of documents available. While the quantity of these documents has grown exponentially, the majority are of low quality, which can cause digital fatigue or promote misinformation. To address this, we propose a novel framework that evaluates the quality of documents in terms of consistency. We model low-quality document detection as a binary classification task that measures how consistent a document's contents are. Specifically, we relax the problem by treating each sentence or paragraph as a node, so that a given document is considered a network of nodes. We show how we define the supernode in this network and that it is informative enough to detect whether the document is consistent. We believe this scheme can be applied to various applications, including fake news detection and document screening with qualitative evaluations. We achieve state-of-the-art results on existing tasks using the NELA17 dataset and the YH-News dataset, which we release in this paper.
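The idea of treating sentences as nodes and summarizing them with a supernode can be illustrated with a toy sketch. This is not the authors' model: the abstract does not specify how nodes are encoded or how the supernode is defined, so the bag-of-words vectors, the summed supernode, the cosine-based score, the threshold, and every function name below are assumptions chosen only to make the idea concrete.

```python
import math
import re
from collections import Counter


def sentence_vectors(document):
    """Split a document into sentences and return one bag-of-words Counter per sentence.

    Assumption: each sentence is a node, encoded as raw token counts.
    """
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    return [Counter(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def supernode(nodes):
    """Hypothetical supernode: the sum of all node vectors in the document."""
    total = Counter()
    for node in nodes:
        total.update(node)
    return total


def consistency_score(document):
    """Mean cosine similarity between each sentence node and the supernode."""
    nodes = sentence_vectors(document)
    if not nodes:
        return 0.0
    hub = supernode(nodes)
    return sum(cosine(node, hub) for node in nodes) / len(nodes)


def is_inconsistent(document, threshold=0.5):
    """Binary decision; the threshold is an arbitrary illustrative value."""
    return consistency_score(document) < threshold


if __name__ == "__main__":
    on_topic = "The cat sat on the mat. The cat sat on the mat."
    off_topic = "The cat sat on the mat. Stock prices fell sharply today."
    print(consistency_score(on_topic), consistency_score(off_topic))
```

In this sketch, a document whose sentences share vocabulary scores closer to 1, while a document with unrelated sentences scores lower. A learned classifier over node and supernode representations, as the abstract suggests, would replace the fixed threshold.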
Jung, D., Kim, M., & Cho, Y. S. (2022). Detecting Documents With Inconsistent Context. IEEE Access, 10, 98970–98980. https://doi.org/10.1109/ACCESS.2022.3204151