Natural language undergoes significant transformation from the domain of specialized research to general news intended for wider consumption. This transition makes the information vulnerable to misinterpretation, misrepresentation, and incorrect attribution, all of which may be difficult to identify without adequate domain knowledge and may exist even in the presence of explicit citations. Moreover, newswire articles seldom provide a precise correspondence between a specific claim and its origin, making it harder to identify which claims, if any, reflect the original findings. For instance, an article stating “Flagellin shows therapeutic potential with H3N2, known as Aussie Flu” contains two claims (“Flagellin ... H3N2” and “H3N2, known as Aussie Flu”) that may be true or false independent of each other, and it is prima facie unclear which claims, if any, are supported by the cited research. We build a dataset of sentences from medical news along with the sources from peer-reviewed medical research journals they cite. We use these data to study what a general reader perceives to be true, and how to verify the scientific source of claims. Unlike existing datasets, this one captures the metamorphosis of information across two genres with disparate readership and vastly different vocabularies, and this work presents the first empirical study of health-related fact-checking across them.
CITATION
Zuo, C., Mathur, K., Kela, D., Salek Faramarzi, N., & Banerjee, R. (2022). Beyond belief: a cross-genre study on perception and validation of health information online. International Journal of Data Science and Analytics, 13(4), 299–314. https://doi.org/10.1007/s41060-022-00310-7