With more than a billion web sites, the volume and variety of content available for consumption is huge. However, credibility, an important quality characteristic of web pages, is questionable in many cases and tends to be non-uniform. Credibility can increase or reduce the importance of a web page, leading to a potential gain or loss of user base. Assessing credibility without factoring in the genre of the content (for example, Help, Article, Discussion, etc.) can lead to an incorrect assessment. Depending on the genre, the importance of features such as the web page's last-modified date and time, grammar, image-to-text ratio, in and out links, and other web page features differs. We propose a genre-based credibility assessment that relies on web page surface features and their importance within a genre. Further, we built the WEBCred framework to assess the GCS (Genre-based Credibility Score), with the flexibility to add or modify genres, their features, and the importance of those features. We validated our approach on 10,429 ‘Information Security’ related web pages; the assessed score correlated 35% with the crowd-sourced Web Of Trust (WOT) score and 39% with Alexa ranking.
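As a rough illustration of the idea of weighting surface features by their importance within a genre, the following minimal sketch computes a score as a normalized weighted sum. It is not the WEBCred implementation: the genre names, feature names, weight values, the function `genre_credibility_score`, and the assumption that each feature is already normalized to the range 0 to 1 are all hypothetical choices made for this example.

```python
from typing import Dict

# Hypothetical per-genre feature weights (illustrative values only,
# not taken from the paper). Each feature value is assumed to be
# pre-normalized to the range [0, 1].
GENRE_WEIGHTS: Dict[str, Dict[str, float]] = {
    "article": {
        "last_modified": 0.4,
        "grammar": 0.3,
        "image_text_ratio": 0.1,
        "links": 0.2,
    },
    "help": {
        "last_modified": 0.2,
        "grammar": 0.2,
        "image_text_ratio": 0.3,
        "links": 0.3,
    },
}


def genre_credibility_score(
    features: Dict[str, float],
    genre: str,
    weights: Dict[str, Dict[str, float]] = GENRE_WEIGHTS,
) -> float:
    """Return a weighted average of feature values for the given genre.

    Features missing from `features` are treated as 0.0. Raises KeyError
    if the genre has no weight table.
    """
    genre_weights = weights[genre]
    total_weight = sum(genre_weights.values())
    weighted_sum = sum(
        weight * features.get(name, 0.0)
        for name, weight in genre_weights.items()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    page = {
        "last_modified": 0.9,
        "grammar": 0.8,
        "image_text_ratio": 0.5,
        "links": 0.6,
    }
    # The same page can receive different scores under different genres.
    print(genre_credibility_score(page, "article"))
    print(genre_credibility_score(page, "help"))
```

Keeping the weights in a plain dictionary keyed by genre mirrors the flexibility described above: under this assumed layout, adding a genre or changing the importance of a feature is a data change rather than a code change.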
Agrawal, S., Mohan, S. L., & Reddy, Y. R. (2018). Automated credibility assessment of web page based on genre. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11297 LNCS, pp. 155–169). Springer Verlag. https://doi.org/10.1007/978-3-030-04780-1_11