Urdu Wikification and Its Application in Urdu News Recommendation System

This article is free to access.

Abstract

Wikification is the process of linking entities mentioned in a text to their corresponding Wikipedia or Wikidata pages. Many natural language processing applications, including question-answering systems, information retrieval, fraud detection, and recommendation systems (RS), can benefit from this information extraction technique. Considerable effort has gone into entity linking (EL) for both Asian and Western languages, producing several datasets and numerous proposed methodologies. Despite millions of Urdu speakers worldwide, relatively little entity-linking research has been done for Urdu. First, this work proposes an Urdu EL pipeline that identifies named entities in text and links them to Wikidata. Second, a dataset of 550 Urdu news titles annotated with their respective Wiki-ids is prepared for evaluation. Third, using the proposed EL pipeline, 16,738 news articles from the first-ever Urdu news RS dataset of 100 users are annotated. Fourth, a sub-knowledge graph (KG) of 8,439 entities and 23,080 relationship tuples is retrieved from Wikidata. The Trans-E algorithm is then used to create KG embeddings so that the extracted KG can be used in an Urdu news RS. The resulting Urdu news RS achieves an accuracy of 60.8%.
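
The abstract names Trans-E (commonly written TransE) as the KG embedding method but gives no implementation details. As a rough illustration only, the sketch below shows the core TransE idea: a triple (head, relation, tail) is scored by how far head + relation lands from tail, and a margin loss pushes true triples to score lower than corrupted ones. The squared L2 distance, the learning rate, the margin, and the function names `transe_score`, `margin_loss`, and `sgd_step` are assumptions made for this example, not the authors' settings.

```python
def transe_score(h, r, t):
    """Squared L2 distance ||h + r - t||^2; lower means more plausible.

    Assumption: squared L2 is used here to keep the gradient simple.
    """
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss: zero once the true triple beats the corrupted one by `margin`."""
    return max(0.0, margin + pos_score - neg_score)


def sgd_step(h, r, t, t_neg, lr=0.01, margin=1.0):
    """One in-place gradient step on a true triple (h, r, t) and a
    corrupted triple (h, r, t_neg). Returns the loss before the update."""
    pos = [hi + ri - ti for hi, ri, ti in zip(h, r, t)]
    neg = [hi + ri - ni for hi, ri, ni in zip(h, r, t_neg)]
    loss = margin_loss(sum(x * x for x in pos), sum(x * x for x in neg), margin)
    if loss == 0.0:
        return 0.0  # margin already satisfied, nothing to update
    for i in range(len(h)):
        g = 2.0 * (pos[i] - neg[i])      # gradient w.r.t. h and r
        h[i] -= lr * g
        r[i] -= lr * g
        t[i] += lr * 2.0 * pos[i]        # pull the true tail toward h + r
        t_neg[i] -= lr * 2.0 * neg[i]    # push the corrupted tail away


    return loss


# Toy 2-dimensional embeddings (hypothetical values, not from the paper).
head, rel = [0.0, 0.0], [0.0, 0.0]
tail, corrupted_tail = [1.0, 0.0], [0.0, 1.0]
initial_loss = sgd_step(head, rel, tail, corrupted_tail, lr=0.1)
```

After this single step the true triple scores lower than the corrupted one. A full training run would repeat the step over all 23,080 tuples with randomly sampled corrupted entities, and the learned entity vectors could then serve as features for the recommender.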

Cite

Kanwal, S., Malik, M. K., Nawaz, Z., & Mehmood, K. (2022). Urdu Wikification and Its Application in Urdu News Recommendation System. IEEE Access, 10, 103655–103668. https://doi.org/10.1109/ACCESS.2022.3208666
