Research on Tibetan News Sites’ Web Crawler and Search Engine

  • Han Z
  • Xu G
  • Sun W

Abstract

This paper describes in detail the features of the Tibetan language and the technologies the authors use to process Tibetan news web pages by computer. To obtain the content of Tibetan news, the authors used a web crawler to download Tibetan news pages, which form the basis of the project. They used Scrapy, an open-source web crawler, and rewrote its crawl component to make crawling more accurate and efficient. To support searching Tibetan content, they defined and computed the statistics that were useful for improving the performance of the search engine. They used Solr, another open-source tool, as the user interface of the system. The crawled web pages connect the crawler and the search engine, which together provide the data retrieval service. Compared with other work, the authors' system adopts a framework that is safe and stable enough to improve the user experience of a Tibetan search engine. The work contributes to the spread of Tibetan culture and promotes the development of Tibetan-language news in the field of search engines.
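
The abstract does not say which statistics were computed. As a rough illustration only, and not the authors' method, the Python sketch below shows two statistics a pipeline of this kind might gather from crawled text: the share of characters that fall in the Unicode Tibetan block (U+0F00 to U+0FFF) and per-syllable frequencies obtained by splitting on the tsheg mark (U+0F0B). The function names `tibetan_ratio` and `syllable_counts` are hypothetical.

```python
from collections import Counter

TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (tsheg)
SHAD = "\u0f0d"   # Tibetan clause/sentence mark (shad)


def is_tibetan(ch: str) -> bool:
    """Return True if ch lies in the Unicode Tibetan block (U+0F00..U+0FFF)."""
    return "\u0f00" <= ch <= "\u0fff"


def tibetan_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are Tibetan.

    Assumption: a crawler could use this to decide whether a page
    is a Tibetan-language page worth indexing.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_tibetan(c) for c in chars) / len(chars)


def syllable_counts(text: str) -> Counter:
    """Count Tibetan syllables, treating tsheg, shad, and whitespace as separators."""
    for sep in (SHAD, "\n", " "):
        text = text.replace(sep, TSHEG)
    syllables = [
        s for s in text.split(TSHEG)
        if s and all(is_tibetan(c) for c in s)  # skip empty and non-Tibetan pieces
    ]
    return Counter(syllables)


if __name__ == "__main__":
    sample = "\u0f56\u0f7c\u0f51\u0f0b\u0f61\u0f72\u0f42\u0f0b\u0f56\u0f7c\u0f51\u0f0b"
    print(tibetan_ratio(sample))
    print(syllable_counts(sample).most_common())
```

Splitting on the tsheg yields syllables rather than words; word segmentation for Tibetan is a separate problem that this sketch does not attempt.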

Cite

Han, Z., Xu, G., & Sun, W. (2015). Research on Tibetan News Sites’ Web Crawler and Search Engine. In Proceedings of the International Conference on Logistics, Engineering, Management and Computer Science (Vol. 117). Atlantis Press. https://doi.org/10.2991/lemcs-15.2015.116
