Research on Tibetan News Sites’ Web Crawler and Search Engine

Zhiqiang Han; Guixian Xu; Wei Sun

Conference Proceedings (Open Access)

Proceedings of the International Conference on Logistics, Engineering, Management and Computer Science (2015) 117

DOI: 10.2991/lemcs-15.2015.116

Abstract

This paper describes in detail the features of the Tibetan language and the technologies used to process Tibetan news web pages. A web crawler downloaded the Tibetan news pages that form the basis of the project. The crawler is built on Scrapy, an open source web crawler, with the crawl component rewritten to make crawling more accurate and efficient. To support searching Tibetan content, the authors defined and computed the statistics that help improve the performance of the search engine. Solr, another open source package, serves as the user interface of the system. The crawler and the search engine are connected through the downloaded web pages to provide a data retrieval service. Compared with other work, this system adopts a safe and stable framework intended to improve the user experience of Tibetan search. The authors state that the work contributes positively to the spread of Tibetan culture and to the development of search for Tibetan-language news.
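The abstract does not say which statistics were computed or how the Tibetan text was tokenized. As an illustrative sketch only, the Python below assumes syllable-level statistics: Tibetan syllables are delimited by the tsheg mark (U+0F0B), so a page can be split on tsheg and clause punctuation, counted, and indexed. The function names `syllables`, `syllable_counts`, and `inverted_index` are hypothetical and do not come from the paper, which relies on Scrapy and Solr rather than hand-written indexing.

```python
import re
from collections import Counter

TSHEG = "\u0f0b"  # Tibetan intersyllabic mark (tsheg)

# Assumption: split on tsheg (U+0F0B), non-breaking tsheg (U+0F0C),
# shad (U+0F0D), double shad (U+0F0E), and whitespace.
_SEPARATORS = re.compile("[\u0f0b\u0f0c\u0f0d\u0f0e\\s]+")


def syllables(text):
    """Return the syllables in text, dropping empty pieces."""
    return [piece for piece in _SEPARATORS.split(text) if piece]


def syllable_counts(pages):
    """Map each page id to a Counter of its syllable frequencies."""
    return {page_id: Counter(syllables(body)) for page_id, body in pages.items()}


def inverted_index(pages):
    """Map each syllable to the set of page ids that contain it."""
    index = {}
    for page_id, body in pages.items():
        for syllable in set(syllables(body)):
            index.setdefault(syllable, set()).add(page_id)
    return index
```

Given a dictionary of downloaded page bodies keyed by URL or id, `syllable_counts` would supply per-page frequencies and `inverted_index` a lookup from syllable to pages. A production system would more likely delegate both to the search engine's own analyzers.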

Cite

Han, Z., Xu, G., & Sun, W. (2015). Research on Tibetan News Sites’ Web Crawler and Search Engine. In Proceedings of the International Conference on Logistics, Engineering, Management and Computer Science (Vol. 117). Atlantis Press. https://doi.org/10.2991/lemcs-15.2015.116
