We investigate a method that assigns National Diet Library Subject Headings (NDLSH) to the results of web people searches to help users select and understand people on the web. NDLSH is a controlled subject vocabulary compiled and maintained by the National Diet Library (NDL) as a subject access tool. Assigning NDLSH headings to people gives each person well-formed keywords and enables exploratory searches using related terms. We examined combinations of four factors: (a) web-page rank (the number of top-ranked pages used), (b) position inside the HTML, (c) synonyms, and (d) document frequency. We report experimental results for 405 combination patterns on our 80-person dataset. Under our experimental settings, the best combination was (a) the top ten pages, (b) 100 characters before and after a person’s name (i.e., 200 characters), (c) half weight for synonyms, and (d) document frequency divided by the number of web pages.
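To make the best-performing combination concrete, the following is a minimal sketch of how such a score might be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names `context_windows` and `score_headings`, plain substring matching, the input format (page texts already ranked, headings mapped to synonym lists), and the rule that a page counts 1.0 for a heading match and 0.5 for a synonym-only match are all hypothetical choices made here. The abstract specifies only the four factor settings.

```python
from typing import Dict, List


def context_windows(text: str, name: str, width: int = 100) -> List[str]:
    """Collect up to `width` characters before and after each occurrence of `name`."""
    windows = []
    start = text.find(name)
    while start != -1:
        end = start + len(name)
        before = text[max(0, start - width):start]
        after = text[end:end + width]
        # A newline separator keeps a match from spanning the two sides.
        windows.append(before + "\n" + after)
        start = text.find(name, end)
    return windows


def score_headings(
    pages: List[str],
    name: str,
    headings: Dict[str, List[str]],
    top_n: int = 10,              # factor (a): use only the top-ranked pages
    width: int = 100,             # factor (b): characters on each side of the name
    synonym_weight: float = 0.5,  # factor (c): half weight for synonyms
) -> Dict[str, float]:
    """Score each heading by weighted document frequency over the pages used.

    Assumption (not from the source): a page contributes 1.0 when the heading
    itself appears near the name, and `synonym_weight` when only a synonym does.
    """
    used = pages[:top_n]
    if not used:
        return {}
    scores = {}
    for heading, synonyms in headings.items():
        total = 0.0
        for page in used:
            context = "\n".join(context_windows(page, name, width))
            if heading in context:
                total += 1.0
            elif any(synonym in context for synonym in synonyms):
                total += synonym_weight
        # Factor (d): document frequency divided by the number of web pages.
        scores[heading] = total / len(used)
    return scores
```

For example, given three pages where the first mentions "physics" near the name, the second mentions only the synonym "natural philosophy", and the third does not mention the person at all, the heading "physics" would score (1.0 + 0.5 + 0) / 3 = 0.5 under these assumptions. Real page text would also need HTML stripping and tokenization suited to Japanese, which this sketch omits.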
CITATION
Shimokura, M., & Murakami, H. (2018). Assigning NDLSH Headings to People on the Web. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11292 LNCS, pp. 189–195). Springer Verlag. https://doi.org/10.1007/978-3-030-03520-4_18