Prior art retrieval using the claims section as a bag of words

Suzan Verberne; Eva D'Hondt

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6241 LNCS 497-501

DOI: 10.1007/978-3-642-15754-7_60

5 Citations

14 Readers

Abstract

In this paper we describe our participation in the 2009 CLEF-IP task, which targeted prior-art search for topic patent documents. We opted for a baseline approach to get a feeling for the specifics of the task and the documents used. Our system retrieved patent documents using a standard bag-of-words approach for both the Main Task and the English Task. In both runs, we extracted the claims sections from all English patents in the corpus and saved them in the Lemur index format with the patent IDs as DOCIDs. These claims were then indexed using Lemur's BuildIndex function. In the topic documents we also focused exclusively on the claims sections. These were extracted and converted to queries by removing stopwords and punctuation. We did not perform any term selection or query expansion. We retrieved 100 patents per topic using Lemur's RetEval function with TF-IDF as the retrieval model. Compared to the other runs submitted to the track, we obtained good results in terms of nDCG (0.46) and moderate results in terms of MAP (0.054). © 2010 Springer-Verlag Berlin Heidelberg.
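The pipeline described in the abstract (claims text turned into a bag of words, no term selection or expansion, TF-IDF ranking, top 100 per topic) can be illustrated with a small self-contained sketch. This is not the authors' code and does not use Lemur: the stopword list, the function names, the patent IDs, and the raw-count TF-IDF weighting are all assumptions made for illustration, and Lemur's own TF-IDF weighting may differ.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption: the abstract does not say which list was used).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "to", "in", "for", "is", "are", "with", "by"}


def claims_to_terms(text):
    """Lowercase the claims text, drop punctuation and stopwords; no term selection or expansion."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(claims_by_id):
    """Return per-document term counts and document frequencies for the claims corpus."""
    doc_terms = {doc_id: Counter(claims_to_terms(text)) for doc_id, text in claims_by_id.items()}
    df = Counter()
    for counts in doc_terms.values():
        df.update(counts.keys())
    return doc_terms, df


def retrieve(topic_claims, doc_terms, df, k=100):
    """Rank documents by a simple TF-IDF dot product (a stand-in, not Lemur's exact formula)."""
    n_docs = len(doc_terms)
    query = Counter(claims_to_terms(topic_claims))
    scores = {}
    for doc_id, counts in doc_terms.items():
        score = 0.0
        for term, qtf in query.items():
            tf = counts.get(term, 0)
            if tf:
                idf = math.log(1 + n_docs / df[term])
                score += (qtf * idf) * (tf * idf)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document ID for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical claims corpus keyed by made-up patent IDs.
corpus = {
    "EP-0001": "A bicycle brake comprising a lever and a cable.",
    "EP-0002": "A method for brewing coffee with a paper filter.",
    "EP-0003": "A brake lever for a bicycle, the lever being adjustable.",
}
doc_terms, df = build_index(corpus)
results = retrieve("An adjustable brake lever for a bicycle.", doc_terms, df, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

Because the whole claims section is used as the query, every remaining token contributes to the score; the only filtering in this sketch is the stopword and punctuation removal mentioned in the abstract.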

Cite

Verberne, S., & D’Hondt, E. (2010). Prior art retrieval using the claims section as a bag of words. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6241 LNCS, pp. 497–501). https://doi.org/10.1007/978-3-642-15754-7_60
