A tree kernel based on classification and citation data to analyse patent documents


Abstract

We consider the problem of representing patent documents in such a way that a kernel matrix reflecting the similarities of the documents can be computed efficiently. The European classification system ECLA is a deep hierarchical taxonomy comprising about 130,000 classification symbols. Depending on their technical content, patent documents are assigned one or more ECLA classification symbols. In this study we represent the complete ECLA taxonomy as a tree labelled by the classification symbols, called the ECLA tree. Each node of the ECLA tree carries a positive value reflecting the technical specificity of the corresponding classification symbol. Based on the directly assigned symbols as well as on the symbols of the cited and citing documents, patent documents are mapped to subtrees of the ECLA tree. Taking into account the specificity of the tree nodes, we define an inner product on the subtrees representing the documents. It is shown that the inner product is a valid kernel function which can be used effectively for discovering clusters in a set of patent documents. © Springer-Verlag Berlin Heidelberg 2010.
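
The abstract does not give the kernel's formula, so the following Python sketch shows only one plausible reading: each document becomes the set of tree nodes lying on the paths from the root to its symbols, and the inner product is the sum of node weights over the nodes two documents share. The toy taxonomy `PARENT`, the depth-based `WEIGHT`, and the function names `subtree`, `kernel` and `kernel_matrix` are all assumptions made for illustration, not the authors' definitions.

```python
# Hypothetical toy taxonomy (not real ECLA data): child -> parent, None marks the root.
PARENT = {
    "ROOT": None,
    "A": "ROOT",
    "A01": "A",
    "A01B": "A01",
    "A01C": "A01",
    "B": "ROOT",
    "B60": "B",
}


def depth(symbol):
    """Number of edges between a symbol and the root."""
    d = 0
    while PARENT[symbol] is not None:
        symbol = PARENT[symbol]
        d += 1
    return d


# Assumed specificity: a positive weight that grows with depth.
# The paper's actual weighting scheme is not stated in the abstract.
WEIGHT = {s: 1.0 + depth(s) for s in PARENT}


def subtree(symbols):
    """Map a document's symbols to all nodes on the root-to-symbol paths.

    `symbols` is assumed to already pool the directly assigned symbols
    with those of cited and citing documents.
    """
    nodes = set()
    for s in symbols:
        while s is not None:
            nodes.add(s)
            s = PARENT[s]
    return nodes


def kernel(doc_a, doc_b):
    """Weighted overlap of two subtrees (assumed form of the inner product)."""
    shared = subtree(doc_a) & subtree(doc_b)
    return sum(WEIGHT[n] for n in shared)


def kernel_matrix(docs):
    """Pairwise kernel values for a list of documents."""
    return [[kernel(a, b) for b in docs] for a in docs]


if __name__ == "__main__":
    docs = [{"A01B"}, {"A01C"}, {"B60"}]
    for row in kernel_matrix(docs):
        print(row)
```

A weighted overlap of this kind is an ordinary dot product between node-indicator vectors scaled by the square root of each weight, which is why it is symmetric and positive semidefinite. Documents sharing a long, specific path score higher than documents that meet only near the root.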

Cite

Arndt, M., & Arndt, U. (2010). A tree kernel based on classification and citation data to analyse patent documents. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 571–578). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-642-10745-0_62
