What Can We Do with Graph-Structured Data? – A Data Mining Perspective

Hiroshi Motoda

Book Chapter

DOI: 10.1007/11941439_1

Abstract

Recent advances in data mining techniques have made it possible to mine complex structured data. Since structure is represented by proper relations and a graph can easily represent relations, knowledge discovery from graph-structured data (graph mining) poses a general problem for mining from structured data. Some examples amenable to graph mining are finding functional components from their behavior, finding typical web browsing patterns, identifying typical substructures of chemical compounds, finding typical subsequences of DNA, and discovering diagnostic rules from patient history records. These are all based on finding some typicality in a vast amount of graph-structured data. What makes a pattern typical depends on the domain and the task. Most often frequency, which has the useful property of anti-monotonicity, is used to discover typical patterns. The difficulty of graph mining is that it must deal with subgraph isomorphism, which is known to be NP-complete. In this talk, I will introduce two contrasting approaches for extracting frequent subgraphs, one using heuristic search (GBI) and the other using complete search (AGM). Both use canonical labelling to deal with subgraph isomorphism. GBI [6,4] employs a notion of chunking, which recursively chunks two adjoining nodes, thus generating fairly large subgraphs at an early stage of the search. It does not use the anti-monotonicity of frequency. The recent improved version extends it to employ pseudo-chunking, called chunkingless chunking, which makes it possible to extract overlapping subgraphs [5]. It can impose two kinds of constraints to accelerate search, one to include one or more of the designated subgraphs and the other to exclude all of the designated subgraphs. It has been extended to extract unordered trees from graph data by placing a restriction on pseudo-chunking operations. GBI can further be used as a feature constructor in decision tree building [1].
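
The pairwise chunking idea can be illustrated with a minimal sketch. It assumes an undirected graph with node labels only, picks the most frequent pair of adjacent labels, and contracts each non-overlapping occurrence into a single node. The names `most_frequent_pair` and `chunk_once`, the string format of chunk labels, and the tie-breaking rule are hypothetical choices made for illustration; they are not details of GBI as published.

```python
from collections import Counter


def most_frequent_pair(labels, edges):
    """Return the most frequent unordered label pair over all edges, or None."""
    counts = Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)
    if not counts:
        return None
    # Assumption: ties are broken by the lexicographically smallest pair.
    return min(counts, key=lambda pair: (-counts[pair], pair))


def chunk_once(labels, edges):
    """Contract non-overlapping occurrences of the most frequent pair.

    labels: dict mapping node id -> label string
    edges:  set of (u, v) tuples with u < v (undirected, unlabeled edges)
    Returns (new_labels, new_edges, chunked_pair).
    """
    pair = most_frequent_pair(labels, edges)
    if pair is None:
        return dict(labels), set(edges), None
    chunk_label = f"({pair[0]}-{pair[1]})"  # hypothetical label format
    new_labels = dict(labels)
    absorbed = {}  # removed node -> node that now represents the chunk
    used = set()   # nodes already consumed by a chunk in this round
    for u, v in sorted(edges):
        if u in used or v in used:
            continue  # greedy: each node joins at most one chunk per round
        if tuple(sorted((labels[u], labels[v]))) == pair:
            used.update((u, v))
            absorbed[v] = u
            new_labels[u] = chunk_label
            del new_labels[v]
    # Rewire the remaining edges to the chunk nodes and drop self-loops.
    new_edges = set()
    for u, v in edges:
        a, b = absorbed.get(u, u), absorbed.get(v, v)
        if a != b:
            new_edges.add((min(a, b), max(a, b)))
    return new_labels, new_edges, pair


# Path graph A-B-A-B-C
labels = {1: "A", 2: "B", 3: "A", 4: "B", 5: "C"}
edges = {(1, 2), (2, 3), (3, 4), (4, 5)}

step1 = chunk_once(labels, edges)    # chunks the pair (A, B)
step2 = chunk_once(*step1[:2])       # chunks two adjacent (A-B) chunks
print(step1)
print(step2)
```

After two rounds the four nodes A-B-A-B have collapsed into one chunk, which shows how large subgraphs appear early in the search. The sketch contracts only non-overlapping occurrences and discards the internal structure of a chunk, so it cannot report overlapping patterns.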
AGM represents a graph by its adjacency matrix and employs an Apriori-like bottom-up search algorithm using the anti-monotonicity of frequency [2]. It can handle both connected and disconnected graphs. It has been extended to handle tree data and sequential data by incorporating a different bias into the joining operators for each [3]. It has also been extended to incorporate taxonomy in labels to extract generalized
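
The combination of an adjacency-matrix code and level-wise pruning can be sketched as follows. This is a brute-force illustration under stated assumptions, not AGM's actual procedure: the canonical code is found by trying every vertex ordering, support is counted over induced subgraphs, and candidates are enumerated as vertex subsets instead of being produced by a join operator. The function names `canonical_code` and `frequent_induced_subgraphs` are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations


def canonical_code(labels, edges, nodes):
    """Smallest (labels, upper-triangle bits) code over all orderings of nodes.

    Isomorphic labeled graphs get the same code. Brute force: O(k!) orderings.
    """
    best = None
    for order in permutations(nodes):
        code = (
            tuple(labels[v] for v in order),
            # Upper triangle of the adjacency matrix, read column by column.
            tuple(int(frozenset((order[i], order[j])) in edges)
                  for j in range(len(order)) for i in range(j)),
        )
        if best is None or code < best:
            best = code
    return best


def frequent_induced_subgraphs(database, min_support, max_size):
    """Apriori-like bottom-up search over induced subgraphs of growing size.

    database: list of (labels, edges), with edges a set of frozenset pairs.
    Returns a dict mapping canonical code -> number of graphs containing it.
    """
    frequent = {}
    prev_codes = None  # frequent codes from the previous level
    for k in range(1, max_size + 1):
        support = Counter()
        for labels, edges in database:
            seen = set()
            for nodes in combinations(sorted(labels), k):
                # Anti-monotonicity: a pattern can be frequent only if every
                # (k-1)-node induced subgraph of it is frequent.
                if prev_codes is not None and any(
                    canonical_code(labels, edges, sub) not in prev_codes
                    for sub in combinations(nodes, k - 1)
                ):
                    continue
                seen.add(canonical_code(labels, edges, nodes))
            support.update(seen)  # each graph counts once per pattern
        level = {c: s for c, s in support.items() if s >= min_support}
        if not level:
            break
        frequent.update(level)
        prev_codes = set(level)
    return frequent


# Two small labeled graphs: A-B-C and A-B-D (B is the middle node in both).
g1 = ({0: "A", 1: "B", 2: "C"}, {frozenset((0, 1)), frozenset((1, 2))})
g2 = ({0: "B", 1: "A", 2: "D"}, {frozenset((0, 1)), frozenset((0, 2))})
result = frequent_induced_subgraphs([g1, g2], min_support=2, max_size=3)
print(result)
```

With a minimum support of 2, only the single nodes A and B and the edge A-B survive. Every candidate containing C or D is skipped without computing its code, which is the saving that anti-monotonicity provides.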

Cite

Motoda, H. (2006). What Can We Do with Graph-Structured Data? – A Data Mining Perspective (pp. 1–2). https://doi.org/10.1007/11941439_1
