Abstract
Recently, source code mining has received increasing attention due to the rapid increase of open-sourced code repositories and the tremendous values implied in this large dataset, which can help us understand the organization of functions or classes in different software and analyze the impact of these organized patterns on the software behaviors. Hence, learning an effective representation model for the functions of source code, from a modern view, is a crucial problem. Considering the inherent hierarchy of functions, we propose a novel hyperbolic function embedding (HFE) method, which can learn a distributed and hierarchical representation for each function via the Poincaré ball model. To achieve this, a function call graph (FCG) is first constructed to model the call relationship among functions. To verify the underlying geometry of FCG, the Ricci curvature model is used. Finally, an HFE model is built to learn the representations that can capture the latent hierarchy of functions in the hyperbolic space, instead of the Euclidean space, which are usually used in those state-of-the-art methods. Moreover, HFE is more compact in terms of lower dimensionality than the existing graph embedding methods. Thus, HFE is more effective in terms of computation and storage. To experimentally evaluate the performance of HFE, two application scenarios, namely, function classification and link prediction, have been applied. HFE achieves up to 7.6% performance improvement compared to the chosen state-of-the-art methods, namely, Node2vec and Struc2vec.
Author supplied keywords
Cite
CITATION STYLE
Lu, M., Liu, Y., Li, H., Tan, D., He, X., Bi, W., & Li, W. (2019). Hyperbolic function embedding: Learning hierarchical representation for functions of source code in hyperbolic space. Symmetry, 11(2). https://doi.org/10.3390/sym11020254
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.