Jupyter notebooks allow data scientists to write machine learning code together with its documentation in cells. In this paper, we propose a new task of code documentation generation (CDG) for computational notebooks. In contrast to previous CDG tasks, which focus on generating documentation for a single code snippet, in a computational notebook one piece of documentation in a markdown cell often corresponds to multiple code cells, and these code cells have an inherent structure. We propose a new model (HAConvGNN) that uses a hierarchical attention mechanism to attend to both the relevant code cells and the relevant code tokens when generating the documentation. Evaluated on a new corpus constructed from well-documented Kaggle notebooks, our model outperforms the baseline models.
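The idea of attending first to tokens within each code cell and then to the cells themselves can be illustrated with a small sketch. This is not the HAConvGNN architecture: the dot-product scoring, the plain-list vectors, and every name below (`hierarchical_attention`, `query`, `cells`) are assumptions chosen for illustration, and the graph-convolution component is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Combine vectors into one vector using the given weights."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def hierarchical_attention(query, cells):
    """Two-level attention (illustrative assumption, not the paper's model).

    query: decoder state vector.
    cells: list of code cells, each a non-empty list of token vectors.
    """
    cell_summaries = []
    token_weights = []
    # Low level: attend over the tokens inside each code cell.
    for tokens in cells:
        w = softmax([dot(query, t) for t in tokens])
        token_weights.append(w)
        cell_summaries.append(weighted_sum(w, tokens))
    # High level: attend over the per-cell summaries.
    cell_weights = softmax([dot(query, s) for s in cell_summaries])
    context = weighted_sum(cell_weights, cell_summaries)
    return context, cell_weights, token_weights


# Toy input: two code cells with 2-dimensional token vectors.
query = [1.0, 0.0]
cells = [
    [[1.0, 0.0], [0.0, 1.0]],  # first cell: two tokens
    [[0.0, 1.0]],              # second cell: one token
]
context, cell_w, token_w = hierarchical_attention(query, cells)
```

In this toy input the first cell contains a token aligned with the query, so it receives the larger cell-level weight; the resulting `context` vector is what a decoder would condition on when producing the next documentation token.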
Citation
Liu, X., Wang, D., Wang, A. Y., Hou, Y., & Wu, L. (2021). HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 4473–4485). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.381