Keyphrase generation aims to produce a set of condensed phrases that summarize a source document. Although keyphrase generation methods based on maximum likelihood estimation (MLE) have shown impressive performance, they suffer from two biases: one between the source document and the predicted keyphrases, and one between the predicted and target keyphrases. To tackle these biases, we propose CorrKG, a novel correction model built on top of the MLE pipeline, in which the biases are corrected via optimal transport (OT) and a frequency-based filtering-and-sorting (FreqFS) strategy. Specifically, OT is introduced as a soft correction that facilitates the alignment of salient information and rectifies the semantic bias between the source document and the predicted keyphrases. An adaptive semantic mass learning scheme is applied to vanilla OT to obtain a proper pair-wise optimal transport procedure, which improves the OT computation by rectifying the semantic masses dynamically. In addition, the FreqFS strategy is designed as a hard correction that reduces the bias between predicted and target keyphrases, and thus yields accurate and sufficient keyphrases. Extensive experiments on multiple benchmark datasets show that our model achieves superior keyphrase generation performance compared with state-of-the-art methods.
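To make the soft correction more concrete, the following is a minimal sketch of an optimal transport computation between a source document and a set of predicted keyphrases. It is not the CorrKG implementation: the entropic (Sinkhorn) solver, the cosine cost, the random embeddings, and the uniform masses are all assumptions made for illustration, and the names `sinkhorn` and `cosine_cost` are hypothetical. In particular, the abstract describes semantic masses that are learned adaptively, whereas this sketch fixes them to uniform values.

```python
import numpy as np


def cosine_cost(x, y):
    """Pairwise cost 1 - cosine similarity between rows of x and rows of y."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def sinkhorn(cost, a, b, reg=0.5, n_iters=500):
    """Entropic-regularized OT plan between masses a (n,) and b (m,).

    Assumption: a plain Sinkhorn solver stands in for whatever OT solver
    the paper uses.
    """
    kernel = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)  # scale columns towards marginal b
        u = a / (kernel @ v)    # scale rows towards marginal a
    return u[:, None] * kernel * v[None, :]


# Hypothetical inputs: 6 source-token embeddings and 3 keyphrase embeddings.
rng = np.random.default_rng(0)
src = rng.normal(size=(6, 8))
kps = rng.normal(size=(3, 8))

cost = cosine_cost(src, kps)

# Assumption: uniform semantic masses (the paper learns these adaptively).
a = np.full(src.shape[0], 1.0 / src.shape[0])
b = np.full(kps.shape[0], 1.0 / kps.shape[0])

plan = sinkhorn(cost, a, b)
ot_distance = float((plan * cost).sum())  # transport cost under the plan
```

In a training setup, a quantity like `ot_distance` could serve as an auxiliary alignment loss alongside the MLE objective. Replacing the uniform `a` and `b` with learned, normalized weights would be one way to approximate the adaptive mass scheme, though the abstract does not specify how the masses are parameterized.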
CITATION STYLE
Zhao, G., Yin, G., Yang, P., & Yao, Y. (2022). Keyphrase Generation via Soft and Hard Semantic Corrections. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 7757–7768). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.529