Abstract
Transferring knowledge from a label-rich domain (source domain) to a label-scarce domain (target domain) for pervasive cross-domain Text Classification (TC) is a non-trivial task. To address this issue, we propose EADA, a novel unsupervised energy-based adversarial domain adaptation framework. First, a deep pre-trained language model (e.g., RoBERTa) is leveraged as a shared feature extractor that maps text sequences from both the source and target domains into a common feature space. Since the source features retain good discriminability thanks to fully supervised training, we design a method that pushes target features toward the source ones via adversarial learning. An autoencoder serves as an energy function trained to reconstruct source feature embeddings, while the feature extractor aims to generate source-like target feature embeddings that deceive the autoencoder. In this manner, the target feature embeddings become domain-invariant and inherit strong discriminability. Extensive experiments are conducted on multi-domain sentiment classification (the Amazon review dataset) and yes/no question-answering classification (the BoolQ and MARCO datasets). The experimental results validate that EADA largely alleviates the domain discrepancy while maintaining excellent discriminability, achieving state-of-the-art cross-domain TC performance.
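The adversarial interplay described above can be sketched as a pair of losses: the autoencoder (the energy function) is trained to assign low reconstruction energy to source features and high energy to target features, while the feature extractor is trained to lower the energy of the target features it produces. The linear autoencoder, the margin-free loss form, and all tensor shapes below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(z, W_enc, W_dec):
    """Reconstruction energy: mean squared reconstruction error of a
    (toy) linear autoencoder. Low energy = 'looks like source'."""
    recon = (z @ W_enc) @ W_dec
    return float(np.mean(np.sum((z - recon) ** 2, axis=1)))

d, h = 8, 3  # feature and bottleneck dims (illustrative)
W_enc = rng.normal(scale=0.1, size=(d, h))
W_dec = rng.normal(scale=0.1, size=(h, d))

z_src = rng.normal(size=(16, d))            # stand-in source features
z_tgt = rng.normal(loc=2.0, size=(16, d))   # shifted target features

# Autoencoder objective: low energy on source, high energy on target.
loss_ae = energy(z_src, W_enc, W_dec) - energy(z_tgt, W_enc, W_dec)

# Feature-extractor objective: produce target features whose energy is
# low, i.e. that the autoencoder reconstructs as if they were source.
loss_fe = energy(z_tgt, W_enc, W_dec)
```

In practice each loss would be minimized by gradient descent on the respective module's parameters (the autoencoder for `loss_ae`, the shared encoder for `loss_fe`), which is what drives the target embeddings toward the source distribution.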
Citation
Zou, H., Yang, J., & Wu, X. (2021). Unsupervised Energy-based Adversarial Domain Adaptation for Cross-domain Text Classification. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 1208–1218). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.103