Abstract
While various approaches to domain adaptation exist, most of them require knowledge of the target domain and additional data, preferably labeled. For a language like English, these conditions can often be met, but for low-resource languages they present a problem. We explore the situation in which neither data nor other information about the target domain is available. We use two samples of Danish, a low-resource language, from the consumer review domain (film vs. company reviews) in a sentiment analysis task. We observe dramatic performance drops when moving from one domain to the other. We then introduce a simple offline method that makes models more robust to unseen domains, and observe relative improvements of more than 50%.
Cite
Elming, J., Hovy, D., & Plank, B. (2014). Robust Cross-Domain Sentiment Analysis for Low-Resource Languages. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 2–7). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-2602