An inclusive society needs to facilitate access to information for all of its members, including citizens with low literacy and with non-native language skills. We present an approach to assess Dutch text complexity on the sentence level and conduct an interpretability analysis to explore the link between neural models and linguistic complexity features. Building on these findings, we develop the first contextual lexical simplification model for Dutch and publish a pilot dataset for evaluation. We go beyond previous work, which primarily targeted lexical substitution, and propose strategies for adjusting the model’s linguistic register to generate simpler candidates. Our results indicate that continual pre-training and multi-task learning with conceptually related tasks are promising directions for ensuring the simplicity of the generated substitutions. Our code repository and the simplification dataset are available on GitHub.
Hobo, E., Pouw, C., & Beinborn, L. (2023). “Geen makkie”: Interpretable Classification and Simplification of Dutch Text Complexity. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 503–517). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.bea-1.42