Computational historical linguistics and language diversity in South Asia

Aryaman Arora; Adam Farris; Samopriya Basu; Suresh Kolichala

Conference Proceedings (Open Access)

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2022) 1 1396-1409

DOI: 10.18653/v1/2022.acl-long.99

Abstract

South Asia is home to a plethora of languages, many of which severely lack access to new language technologies. This linguistic diversity also results in a research environment conducive to the study of comparative, contact, and historical linguistics, fields which necessitate the gathering of extensive data from many languages. We claim that data scatteredness (rather than scarcity) is the primary obstacle in the development of South Asian language technology, and suggest that the study of language history is uniquely aligned with surmounting this obstacle. We review recent developments in and at the intersection of South Asian NLP and historical-comparative linguistics, describing our and others' current efforts in this area. We also offer new strategies towards breaking the data barrier.

Cite

Arora, A., Farris, A., Basu, S., & Kolichala, S. (2022). Computational historical linguistics and language diversity in South Asia. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 1396–1409). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.99