This article describes the creation and distribution of the first publicly available word embeddings for Portuguese. The embeddings are evaluated on their own and also compared with the original English models on a well-known analogy task. To this end, we gathered a large Portuguese corpus of 1.7 billion tokens, developed the first distributional semantic analogies test set for Portuguese, and carried out the first parametrization and evaluation of Portuguese word embedding models.
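To make the analogy evaluation mentioned above concrete, the following is a minimal sketch of how an analogy question of the form "a is to b as c is to ?" is commonly scored with word vectors, using the widely used vector-offset approach with cosine similarity. It is not taken from the article: the function names `solve_analogy` and `analogy_accuracy`, the toy three-dimensional vectors, and the Portuguese example words are all assumptions chosen for illustration, and the article's actual models and test set may be evaluated differently.

```python
import numpy as np


def solve_analogy(vectors, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'.

    Hypothetical helper: uses the vector-offset method (b - a + c) and
    picks the remaining word with the highest cosine similarity.
    """
    # Normalize each vector so that a dot product equals cosine similarity
    # up to a constant factor that does not affect the ranking.
    unit = {w: v / np.linalg.norm(v) for w, v in vectors.items()}
    target = unit[b] - unit[a] + unit[c]

    best_word, best_score = None, -np.inf
    for word, vec in unit.items():
        if word in (a, b, c):
            continue  # the question words are excluded from the candidates
        score = float(np.dot(vec, target))
        if score > best_score:
            best_word, best_score = word, score
    return best_word


def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(
        solve_analogy(vectors, a, b, c) == expected
        for a, b, c, expected in questions
    )
    return correct / len(questions)


# Toy vectors invented for this sketch; real embeddings have hundreds of
# dimensions and are learned from a corpus.
toy_vectors = {
    "homem": np.array([1.0, 0.0, 0.0]),
    "mulher": np.array([1.0, 1.0, 0.0]),
    "rei": np.array([1.0, 0.0, 1.0]),
    "rainha": np.array([1.0, 1.0, 1.0]),
    "casa": np.array([0.0, 0.2, -1.0]),
}

toy_questions = [("homem", "mulher", "rei", "rainha")]
accuracy = analogy_accuracy(toy_vectors, toy_questions)
```

In this sketch, accuracy is simply the share of questions whose top-ranked candidate matches the expected word, which is the usual way results on analogy test sets are reported.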
CITATION STYLE
Rodrigues, J., Branco, A., Neale, S., & Silva, J. (2016). LX-DSemvectors: Distributional semantics models for Portuguese. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9727, pp. 259–270). Springer Verlag. https://doi.org/10.1007/978-3-319-41552-9_27