Bilingual documentation has become common in many official institutions and private companies. In this setting, the categorization of bilingual text is a useful tool that can also be applied in the field of machine translation. Different approaches to this classification task are proposed. On the one hand, two finite-state transducer algorithms from the grammatical inference domain are discussed. On the other hand, the well-known naive Bayes approximation is presented, along with a possible model based on n-gram language models. Experiments carried out on a bilingual corpus demonstrate the adequacy of these methods and the relevance of a second information source in text classification, as supported by classification error rates. A relative reduction of 29% was obtained with respect to the best previous results on the monolingual version of the same task. © Springer-Verlag Berlin Heidelberg 2005.
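As a rough illustration of the naive Bayes idea mentioned in the abstract, the sketch below classifies a (source, target) sentence pair by combining a class prior with one language model per language, i.e. p(c | x, y) ∝ p(c) p(x | c) p(y | c). It is not the authors' implementation: the class name `BilingualNaiveBayes`, the add-one smoothed unigram models (the simplest n-gram case, n = 1), the assumption that both sides are independent given the class, and the toy sentences are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict


class BilingualNaiveBayes:
    """Toy classifier (hypothetical): p(c | x, y) is proportional to p(c) * p(x | c) * p(y | c)."""

    def __init__(self):
        self.class_counts = Counter()
        # One word-count table per (class, side); side 0 = source, side 1 = target.
        self.word_counts = defaultdict(Counter)
        self.vocab = [set(), set()]

    def fit(self, pairs, labels):
        for (src, tgt), label in zip(pairs, labels):
            self.class_counts[label] += 1
            for side, text in enumerate((src, tgt)):
                tokens = text.lower().split()
                self.word_counts[(label, side)].update(tokens)
                self.vocab[side].update(tokens)
        return self

    def _log_likelihood(self, text, label, side):
        # Add-one smoothed unigram model for one language and one class.
        counts = self.word_counts[(label, side)]
        total = sum(counts.values())
        size = len(self.vocab[side]) + 1  # +1 reserves mass for unseen words
        return sum(
            math.log((counts[tok] + 1) / (total + size))
            for tok in text.lower().split()
        )

    def predict(self, src, tgt):
        n = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            scores[label] = (
                math.log(count / n)                    # class prior
                + self._log_likelihood(src, label, 0)  # source-language evidence
                + self._log_likelihood(tgt, label, 1)  # target-language evidence
            )
        return max(scores, key=scores.get)


# Invented toy data, for illustration only.
pairs = [
    ("the room has a sea view", "la habitación tiene vistas al mar"),
    ("please call a taxi for me", "por favor pida un taxi para mí"),
]
labels = ["room", "transport"]

model = BilingualNaiveBayes().fit(pairs, labels)
print(model.predict("a room with a view", "una habitación con vistas"))
print(model.predict("call a taxi", "pida un taxi"))
```

Dropping the target-language term from `predict` gives a monolingual classifier, which is one simple way to see what the second information source contributes. Higher-order n-gram models would replace the unigram likelihood without changing the overall decision rule.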
Civera, J., Cubel, E., Juan, A., & Vidal, E. (2005). Different approaches to bilingual text classification based on grammatical inference techniques. In Lecture Notes in Computer Science (Vol. 3523, pp. 630–637). Springer Verlag. https://doi.org/10.1007/11492542_77