Incorporating linguistic information to statistical word-level alignment

Abstract

Alignment algorithms enrich parallel texts by establishing a relationship between the structures of the languages involved. Depending on the alignment level, the source-language content and its translation can be aligned at the level of paragraphs, sentences, or words. There are two main approaches to word-level alignment: statistical and linguistic. Because the grammar rules of the two languages differ, statistical algorithms usually give lower precision. For this reason, alignment algorithms are generally developed for a specific language pair using linguistic techniques. This paper presents a hybrid alignment system that combines the two traditional approaches. It provides user-friendly configuration and is adaptable to the computational environment. The system uses linguistic resources and procedures such as identification of cognates, morphological information, syntactic trees, dictionaries, and semantic domains. We show that the system outperforms existing algorithms. © 2009 Springer-Verlag Berlin Heidelberg.
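The abstract does not say how the linguistic cues are combined with the statistical scores, so the following is only an illustrative sketch of the general idea, not the authors' method. It scores cognate candidates with the longest common subsequence ratio (a common string-similarity heuristic, chosen here as an assumption) and blends that score with a statistical translation probability. The function names and the `cognate_weight` parameter are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """Longest common subsequence ratio, a value in [0, 1]."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def combined_score(stat_prob: float, src: str, tgt: str,
                   cognate_weight: float = 0.3) -> float:
    """Hypothetical linear blend of a statistical translation probability
    with the cognate similarity of the two word forms."""
    return (1 - cognate_weight) * stat_prob + cognate_weight * lcsr(src, tgt)
```

With this blend, a word pair such as "nation" and "nación" gets a boost over a pair with the same statistical probability but no orthographic overlap. A real system would tune the weight on held-out aligned data and add the other resources the abstract lists.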

Citation (APA)

Cendejas, E., Barceló, G., Gelbukh, A., & Sidorov, G. (2009). Incorporating linguistic information to statistical word-level alignment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5856 LNCS, pp. 387–394). https://doi.org/10.1007/978-3-642-10268-4_46
