Using parallel text for the extraction of German multiword expressions

  • Fritzinger F

Abstract

A procedure for the identification of semantically opaque (i.e. idiomatic) German multiword expressions is presented. We focus on lexicographically relevant verb + PP combinations, extracted via dependency parsing [Schiehlen 2003], of the kind ins Leben rufen (“to initiate”, lit. “to call into life”). Building on [Villada Moirón and Tiedemann 2006], the method exploits the fact that opaque combinations are translated as a whole, whereas compositional uses show regular, individual translations of the words involved. The translations into other languages are obtained by applying GIZA++ [Och and Ney 2003] word alignment to the EUROPARL corpus [Koehn 2005]. Numerous experiments are performed to further optimise the original method: several parameters are analysed individually and in combination with each other. Depending on the parameter settings, the uninterpolated average precision among the 200 highest-scoring multiword candidates ranges from 0.800 to 0.936, compared with a baseline of 0.584 obtained by ranking the 200 most frequent multiwords in decreasing order of occurrence frequency.
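The evaluation measure named in the abstract, uninterpolated average precision, can be illustrated with a minimal sketch. It assumes the common definition: the mean of precision-at-rank values taken at each rank where a true multiword expression occurs. The function name and the boolean input format are hypothetical and not taken from the article.

```python
from typing import Sequence


def uninterpolated_average_precision(ranked_labels: Sequence[bool]) -> float:
    """Average of precision@k over every rank k that holds a true positive.

    ranked_labels: candidate list in ranked order (best first); True marks a
    candidate judged to be a genuine (opaque) multiword expression.
    This input format is an assumption made for illustration.
    """
    hits = 0              # true positives seen so far
    precision_sum = 0.0   # running sum of precision at each hit
    for rank, is_true in enumerate(ranked_labels, start=1):
        if is_true:
            hits += 1
            precision_sum += hits / rank
    # By convention here, a list without any true positive scores 0.0.
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    # Toy ranking: hits at ranks 1 and 3, so (1/1 + 2/3) / 2.
    print(uninterpolated_average_precision([True, False, True]))
```

Under this definition, a ranking that places true expressions near the top scores higher than one with the same number of hits placed lower, which is why the measure suits comparing a ranked candidate list against a frequency-sorted baseline.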

Fritzinger, F. (2010). Using parallel text for the extraction of German multiword expressions. Lexis, (4). https://doi.org/10.4000/lexis.564
