Induction of rules from parallel corpus
Date
2008
Author
Zaremba, Mindaugas
Laukaitis, Algirdas
Vasilecas, Olegas
Abstract
This paper considers approaches to translation between English and morphologically rich languages. We consider all Web-available linguistic resources for this task and integrate them into one comprehensive statistical model. Syntax parsers, bilingual and semantic dictionaries, a bilingual parallel corpus, and a monolingual Web-based corpus are taken into account. A multi-abstraction language representation is used for the statistical induction of syntactic and semantic transformation rules, called multi-alignment templates. The decoding model is described using feature functions and a log-linear modeling approach. The approach is evaluated on the Lithuanian-English language pair. The experimental results demonstrate that the multi-abstraction approach and the hybridization of learning methods can improve translation quality. All resources presented in this paper are available at www.vvam.lt.