Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling
Authors: Raphael Rubino; Tommi Pirinen; Miquel Esplà-Gomis; Nikola Ljubešić; Sergio Ortiz Rojas; Vassilis Papavassiliou; Prokopis Prokopidis; Antonio Toral
Book title: Proceedings of the Tenth Workshop on Statistical Machine Translation
This paper presents the machine translation systems submitted by the Abu-MaTran project for the Finnish–English language pair at the WMT 2015 translation task. We tackle the scarcity of resources for Finnish and its complex morphology by (i) crawling parallel and monolingual data from the Web and (ii) applying rule-based and unsupervised methods for morphological segmentation. Several statistical machine translation approaches are evaluated and then combined to obtain our final submissions. These are the top-performing systems for English-to-Finnish in the unconstrained setting (on all automatic metrics) and in the constrained setting (on BLEU), and for Finnish-to-English in the constrained setting (on TER).