Publication - Domain Adaptation of Statistical Machine Translation using Web-Crawled Resources: A Case Study

Type: Article in proceedings

 

Year: 2012
Authors: Pavel Pecina; Antonio Toral; Βασίλης Παπαβασιλείου; Προκόπης Προκοπίδης; Josef van Genabith
Book title: Proceedings of the 16th Annual Conference of the European Association for Machine Translation
Address: Trento, Italy
Abstract:
We tackle the problem of domain adaptation of Statistical Machine Translation by exploiting domain-specific data acquired by domain-focused web-crawling. We design and evaluate a procedure for automatic acquisition of monolingual and parallel data and their exploitation for training, tuning, and testing in a phrase-based Statistical Machine Translation system. We present a strategy for using such resources depending on their availability and quantity, supported by the results of a large-scale evaluation on the domains of Natural Environment and Labour Legislation and two language pairs: English-French and English-Greek. The average observed increase in BLEU is substantial, at 49.5% relative.