Authors: Lucia Specia, Chris Callison-Burch, Christof Monz, Matt Post, Radu Soricut
DOI:
Keywords:
Abstract: This paper presents the results of the WMT12 shared tasks, which included a translation task, a task for machine translation evaluation metrics, and a task for run-time estimation of machine translation quality. We conducted a large-scale manual evaluation of 103 machine translation systems submitted by 34 teams. We used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 12 evaluation metrics. We introduced a new quality estimation task this year, and evaluated submissions from 11 teams.