Authors: Nicola Cancedda, Samidh Chatterjee
DOI:
Keywords: Computer science, Word error rate, BLEU, Work (physics), Algorithm, Sampling (statistics), Machine translation, Lattice (module), Translation (geometry), Measure (mathematics)
Abstract: Minimum Error Rate Training (MERT) is the algorithm most widely used for log-linear model parameter training in state-of-the-art Statistical Machine Translation systems. In its original formulation, MERT uses N-best lists output by the decoder to grow the candidate pool that shapes the surface on which the actual optimization is performed. Recent work has extended the algorithm to use the entire translation lattice built by the decoder instead of N-best lists. We propose here a third, intermediate way, consisting of growing the pool using samples randomly drawn from the lattice. We empirically measure a systematic improvement in BLEU scores compared to training with N-best lists, without suffering the increase in computational complexity associated with operating on the whole lattice.
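The central idea, drawing candidate translations at random from the decoder's lattice rather than taking an N-best list or enumerating the full lattice, can be illustrated with a small sketch. The snippet below is a hypothetical illustration, not the paper's implementation: it represents the lattice as a DAG with scored edges and samples paths with probability proportional to their exponentiated log-linear path scores via a backward pass; the function name, data layout, and exact sampling distribution are assumptions made for illustration only.

```python
# Hypothetical sketch of sampling translations from a decoder lattice.
# The lattice is a DAG; each edge carries a phrase and a log-linear score.
# A backward pass computes the total (log) mass of paths leaving each node,
# then a forward random walk picks edges proportionally to that mass.
import math
import random
from collections import defaultdict

def sample_paths(edges, start, end, num_samples, seed=0):
    """edges: list of (src_node, dst_node, phrase, log_score) tuples.
    Returns num_samples sampled translations, each a list of phrases."""
    rng = random.Random(seed)
    out = defaultdict(list)                      # outgoing edges per node
    for src, dst, phrase, score in edges:
        out[src].append((dst, phrase, score))

    # Backward pass: log of the summed score of all paths node -> end.
    # (Recursive for brevity; a real lattice would use topological order.)
    beta = {end: 0.0}
    def log_total(node):
        if node not in beta:
            terms = [s + log_total(d) for d, _, s in out[node]]
            m = max(terms)
            beta[node] = m + math.log(sum(math.exp(t - m) for t in terms))
        return beta[node]

    samples = []
    for _ in range(num_samples):
        node, phrases = start, []
        while node != end:
            cands = out[node]
            # Probability of each edge = its score times downstream mass,
            # normalized by the total mass leaving the current node.
            weights = [math.exp(s + log_total(d) - log_total(node))
                       for d, _, s in cands]
            node, phrase, _ = rng.choices(cands, weights=weights)[0]
            phrases.append(phrase)
        samples.append(phrases)
    return samples

# Toy usage: a tiny lattice with two competing first phrases.
edges = [(0, 1, "the cat", -0.5), (0, 1, "a cat", -1.2),
         (1, 2, "sleeps", -0.3)]
print(sample_paths(edges, start=0, end=2, num_samples=3))
```

In a MERT setting, the sampled hypotheses would then be added to the candidate pool in place of (or alongside) an N-best list; since only a fixed number of samples is drawn per sentence, the cost per iteration stays close to the N-best case rather than growing with the size of the lattice.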