Authors: Tanguy Urvoy, Stefan Riezler, Artem Sokolov
DOI:
Keywords:
Abstract: We present an approach to structured prediction from bandit feedback, called Bandit Structured Prediction, where only the value of a task loss function at a single predicted point, instead of a correct structure, is observed in learning. We present an application to discriminative reranking in Statistical Machine Translation (SMT), where the learning algorithm only has access to a 1-BLEU loss evaluation of a predicted translation instead of obtaining a gold standard reference translation. In our experiment, bandit feedback is obtained by evaluating BLEU on reference translations without revealing them to the algorithm. This can be thought of as a simulation of interactive machine translation, where an SMT system is personalized by a user who provides single-point feedback on predicted translations. Our experiments show that our approach improves translation quality and is comparable to approaches that employ more informative feedback.
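
The learning setup described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a log-linear (softmax) model over a fixed candidate list, as in reranking, and a score-function gradient estimate of the expected loss, neither of which the abstract specifies. The names `bandit_update`, `train`, and `expected_loss` are hypothetical, and the fixed `losses` list is a stand-in for the 1-BLEU feedback on a sampled translation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def model_probs(weights, candidates):
    """Probability of each candidate under the assumed log-linear model."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in candidates]
    return softmax(scores)


def bandit_update(weights, candidates, loss_fn, lr, rng):
    """One bandit step: sample one candidate, observe only its loss, update."""
    probs = model_probs(weights, candidates)
    # Predict a single structure by sampling from the model distribution.
    k = rng.choices(range(len(candidates)), weights=probs)[0]
    # The only feedback: the task loss of the sampled point (stand-in for 1-BLEU).
    loss = loss_fn(k)
    # Expected feature vector under the current model.
    expected = [
        sum(p * feats[j] for p, feats in zip(probs, candidates))
        for j in range(len(weights))
    ]
    # Score-function gradient estimate: loss * (phi(sampled) - E[phi]).
    return [
        w - lr * loss * (candidates[k][j] - expected[j])
        for j, w in enumerate(weights)
    ]


def train(candidates, losses, steps=3000, lr=0.1, seed=0):
    """Run repeated bandit updates; `losses` is never shown to the update as a whole."""
    rng = random.Random(seed)
    weights = [0.0] * len(candidates[0])
    for _ in range(steps):
        weights = bandit_update(weights, candidates, lambda k: losses[k], lr, rng)
    return weights


def expected_loss(weights, candidates, losses):
    """Expected task loss of the model distribution (for evaluation only)."""
    probs = model_probs(weights, candidates)
    return sum(p * l for p, l in zip(probs, losses))
```

As a usage example, calling `train` on three candidates with one-hot feature vectors and toy losses such as `[0.9, 0.2, 0.6]`, then comparing `expected_loss` before and after training, shows whether the model has shifted probability toward the low-loss candidate. Note that the learner never sees which candidate is best, only the loss of the one it sampled at each step.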