Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation

Authors: Tanguy Urvoy, Stefan Riezler, Artem Sokolov

Abstract: We present an approach to structured prediction from bandit feedback, called Bandit Structured Prediction, where only the value of a task loss function at a single predicted point, instead of a correct structure, is observed in learning. We present an application to discriminative reranking in Statistical Machine Translation (SMT) where the learning algorithm only has access to a 1-BLEU loss evaluation of a predicted translation instead of obtaining a gold standard reference translation. In our experiment, bandit feedback is obtained by evaluating BLEU on reference translations without revealing them to the algorithm. This can be thought of as a simulation of interactive machine translation where an SMT system is personalized by a user who provides single point feedback to predicted translations. Our experiments show that our approach improves translation quality and is comparable to approaches that employ more informative feedback in learning.
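The learning setup described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a log-linear (Gibbs) model over a fixed candidate list, a score-function gradient estimate of the expected loss, and toy loss values standing in for 1-BLEU. All function names (`gibbs_probs`, `bandit_update`, `expected_loss`) and the toy data are hypothetical.

```python
import math
import random


def gibbs_probs(weights, candidates):
    """Log-linear (Gibbs) distribution over candidate feature vectors."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bandit_update(weights, candidates, loss_fn, lr, rng):
    """One bandit step: sample a candidate, observe only its loss, update."""
    probs = gibbs_probs(weights, candidates)
    k = rng.choices(range(len(candidates)), weights=probs)[0]
    loss = loss_fn(k)  # the only feedback: loss of the sampled output
    # Expected feature vector under the current model.
    expected = [
        sum(p * feats[d] for p, feats in zip(probs, candidates))
        for d in range(len(weights))
    ]
    # Stochastic gradient estimate: loss * (phi(sampled) - E[phi]).
    return [
        w - lr * loss * (f - e)
        for w, f, e in zip(weights, candidates[k], expected)
    ]


def expected_loss(weights, candidates, losses):
    """Expected loss of the model; used here only to monitor training."""
    probs = gibbs_probs(weights, candidates)
    return sum(p * l for p, l in zip(probs, losses))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "n-best list": three candidates with indicator features (assumption).
    candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Hidden per-candidate losses standing in for 1-BLEU (assumption).
    hidden_losses = [0.9, 0.5, 0.1]
    weights = [0.0, 0.0, 0.0]
    print("expected loss before:", expected_loss(weights, candidates, hidden_losses))
    for _ in range(5000):
        weights = bandit_update(
            weights, candidates, lambda k: hidden_losses[k], 0.1, rng
        )
    print("expected loss after:", expected_loss(weights, candidates, hidden_losses))
```

The learner never sees which candidate is best or what a reference looks like; it only receives the scalar loss of the one candidate it sampled, which mirrors the single point feedback described in the abstract.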
