Modifying Memories in Transformer Models

Authors: Srinadh Bhojanapalli, Ankit Singh Rawat, Felix Yu, Sanjiv Kumar, Manzil Zaheer

Abstract: … memorization and generalization of Transformers have been widely studied, it is not well known how to make Transformers forget specific old facts and memorize … of a Transformer model …

References (45)
Luke S. Zettlemoyer, Michael Collins. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars. Uncertainty in Artificial Intelligence, pp. 658-666 (2005).
Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith. Retrofitting Word Vectors to Semantic Lexicons. North American Chapter of the Association for Computational Linguistics, pp. 1606-1615 (2015). 10.3115/V1/N15-1184
Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books. International Conference on Computer Vision, pp. 19-27 (2015). 10.1109/ICCV.2015.11
Arthur Szlam, Sainbayar Sukhbaatar, Jason Weston, Rob Fergus. End-to-End Memory Networks. Neural Information Processing Systems, vol. 28, pp. 2440-2448 (2015).
Nanda Kambhatla. Combining Lexical, Syntactic, and Semantic Features with Maximum Entropy Models for Extracting Relations. Meeting of the Association for Computational Linguistics, pp. 22- (2004). 10.3115/1219044.1219066
Dan Roth, Wen-tau Yih. Probabilistic Reasoning for Entity & Relation Recognition. Proceedings of the 19th International Conference on Computational Linguistics, pp. 1-7 (2002). 10.3115/1072228.1072379
John M. Zelle, Raymond J. Mooney. Learning to Parse Database Queries Using Inductive Logic Programming. National Conference on Artificial Intelligence, pp. 1050-1055 (1996).
Adam Kalai, Venkatesh Saligrama, Kai-Wei Chang, Tolga Bolukbasi, James Zou. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Neural Information Processing Systems, vol. 29, pp. 4356-4364 (2016).
James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell. Overcoming Catastrophic Forgetting in Neural Networks. Proceedings of the National Academy of Sciences of the United States of America, vol. 114, pp. 3521-3526 (2017). 10.1073/PNAS.1611835114
Luke Zettlemoyer, Omer Levy, Eunsol Choi, Minjoon Seo. Zero-Shot Relation Extraction via Reading Comprehension. arXiv: Computation and Language (2017).