Authors: Srinadh Bhojanapalli, Ankit Singh Rawat, Felix Yu, Sanjiv Kumar, Manzil Zaheer
DOI:
Keywords:
Abstract: … memorization and generalization of Transformers have been widely studied, it is not well known how to make Transformers forget specific old facts and memorize … of a Transformer model …