UniT: Multimodal Multitask Learning with a Unified Transformer

Authors: Ronghang Hu, Amanpreet Singh

DOI:

Keywords:

Abstract: We propose UniT, a Unified Transformer model to simultaneously learn the most prominent tasks across different domains, ranging from object detection to natural language …

References (59)
Priyanka Agrawal, Subhojeet Pramanik, Aman Hussain. OmniNet: A unified architecture for multi-modal multi-task learning. arXiv preprint, 2019.
Pengfei Liu, Xipeng Qiu, Xuanjing Huang. Adversarial Multi-task Learning for Text Classification. Annual Meeting of the Association for Computational Linguistics (ACL), vol. 1, pp. 1–10, 2017. DOI: 10.18653/V1/P17-1001
Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He. Non-local Neural Networks. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7794–7803, 2018. DOI: 10.1109/CVPR.2018.00813
Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick. Mask R-CNN. IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988, 2017. DOI: 10.1109/ICCV.2017.322
Victor Sanh, Thomas Wolf, Sebastian Ruder. A Hierarchical Multi-Task Approach for Learning Embeddings from Semantic Tasks. AAAI Conference on Artificial Intelligence, vol. 33, pp. 6949–6956, 2019. DOI: 10.1609/AAAI.V33I01.33016949
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. International Conference on Learning Representations (ICLR), 2018.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. North American Chapter of the Association for Computational Linguistics (NAACL-HLT), pp. 4171–4186, 2019. DOI: 10.18653/V1/N19-1423
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin. Attention Is All You Need. Advances in Neural Information Processing Systems (NeurIPS), vol. 30, pp. 5998–6008, 2017.
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang. SQuAD: 100,000+ Questions for Machine Comprehension of Text. Empirical Methods in Natural Language Processing (EMNLP), pp. 2383–2392, 2016. DOI: 10.18653/V1/D16-1264
Fisher Yu, Vladlen Koltun. Multi-Scale Context Aggregation by Dilated Convolutions. International Conference on Learning Representations (ICLR), 2016.