Transforming neural network visual representations to predict human judgments of similarity

Authors: Michael C. Mozer, Brett D. Roads, Maria Attarian

Abstract: Deep-learning vision models have shown intriguing similarities and differences with respect to human vision. We investigate how to bring machine visual representations into better alignment with human representations. Human representations are often inferred from behavioral evidence such as the selection of an image most similar to a query image. We find that with appropriate linear transformations of deep embeddings, we can improve prediction of human binary choice on a data set of bird images from 72% at baseline to 89%. We hypothesized that deep embeddings have redundant, high (4096) dimensional representations; however, reducing the rank of these representations results in a loss of explanatory power. We hypothesized that the dilation transformation of representations explored in past research is too restrictive, and indeed we found that model explanatory power can be significantly improved with a more expressive linear transform. Most surprising and exciting, we found that, consistent with classic psychological literature, human similarity judgments are asymmetric: the similarity of X to Y is not necessarily equal to the similarity of Y to X, and allowing models to express this asymmetry improves explanatory power.
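
The contrast the abstract draws between a dilation and a more expressive linear transform can be illustrated with a small sketch. It assumes a bilinear similarity s(x, y) = x^T W y and a logistic rule for the binary choice. Both are illustrative assumptions and may not match the authors' exact model, and the function names, toy embeddings, and matrices below are hypothetical. A diagonal W (a dilation) always yields symmetric similarities, while an unconstrained W can yield s(x, y) != s(y, x).

```python
import math


def bilinear_similarity(x, y, W):
    """Return s(x, y) = x^T W y for list vectors x, y and a nested-list matrix W."""
    return sum(
        x[i] * W[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def choice_probability(query, ref_a, ref_b, W):
    """Probability that ref_a is chosen over ref_b as more similar to query.

    Assumed choice rule: a logistic function of the similarity difference.
    """
    s_a = bilinear_similarity(query, ref_a, W)
    s_b = bilinear_similarity(query, ref_b, W)
    return 1.0 / (1.0 + math.exp(s_b - s_a))


# Toy 2-D embeddings (hypothetical; the paper's embeddings are 4096-dimensional).
x = [1.0, 0.0]
y = [0.0, 1.0]

# Dilation: a diagonal matrix that only rescales each dimension.
W_dilation = [[2.0, 0.0], [0.0, 0.5]]
# Unconstrained, non-symmetric matrix: can express asymmetric similarity.
W_full = [[1.0, 0.8], [0.2, 1.0]]

sym_xy = bilinear_similarity(x, y, W_dilation)
sym_yx = bilinear_similarity(y, x, W_dilation)
asym_xy = bilinear_similarity(x, y, W_full)
asym_yx = bilinear_similarity(y, x, W_full)

# Probability of picking x itself over y when the query is x.
p_self = choice_probability(x, x, y, W_full)
```

In practice W would be fit to the observed human choices, for example by maximizing the likelihood of the chosen images. That fitting step is omitted here.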
