Authors: Pieter Abbeel, Ali Punjani, Karthik Narayan
DOI:
Keywords:
Abstract: Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization (Yang et al., 2013; Bunte et al., 2012), little work discusses the direct effects that divergence measures have on visualization. We study this relationship, theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macrostructures (categories, e.g. birds) or microstructures (fine-grained classes, e.g. toucans). Our method, which generalizes t-SNE (van der Maaten, 2008), allows us to discover such structure without extensive grid searches over (α, β) due to our theoretical analysis: such structure is apparent with particular choices of (α, β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes, which are non-trivial due to the tree-structures employed in optimization and the large datasets that do not fully fit into memory. Our method runs 20x faster than the fastest published code (Vladymyrov & Carreira-Perpinan, 2014). We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest visualization attained by SNE methods. We have open-sourced our code: http://rll.berkeley.edu/absne/.
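The abstract refers to the generalized alpha-beta divergence without stating it. As a rough illustration only, the sketch below assumes the standard AB-divergence from the divergence literature, D(P‖Q) = -1/(αβ) Σ (p^α q^β - α/(α+β) p^(α+β) - β/(α+β) q^(α+β)), valid when α, β, and α+β are all nonzero. The function names `ab_divergence` and `kl_divergence` are hypothetical and are not taken from the authors' released code, and the exact form used in the paper may differ.

```python
import numpy as np

def ab_divergence(p, q, alpha, beta):
    """Alpha-beta divergence between strictly positive vectors p and q.

    Assumed form (not quoted from the paper); requires alpha, beta and
    alpha + beta to be nonzero. The limiting cases are not handled here.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    s = alpha + beta
    if alpha == 0 or beta == 0 or s == 0:
        raise ValueError("alpha, beta and alpha + beta must be nonzero")
    # Per-element terms of the sum in the assumed definition.
    terms = p**alpha * q**beta - (alpha / s) * p**s - (beta / s) * q**s
    return -terms.sum() / (alpha * beta)

def kl_divergence(p, q):
    """Kullback-Leibler divergence, the objective that t-SNE minimizes."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

if __name__ == "__main__":
    p = np.array([0.2, 0.3, 0.5])
    q = np.array([0.4, 0.4, 0.2])
    # With alpha = 1 and beta close to 0, the value approaches KL(p || q)
    # for normalized distributions.
    print(ab_divergence(p, q, 1.0, 1e-6), kl_divergence(p, q))
    # With alpha = beta = 1 it reduces to half the squared Euclidean distance.
    print(ab_divergence(p, q, 1.0, 1.0), 0.5 * np.sum((p - q) ** 2))
```

Under this assumed definition, setting α = 1 and letting β approach 0 recovers the KL divergence, which is one way to read the statement that the method generalizes t-SNE; other (α, β) choices weight large and small similarities differently.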