Visual Exploration of Semantic Relationships in Neural Word Embeddings

Authors: Shusen Liu, Peer-Timo Bremer, Jayaraman J. Thiagarajan, Vivek Srikumar, Bei Wang

DOI: 10.1109/TVCG.2017.2745141

Abstract: Constructing distributed representations for words through neural language models and using the resulting vector spaces for analysis has become a crucial component of natural language processing (NLP). However, despite their widespread application, little is known about the structure and properties of these spaces. To gain insights into the relationship between words, the NLP community has begun to adapt high-dimensional visualization techniques. In particular, researchers commonly use t-distributed stochastic neighbor embeddings (t-SNE) and principal component analysis (PCA) to create two-dimensional embeddings for assessing the overall structure and exploring linear relationships (e.g., word analogies), respectively. Unfortunately, these techniques often produce mediocre or even misleading results and cannot address domain-specific visualization challenges that are crucial for understanding semantic relationships in word embeddings. Here, we introduce new embedding techniques for visualizing semantic and syntactic analogies, and the corresponding tests to determine whether the resulting views capture salient structures. Additionally, we introduce two novel views for a comprehensive study of analogy relationships. Finally, we augment t-SNE embeddings to convey uncertainty information in order to allow a reliable interpretation. Combined, the different views address a number of domain-specific tasks difficult to solve with existing tools.
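
To make the phrase "linear relationships (e.g., word analogies)" concrete, here is a minimal sketch of the widely used vector-offset analogy test. It is not the authors' visualization method. The three-dimensional vectors and the names `TOY_VECTORS`, `cosine`, and `solve_analogy` are invented for illustration; real embeddings have hundreds of dimensions and are learned from text.

```python
import math

# Toy 3-D vectors, made up for this example only. They are constructed so
# that the "man -> woman" offset equals the "king -> queen" offset.
TOY_VECTORS = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [3.0, 0.0, 1.0],
    "queen": [3.0, 1.0, 1.0],
    "apple": [0.0, 0.2, 3.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, vectors):
    """Return the word d that best completes 'a is to b as c is to d'.

    The target point is vec(b) - vec(a) + vec(c); the answer is the
    remaining word whose vector has the highest cosine similarity to it.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


answer = solve_analogy("man", "woman", "king", TOY_VECTORS)
print(answer)
```

In a real embedding the offsets are only approximately parallel, which is one reason a two-dimensional projection of analogy pairs can be hard to read without additional checks.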
