SketchSegNet: A RNN Model for Labeling Sketch Strokes

Authors: Xingyuan Wu, Yonggang Qi, Jun Liu, Jie Yang

DOI: 10.1109/MLSP.2018.8516988

Keywords: Segmentation, Artificial intelligence, Sequence, Sketch, Image retrieval, Sketch recognition, Pattern recognition, Computer science, Artificial neural network, Interpretation (logic)

Abstract: We investigate the problem of stroke-level sketch segmentation, which is to train machines to assign semantic part labels to the strokes of a given input sketch. Solving sketch segmentation opens the door to fine-grained sketch interpretation, which can benefit many novel sketch-based applications, including sketch recognition and sketch-based image retrieval. In this paper, we treat stroke-level sketch segmentation as a sequence-to-sequence generation problem, and a recurrent neural network (RNN)-based model, SketchSegNet, is presented to translate a sequence of strokes into their semantic part labels. In addition, for the first time, a large-scale stroke-level sketch segmentation dataset is proposed, composed of 57K annotated free-hand human sketches selected from QuickDraw. Experimental results on this dataset show that our approach offers an average accuracy of over 90% for stroke labeling.
