DLI-IT: a deep learning approach to drug label identification through image and text embedding

Authors: Xiangwen Liu, Joe Meehan, Weida Tong, Leihong Wu, Xiaowei Xu

DOI: 10.1186/S12911-020-1078-3

Keywords: Semantic similarity, Rich Text Format, Standard test image, Cosine similarity, Deep learning, Artificial neural network, Artificial intelligence, Pattern recognition, Computer science, Tesseract, Identification (information)

Abstract: The drug label, or packaging insert, plays a significant role in all operations from production through drug distribution channels to the end consumer. An image of the label, also called the Display Panel, can be used to identify illegal, illicit, unapproved, and potentially dangerous drugs. Because investigation is time-consuming and labor-intensive, an artificial intelligence-based deep learning model is necessary for fast and accurate identification. In addition to image-based technology, we take advantage of the rich text information on pharmaceutical package images. In this study, we developed the Drug Label Identification through Image and Text embedding model (DLI-IT) to model text-based patterns in historical data for the detection of suspicious drugs. In DLI-IT, we first trained a Connectionist Text Proposal Network (CTPN) to crop the raw image into sub-images based on the text. The texts from the cropped sub-images are recognized independently by the Tesseract OCR Engine and combined as one document for each raw image. Finally, we applied universal sentence embedding to transform these documents into vectors and find the reference images most similar to the test image by cosine similarity. We trained DLI-IT on 1749 opioid and 2365 non-opioid drug label images. The model was then tested on 300 external images, and the result demonstrated that our model achieves up to 88% precision in drug label identification, which outperforms the previous method by 35%. To conclude, by combining image and text embedding analysis under a deep learning framework, our DLI-IT approach achieved competitive performance in advancing drug label identification.
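
The final retrieval step described in the abstract (embed each OCR document as a vector, then rank reference labels by cosine similarity to the test label) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a universal sentence encoder, and the label names and OCR strings are hypothetical.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy stand-in for a sentence encoder (assumption): word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_vec, reference_vecs, top_k=1):
    """Return the top_k (name, score) pairs ranked by cosine similarity to the query."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in reference_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical OCR output: one combined document per reference label image.
references = {
    "ref_oxycodone": "oxycodone hydrochloride tablets 10 mg",
    "ref_ibuprofen": "ibuprofen tablets 200 mg",
}
# Hypothetical OCR output for the test image.
query = "oxycodone hydrochloride 10 mg tablets"

all_docs = list(references.values()) + [query]
vocab = sorted({word for doc in all_docs for word in doc.lower().split()})

reference_vecs = {name: embed(doc, vocab) for name, doc in references.items()}
best = most_similar(embed(query, vocab), reference_vecs, top_k=1)
print(best)
```

In a real pipeline the vectors would come from a trained sentence encoder applied to the Tesseract output, so that labels with similar wording but different word order or minor OCR noise still land close together; only the ranking logic above would stay the same.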

References (23)
Tomas Mikolov, Greg S. Corrado, Kai Chen, Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. International Conference on Learning Representations (2013).
Karen Simonyan, Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. Computer Vision and Pattern Recognition (2014).
Jiang Wang, Yang Song, Thomas Leung, Chuck Rosenberg, Jingbin Wang, James Philbin, Bo Chen, Ying Wu. Learning Fine-Grained Image Similarity with Deep Ranking. Computer Vision and Pattern Recognition, pp. 1386-1393 (2014). DOI: 10.1109/CVPR.2014.180
R. Smith. An Overview of the Tesseract OCR Engine. International Conference on Document Analysis and Recognition, vol. 2, pp. 629-633 (2007). DOI: 10.1109/ICDAR.2007.4376991
Suresh Kumar Nagarajan. Content-based Medical Image Annotation and Retrieval using Perceptual Hashing Algorithm. IOSR Journal of Engineering, vol. 02, pp. 814-818 (2012). DOI: 10.9790/3021-0204814818
Ji Wan, Dayong Wang, Steven Chu Hong Hoi, Pengcheng Wu, Jianke Zhu, Yongdong Zhang, Jintao Li. Deep Learning for Content-Based Image Retrieval: A Comprehensive Study. ACM Multimedia, pp. 157-166 (2014). DOI: 10.1145/2647868.2654948
Dimosthenis Karatzas, Lluis Gomez-Bigorda, Anguelos Nicolaou, Suman Ghosh, Andrew Bagdanov, Masakazu Iwamura, Jiri Matas, Lukas Neumann, Vijay Ramaseshan Chandrasekhar, Shijian Lu, Faisal Shafait, Seiichi Uchida, Ernest Valveny. ICDAR 2015 Competition on Robust Reading. International Conference on Document Analysis and Recognition, pp. 1156-1160 (2015). DOI: 10.1109/ICDAR.2015.7333942
Mohit Iyyer, Varun Manjunatha, Jordan Boyd-Graber, Hal Daumé III. Deep Unordered Composition Rivals Syntactic Methods for Text Classification. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), vol. 1, pp. 1681-1691 (2015). DOI: 10.3115/V1/P15-1162