Automatic F-term Classification of Japanese Patent Documents Using the k-Nearest Neighborhood Method and the SMART Weighting

Authors: Masaki Murata, Toshiyuki Kanamaru, Tamotsu Shirado, Hitoshi Isahara

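Abstract: Patent processing is important in various fields such as industry, business, and law. We used F-terms (Schellner 2002) to classify patent documents using the k-nearest neighborhood method. Because F-term categories are fine-grained, they are useful when we classify patent documents. We clarified the following three points using experiments: i) which variations of the k-nearest neighborhood method are the best for patent classification, ii) which methods of calculating similarity are the best for patent classification, and iii) from which regions of a patent terms should be extracted. In our experiments, we used the patent data from the F-term categorization task in the NTCIR-5 Patent Workshop (NTCIR committee 2005; Iwayama, Fujii, and Kando 2005). We found that the method of adding the scores of the k extracted documents was the most effective among the variations of the k-nearest neighborhood method used in this study. We also found that SMART (Singhal, Buckley, and Mitra 1996; Singhal, Choi, Hindle, and Pereira 1997), which is known to be effective in information retrieval, was the most effective method of calculating similarity. Finally, when extracting terms, we found that using the abstract and claim regions together was the best among all combinations of the abstract, claim, and description regions. The results were confirmed using a statistical test. Moreover, we experimented with changing the amount of training data and found that we obtained better performance when we used more data, which was limited to that provided in the NTCIR-5 Patent Workshop.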
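The score-adding variant of the k-nearest neighborhood method mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each category's score is the sum of the similarities of those k nearest training documents that carry the category. Plain cosine similarity over raw term counts stands in for the SMART weighting, which is not reproduced here. The function names, the toy documents, and the category labels (`A01`, `B02`, `C03`) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_score_sum(query_terms, training, k=3):
    """Rank categories by summing similarities over the k nearest documents.

    training: list of (terms, categories) pairs (hypothetical input format).
    Cosine similarity is a stand-in here, not the SMART weighting.
    """
    query = Counter(query_terms)
    sims = [(cosine(query, Counter(terms)), cats) for terms, cats in training]
    nearest = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]

    scores = defaultdict(float)
    for sim, cats in nearest:
        for cat in cats:
            scores[cat] += sim  # add this neighbor's similarity to each of its categories
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: made-up terms and category labels, for illustration only.
training = [
    (["battery", "electrode", "lithium"], {"A01"}),
    (["battery", "charger", "circuit"], {"A01", "B02"}),
    (["engine", "piston", "valve"], {"C03"}),
]
ranked = knn_score_sum(["lithium", "battery", "cell"], training, k=2)
```

Here `ranked` lists `A01` first because both of the two nearest documents carry it, so their similarities add up. A simple vote would count each neighbor equally regardless of how similar it is to the query.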

References

Makoto Iwayama, Noriko Kando, Atsushi Fujii. Overview of Classification Subtask at NTCIR-5 Patent Retrieval Task. NTCIR (2005).

Seishi Okamoto, Ken Satoh. An Average-Case Analysis of k-Nearest Neighbor Classifier. International Conference on Case-Based Reasoning, pp. 253-264 (1995). doi:10.1007/3-540-60598-3_23

Gongde Guo, Hui Wang, David Bell, Yaxin Bi, Kieran Greer. An kNN Model-based Approach and its Application in Text Categorization. Conference on Intelligent Text Processing and Computational Linguistics, pp. 559-570 (2004). doi:10.1007/978-3-540-24630-5_69

Nello Cristianini, John Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods (2000).

Amit Singhal, Chris Buckley, Mandar Mitra. Pivoted document length normalization. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 51, pp. 21-29 (1996). doi:10.1145/3130348.3130365

Yiming Yang, Xin Liu. A re-examination of text categorization methods. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 42-49 (1999). doi:10.1145/312624.312647

C. J. Fall, A. Törcsvári, K. Benzineb, G. Karetka. Automated categorization in the international patent classification. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 37, pp. 10-25 (2003). doi:10.1145/945546.945547

Makoto Iwayama, Atsushi Fujii, Noriko Kando, Yozo Marukawa. Evaluating patent retrieval in the third NTCIR workshop. Information Processing & Management, vol. 42, pp. 207-221 (2006). doi:10.1016/J.IPM.2004.08.012

Leah S. Larkey. A patent search and classification system. ACM International Conference on Digital Libraries, pp. 179-187 (1999). doi:10.1145/313238.313304

Irene Schellner. Japanese File Index classification and F-terms. World Patent Information, vol. 24, pp. 197-201 (2002). doi:10.1016/S0172-2190(02)00019-4
