Selective harvesting over networks

Authors: Fabricio Murai, Diogo Rennó, Bruno Ribeiro, Gisele L. Pappa, Don Towsley

DOI: 10.1007/s10618-017-0523-0

Keywords: Machine learning; Artificial intelligence; Computer science; Data mining; Network topology; Duality (optimization); Network size; Stochastic process; Tunnel vision; Pairwise comparison; Classifier (UML); Budget constraint

Abstract: Active search on graphs focuses on collecting certain labeled nodes (targets) given global knowledge of the network topology and its edge weights (encoding pairwise similarities) under a query budget constraint. However, in most current networks, nodes, topology, and size are all initially unknown. In this work we introduce selective harvesting, a variant of active search where the next node to be queried must be chosen among the neighbors of the currently queried node set; the available training data for deciding which node to query is restricted to the subgraph induced by the queried set (and their attributes) together with its observed neighbors (without any node or edge attributes). Therefore, selective harvesting is a sequential decision problem, where we must decide which node to query at each step. A classifier trained in this scenario can suffer from what we call the tunnel vision effect: without recourse to independent sampling, the urge to query only promising nodes forces classifiers to gather increasingly biased training data, which we show significantly hurts the performance of active search methods and standard classifiers. We demonstrate that it is possible to collect a much larger set of targets by using multiple classifiers, not by combining their predictions as a weighted ensemble, but by switching between the classifiers used at each step, as a way to ease the tunnel vision effect. We discover that switching classifiers collects more targets by (a) diversifying the training data and (b) broadening the choices of nodes that can be queried in the future. This highlights an exploration, exploitation, and diversification trade-off in our problem that goes beyond the exploration and exploitation duality found in classic sequential decision problems. Based on these observations we propose D$$^3$$TS, a method based on multi-armed bandits for non-stationary stochastic processes that enforces classifier diversity; it outperforms competing methods on five real network datasets in our evaluation and exhibits comparable performance on the other two.
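To make the query loop concrete, below is a minimal, hypothetical sketch of selective harvesting with bandit-driven classifier switching on a toy random graph. The heuristic scorers, the plain Beta-Bernoulli Thompson sampler, and all names (adj, labels, budget, etc.) are illustrative assumptions introduced here; this is not the authors' D$$^3$$TS implementation, which relies on bandit algorithms designed for non-stationary payoffs.

```python
"""Toy sketch of selective harvesting with classifier switching.

Assumptions: heuristic 'classifiers' (scorers), a stationary Beta-Bernoulli
Thompson sampler, and a synthetic random graph. Not the authors' D^3TS.
"""
import random

random.seed(42)

# --- Toy network: random adjacency list with hidden binary labels. --------
N = 300
adj = {v: set() for v in range(N)}
for v in range(N):
    for u in random.sample(range(N), 3):
        if u != v:
            adj[v].add(u)
            adj[u].add(v)
labels = {v: int(random.random() < 0.15) for v in range(N)}  # 1 = target

# --- Candidate scorers ("classifiers") over the observed subgraph only. ---
def frac_target_neighbors(v, queried, found):
    known = [u for u in adj[v] if u in queried]
    return sum(u in found for u in known) / len(known) if known else 0.0

def observed_degree(v, queried, found):
    return sum(u in queried for u in adj[v]) / len(queried)

def random_score(v, queried, found):
    return random.random()

scorers = [frac_target_neighbors, observed_degree, random_score]
alpha = [1.0] * len(scorers)  # Beta posterior parameters, one arm per scorer
beta = [1.0] * len(scorers)

# --- Selective harvesting loop: query only neighbors of the queried set. --
seed = next(v for v in range(N) if labels[v] == 1)
queried, found = {seed}, {seed}
border = set(adj[seed])

budget = 100
for _ in range(budget):
    if not border:
        break
    # Thompson sampling: switch to the scorer with the highest sampled rate.
    draws = [random.betavariate(a, b) for a, b in zip(alpha, beta)]
    k = max(range(len(scorers)), key=lambda i: draws[i])
    # Query the border node that the chosen scorer ranks highest.
    v = max(border, key=lambda u: scorers[k](u, queried, found))
    reward = labels[v]            # payoff: did this query hit a target?
    alpha[k] += reward
    beta[k] += 1 - reward
    queried.add(v)
    if reward:
        found.add(v)
    border |= adj[v] - queried    # newly revealed neighbors become candidates
    border.discard(v)

print(f"Queried {len(queried) - 1} nodes, found {len(found) - 1} new targets.")
```

Note the design choice mirrored from the abstract: at each step one scorer alone picks the next query (rather than averaging all scores into a weighted ensemble), which is what diversifies both the training data and the border of nodes reachable in future steps.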
