Structure and Dynamics of Research Collaboration in Computer Science.

Authors: Premkumar T. Devanbu, Vladimir Filkov, Zhendong Su, Christian Bird, Andre Nash

DOI:

Keywords:

Abstract: Complex systems exhibit emergent patterns of behavior at different levels of organization. Powerful network analysis methods, developed in physics and social sciences, have been successfully used to tease out patterns that relate to community structure and network dynamics. In this paper, we mine the complex network of collaboration relationships in computer science, and adapt these network analysis methods to study collaboration and interdisciplinary research at the individual, within-area and network-wide levels. We start with a collaboration graph extracted from the DBLP bibliographic database and use extrinsic data to define research areas within computer science. Using topological measures on the collaboration graph, we find significant differences in the behavior of individuals among areas based on their collaboration patterns. We use community structure analysis, betweenness centralization, and longitudinal assortativity as metrics within each area to determine how centralized, integrated, and cohesive they are. Of special interest is how research areas change with time. We longitudinally examine the area overlap and migration patterns of authors, and empirically confirm some computer science folklore. We also examine the degree to which the research areas and their key conferences are interdisciplinary. We find that data mining and software engineering are very interdisciplinary while theory and cryptography are not. Specifically, it appears that SDM and ICSE attract authors who publish in many areas while FOCS and STOC do not. We also examine isolation both within and between areas. One interesting discovery is that cryptography is highly isolated within the larger computer science community, but densely interconnected within itself.

1 Background and Motivation

Computer science is a diverse and growing area of scholarly activity, with many subareas, such as artificial intelligence (AI), computational biology (CBIO), cryptography (CRYPTO), databases (DB), graphics (GRAPH), programming languages (PL), software engineering (SE), security (SEC), theory (THEORY), and others. Some areas are quite old, rooted in the earliest stirrings of the field (e.g., THEORY), while others started much later (e.g., GRAPH). Some are large, attracting a large number of researchers (e.g., DB and GRAPH), while others are smaller (e.g., CRYPTO and SE). Some are in a stable phase (e.g., THEORY); others are growing rapidly (e.g., SEC). There are other, more subtle differences in the character and style of these areas. These differences, although currently not rigorously quantified, nevertheless may have important implications for the future of the areas; they are recognized by researchers working in the respective (or closely allied) areas, but have not been studied.
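For example, some areas are considered intellectually unified, while others are said to include several distinct, thriving groups. Some areas tend to interact strongly with others, with a tradition of mutual enrichment, while others are more stand-alone. Some areas are dominated by a few researchers, while others have a more diffuse collaborative structure. In some areas, older and younger researchers frequently collaborate, while in others researchers collaborate primarily with others like them. These informal, folkloric differences are worthy of study, because these properties clearly can have a strong influence on the intellectual vibrancy and diversity of an area. We begin to quantify these differences and produce data that can provide “actionable intelligence” to interested parties. Researchers (students, new faculty) might well consider these factors when deciding whether to enter (or leave) an area. Funding agencies (industries, government agencies, and foundations) might consider the status of a field and choose to formulate Broad Area Announcements (BAAs) or Calls for Proposals to help an area become more interdisciplinary, or more diverse, or to spread funding more broadly to increase the number of centers of influence; contrariwise, they could design initiatives to reverse these trends if that seems appropriate.

How can we put this study on a sounder, more quantitatively rigorous footing? We claim that the solution lies in the range of quantitative network analysis methods developed in statistical physics. We use these methods to analyze collaboration networks and identify areas that are more fragmented, dominated by fewer researchers, and so on. We use two classes of metrics: (1) one class characterizes the collaboration styles of individuals, and (2) the other characterizes the entire network of all researchers in an area. Both experimental studies lead to observations that match some beliefs and intuitions about the fields, and to others that indicate surprising, and perhaps worrisome, trends.

Result Summary: This paper makes the following contributions:
• We bring a set of mathematical methods to the study of the structure and evolution of research areas: centrality analysis of collaboration networks, community structure analysis, and principal component analysis of publishing patterns. Our work illustrates the ability of these methods, going beyond the Pareto distributions of authorship and the scale-free collaborations noticed earlier.
• We introduce novel metrics of area overlap.
• We compare areas using indicators of collaboration style: How integrated are the fields? Are there well-defined sub-areas within a field? We find that programming languages (PL) and software engineering (SE) [...], whereas AI, architecture (ARCH), and databases (DB) are remarkably well-integrated, without subgroups; [...] area, surprisingly, is fragmented. Do a few researchers dominate areas? Are there marked assortative patterns (where researchers collaborate primarily with others like them)? We find that areas go through periods where [...] gradually evolve [...] network. However, one area, security (SEC), in fact shows increasing dominance by a few researchers. We argue that this is a worrisome trend in an area of vital national importance.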
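We also notice [...], which should raise concerns about the health of [...] [12, 37].

Paper Outline: The rest of this paper is structured as follows. After discussing related work (Section 2), we present our method of data collection (Section 3). In particular, we discuss how we divide computer science into areas and extract collaboration networks from the data. Section 4 presents our metrics of collaboration styles and our findings, and Section 5 presents results on how areas interrelate in terms of overlap and author migration. We conclude in Section 6 with a discussion of possible directions for further investigation.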
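
The abstract names betweenness centralization and assortativity as within-area metrics. The following is a minimal sketch of how such quantities might be computed on a small co-authorship graph, not the authors' implementation. It assumes textbook definitions: Freeman's centralization over unnormalized betweenness (computed with Brandes' algorithm) and Newman's degree assortativity. The paper's longitudinal assortativity may use a different node attribute, which this record does not specify. The function names and the toy paper list are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_coauthor_graph(papers):
    """Undirected co-authorship graph as {author: set of co-authors}."""
    graph = {}
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            graph.setdefault(a, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_assortativity(graph):
    """Pearson correlation of endpoint degrees, each edge taken in both directions."""
    xs, ys = [], []
    for u, nbrs in graph.items():
        for v in nbrs:
            xs.append(len(nbrs))
            ys.append(len(graph[v]))
    if not xs:
        return None
    mean = sum(xs) / len(xs)  # xs and ys share the same mean and variance
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return None  # undefined when every node has the same degree
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    return cov / var


def betweenness(graph):
    """Unnormalized betweenness centrality for an unweighted, undirected graph."""
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while order:  # accumulate dependencies from the farthest nodes back
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


def betweenness_centralization(graph):
    """Freeman centralization: 1.0 for a star, 0.0 when all nodes are equal."""
    n = len(graph)
    if n < 3:
        return 0.0
    bc = betweenness(graph)
    top = max(bc.values())
    return sum(top - b for b in bc.values()) / ((n - 1) ** 2 * (n - 2) / 2)


# Hypothetical toy data: one author writes a separate paper with each of three others.
papers = [("hub", "a"), ("hub", "b"), ("hub", "c")]
star = build_coauthor_graph(papers)
print(degree_assortativity(star), betweenness_centralization(star))
```

On this toy star graph the hub only ever collaborates with lower-degree authors, so the assortativity is at its minimum and the centralization at its maximum. An area in which a few researchers dominate would be expected to drift toward that end of both scales.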

References (39)
Michael Ley, Patrick Reuther. Maintaining an Online Bibliographical Database: The Problem of Data Quality. EGC, pp. 5-10 (2006).
Jian Huang, Seyda Ertekin, C. Lee Giles. Efficient Name Disambiguation for Large-Scale Databases. Lecture Notes in Computer Science, vol. 4213, pp. 536-544 (2006). DOI: 10.1007/11871637_53
M. Girvan, M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, vol. 99, pp. 7821-7826 (2002). DOI: 10.1073/PNAS.122653799
Eric D. Widmer. Social Capital in Wide Family Contexts: An Empirical Assessment Using Social Network Methods. International Review of Sociology, vol. 17, pp. 225-238 (2007). DOI: 10.1080/03906700701356861
B.K. Mohan. Searching association networks for nurturers. IEEE Computer, vol. 38, pp. 54-60 (2005). DOI: 10.1109/MC.2005.351
M. E. J. Newman. Analysis of weighted networks. Physical Review E, vol. 70, 056131 (2004). DOI: 10.1103/PHYSREVE.70.056131
Yookyung Jo, Carl Lagoze, C. Lee Giles. Detecting research topics via the correlation between graphs and texts. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '07), pp. 370-379 (2007). DOI: 10.1145/1281192.1281234
Deng Cai, Zheng Shao, Xiaofei He, Xifeng Yan, Jiawei Han. Mining hidden community in heterogeneous social networks. Proceedings of the 3rd International Workshop on Link Discovery (LinkKDD '05), pp. 58-65 (2005). DOI: 10.1145/1134271.1134280
Aaron Clauset, Cosma Rohilla Shalizi, M. E. J. Newman. Power-Law Distributions in Empirical Data. SIAM Review, vol. 51, pp. 661-703 (2009). DOI: 10.1137/070710111