Authors: Yiming Zhang, Yujie Fan, Shifu Hou, Yanfang Ye, Xusheng Xiao
DOI: 10.1109/ICBK50248.2020.00071
Keywords:
Abstract: As the largest source code repository, GitHub plays a vital role in the modern social coding ecosystem for producing production software. Despite the apparent benefits of this paradigm, its potential security risks have been largely overlooked (e.g., malicious code or repositories can be easily embedded and distributed). To address this imminent issue, in this paper we propose a novel framework (named GitCyber) as a first attempt to automate malicious repository detection on GitHub. In GitCyber, we extract code contents from the repositories hosted on GitHub as inputs for a deep neural network (DNN), then incorporate cybersecurity domain knowledge modeled by a heterogeneous information network (HIN) to design a cyber-guided loss function in the learning objective of the DNN, to assure classification performance while preserving consistency with the observational domain knowledge. Comprehensive experiments based on large-scale data collected from GitHub demonstrate that the proposed GitCyber outperforms state-of-the-art methods in malicious repository detection.