Abstract: Malware detection has undergone a rapid transition from manual signature release to full automation in recent years. In particular, with the accumulation of huge malware sample sets, machine learning (ML) and deep learning (DL) have been proposed for verdict prediction and family attribution. Despite their high accuracy and efficiency, existing proposals fall short of providing explanations for their results. To fill the gap between classification decisions and the reasoning behind them, we propose Galaxy, a generic approach to automatic signature generation. Briefly, Galaxy selects meaningful metadata fields from the static and dynamic analysis reports of given samples. Based on the selected fields, all input samples are clustered into groups according to a similarity measurement. The observed similarities are then converted into patterns, and each pattern is validated against multiple intelligence sources to decide whether it is suitable for detection. In the end, Galaxy launches a refinement process to improve the grouping results and increase coverage. We have applied the framework to daily incoming Android samples in our WildFire production environment since September 2016. Up to now, Galaxy has generated more than 12,500 unique signatures covering a total of 1.75 million malware samples. Those signatures have provided valuable insights for the discovery of undocumented malicious domains and the identification of Command & Control (C&C) servers. Because of the rigid quality requirement, the released signatures have been proven to cause no false positives in production.
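To make the pipeline described in the abstract concrete, the following is a minimal sketch of its main stages: grouping samples by the similarity of their selected metadata fields, turning the shared fields of a group into a candidate pattern, and validating that pattern. It is not Galaxy's implementation. The Jaccard measure, the greedy single-pass grouping, the `threshold` and `min_size` parameters, the feature strings, and the use of a known-benign corpus as a stand-in for the "multiple intelligence sources" are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two feature sets (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(samples: Dict[str, FrozenSet[str]], threshold: float) -> List[List[str]]:
    """Greedy single-pass grouping (hypothetical): a sample joins the first
    group whose first member is at least `threshold` similar to it;
    otherwise it starts a new group."""
    groups: List[List[str]] = []
    for name, feats in samples.items():
        for group in groups:
            if jaccard(feats, samples[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def candidate_pattern(group: List[str],
                      samples: Dict[str, FrozenSet[str]]) -> FrozenSet[str]:
    """Candidate pattern: the features shared by every member of the group."""
    return frozenset.intersection(*(samples[n] for n in group))


def passes_validation(pattern: FrozenSet[str],
                      benign_corpus: Iterable[FrozenSet[str]],
                      min_size: int = 2) -> bool:
    """Reject a pattern that is too small or that would match any
    known-benign sample (a stand-in for real intelligence sources)."""
    if len(pattern) < min_size:
        return False
    return not any(pattern <= benign for benign in benign_corpus)
```

In this sketch, a pattern is released as a signature only if `passes_validation` returns `True`, which mirrors the abstract's emphasis on avoiding false positives. A refinement step, such as regrouping samples whose patterns were rejected, is not shown.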