Authors: Yanbin Wang, Weifan Zhu, Haitao Xu, Zhan Qin, Kui Ren
DOI:
Keywords:
Abstract: Phishing attacks have long been a security issue of great concern to the cybersecurity community. Recently, well-known pre-trained models have been applied as anti-phishing solutions. However, existing studies either simply transfer models pre-trained on text to the phishing detection task, or pre-train models on only extremely small sets of phishing samples. In this paper, we propose PhishBERT, a truly pre-trained deep transformer network model for phishing URL detection. Using a tailored pre-training objective, PhishBERT acquires a general understanding of diverse URLs by being pre-trained on a corpus of more than 3 billion unlabeled URLs. It is then transferred to the task of distinguishing benign from malicious URLs through supervised fine-tuning with adversarial methods. Extensive and rigorous benchmark studies verify that PhishBERT is significantly superior to the current state-of-the-art …