Pavlos Evangelatos (Information Technologies Institute, CERTH, Thessaloniki, Greece); Christos Iliou; Thanassis Mavropoulos; Konstantinos Apostolou; Theodora Tsikrika; Stefanos Vrochidis; Ioannis Kompatsiaris
2021 IEEE International Conference on Cyber Security and Resilience (CSR)
Cyber Threat Intelligence, Named Entity Recognition, DNRTI, BERT, XLNet, RoBERTa, ELECTRA
The continuous increase in the sophistication of threat actors over the years has made actionable threat intelligence a critical part of the defence against them. Such Cyber Threat Intelligence is published daily on several online sources, including vulnerability databases, CERT feeds, and social media, as well as on forums and web pages from the Surface and the Dark Web. Named Entity Recognition (NER) techniques can be used to extract this information from such sources in an actionable form. In this paper we investigate how the latest advances in the NER domain, and in particular transformer-based models, can facilitate this process. To this end, we use the Dataset for NER in Threat Intelligence (DNRTI), which contains more than 300 threat intelligence reports from open source threat intelligence websites. Our experimental results demonstrate that transformer-based techniques are very effective in extracting cybersecurity-related named entities, considerably outperforming the previous state-of-the-art approaches tested with DNRTI.
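To illustrate what extracting named entities "in an actionable form" can look like, the following is a minimal sketch of turning token-level NER predictions into entity spans. It assumes the tagger emits BIO labels (B- opens an entity, I- continues it, O is outside any entity), which is a common convention but is not confirmed by the abstract. The function name `decode_bio`, the example sentence, and the label names `HackOrg`, `Tool`, and `Area` are illustrative assumptions, not output of the models evaluated in the paper.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration; assumes one tag per token.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins: flush any entity that was still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity of the same type.
            current_tokens.append(token)
        else:
            # "O" tag, or an I- tag that does not continue the open entity:
            # close the open entity (if any) and skip this token.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Illustrative input only: tokens and tags are made up for this sketch.
tokens = ["Fancy", "Bear", "used", "X-Agent", "in", "Ukraine"]
tags = ["B-HackOrg", "I-HackOrg", "O", "B-Tool", "O", "B-Area"]
entities = decode_bio(tokens, tags)
```

In this sketch the per-token tags would come from a fine-tuned token classifier; the decoding step is independent of which transformer model produced them.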