Arnold Spyros, Angelos Papoutsis, Ilias Koritsas, Notis Mengidis, Christos Iliou, Dimitris Kavallieros, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris
2022 IEEE International Conference on Cyber Security and Resilience (CSR)
cyber threat intelligence, machine learning, honeypots, ensemble methods, Wazuh
Cyber Threat Intelligence helps organisations in their fight against cyber threats by continuously providing information about the cyber threat landscape, allowing them to design their defences strategically and supporting decision making. In this context, honeypots are a widespread solution for gathering intelligence about threat actors. However, honeypots do not inherently provide information about the origin of threat groups, their resources, capabilities, and potential impact. Thus, we propose an approach that classifies threats as highly or less abusive, based on their behavioural characteristics, using four ensemble machine learning algorithms applied to security incidents identified in a rule-based manner on a deployed honeypot. After preprocessing and hyperparameter tuning, the four examined models, Random Forest Classifier (RFC), Adaptive Boosting Classifier (AdaBoost), Light Gradient Boosting Machine (LGBM) and Extreme Gradient Boosting (XGBoost), achieve good results, with RFC and LGBM achieving the best recall (84%, 83%) and LGBM and XGBoost the best AUC (91%, 90%).
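
The abstract reports recall and AUC for a binary decision (highly abusive versus less abusive). As a minimal sketch of what those two metrics measure, the plain-Python snippet below computes both on a small set of labels and scores. The data, the 0.5 threshold, and the function names `recall` and `roc_auc` are illustrative assumptions, not values or code from the paper, and the label 1 is assumed to mean "highly abusive".

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of actual positives (label 1) that were predicted as 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive is scored above a random negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = highly abusive, 0 = less abusive (assumed encoding).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

example_recall = recall(y_true, y_pred)
example_auc = roc_auc(y_true, scores)
print(f"recall={example_recall:.4f} auc={example_auc:.4f}")
```

Recall depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the best model can differ between the two metrics.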