G. Aivatoglou, M. Anastasiadis, G. Spanos, A. Voulgaridis, K. Votis and D. Tzovaras
2021 IEEE International Conference on Cyber Security and Resilience (CSR), Rhodes, Greece
Software Vulnerability categorization, Cyber-security, Machine Learning, Decision Trees, Random Forests, Gradient Boosting
Software vulnerabilities have become a major problem for security analysts, since the number of new vulnerabilities is constantly growing. A categorization system was therefore needed to group and handle these vulnerabilities more efficiently. To this end, the MITRE Corporation introduced the Common Weakness Enumeration (CWE), a list of common software and hardware weakness types. However, manually understanding and analyzing new vulnerabilities is a slow and exhausting process for security experts. For this reason, this paper introduces a new automated classification methodology based on the textual descriptions of vulnerabilities from the National Vulnerability Database (NVD). The proposed methodology combines textual analysis and tree-based machine learning techniques to classify vulnerabilities automatically. Experimental results show that the proposed methodology performs well, achieving an overall accuracy close to 80%.
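As a rough illustration of the general idea (text features fed to a tree-based classifier), the sketch below trains a single small decision tree on term presence. It is not the authors' pipeline: the training descriptions are invented toy examples rather than National Vulnerability Database entries, the function names (tokenize, build_tree, predict) are hypothetical, and a single Gini-based tree stands in for the Random Forest and Gradient Boosting models named in the keywords.

```python
import re
from collections import Counter

# Toy, hand-written descriptions (NOT real NVD entries), labeled with CWE IDs.
TRAIN = [
    ("Cross-site scripting allows remote attackers to inject arbitrary web script", "CWE-79"),
    ("Stored cross-site scripting in the comment field allows script injection", "CWE-79"),
    ("SQL injection in login page allows attackers to execute arbitrary SQL commands", "CWE-89"),
    ("SQL injection via the id parameter allows database access", "CWE-89"),
    ("Buffer overflow allows attackers to cause a denial of service via a long string", "CWE-119"),
    ("Heap-based buffer overflow in the parser allows memory corruption", "CWE-119"),
]


def tokenize(text):
    """Textual analysis step (simplified): lowercase and keep the set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(samples, depth=0, max_depth=5):
    """Grow a tree over (token_set, label) pairs.

    A leaf is a label string; an inner node is (term, present_subtree, absent_subtree).
    """
    labels = [label for _, label in samples]
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    vocab = sorted(set().union(*(tokens for tokens, _ in samples)))
    best_term, best_score = None, gini(labels)
    for term in vocab:
        with_term = [label for tokens, label in samples if term in tokens]
        without_term = [label for tokens, label in samples if term not in tokens]
        if not with_term or not without_term:
            continue  # the term does not split the samples
        score = (len(with_term) * gini(with_term)
                 + len(without_term) * gini(without_term)) / len(samples)
        if score < best_score:  # keep only strict improvements
            best_term, best_score = term, score
    if best_term is None:
        return majority
    present = [s for s in samples if best_term in s[0]]
    absent = [s for s in samples if best_term not in s[0]]
    return (best_term,
            build_tree(present, depth + 1, max_depth),
            build_tree(absent, depth + 1, max_depth))


def predict(tree, text):
    """Walk the tree using the words present in a new description."""
    tokens = tokenize(text)
    while isinstance(tree, tuple):
        term, present, absent = tree
        tree = present if term in tokens else absent
    return tree


tree = build_tree([(tokenize(text), label) for text, label in TRAIN])
print(predict(tree, "A buffer overflow in the image decoder allows remote code execution"))
```

A realistic setup would presumably use weighted text features and an ensemble of many trees trained on a large labeled corpus; the toy tree here only shows how words in a description can drive the category decision.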