Combining text analysis techniques with unsupervised machine learning methodologies for improved software vulnerability management

Authors

M. Anastasiadis, G. Aivatoglou, G. Spanos, A. Voulgaridis and K. Votis

Published in

2022 IEEE International Conference on Cyber Security and Resilience (CSR), Rhodes, Greece

Keywords

Software Vulnerability Categorization, Cybersecurity, Machine Learning, Clustering

Open Access

Abstract

Software vulnerability management is a prominent research area for security analysts and researchers. One of its main pillars is the grouping of vulnerabilities with similar characteristics, so that security analysts can organize prevention and mitigation actions more efficiently. To this end, this study proposes automated vulnerability grouping from technical descriptions, based on unsupervised machine learning techniques such as Latent Dirichlet Allocation and K-means combined with text analysis techniques. Applied to a large vulnerability dataset (over 100,000 vulnerabilities), the methodology confirmed that clustering vulnerabilities by their descriptions can help improve the homogeneity of vulnerability groups and simplify vulnerability management procedures.
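
To make the idea of grouping vulnerabilities by their descriptions concrete, the following is a minimal sketch of the K-means half of such a pipeline, using only the Python standard library. It is not the authors' implementation: the four descriptions, the stopword list, the TF-IDF weighting, the farthest-point initialization and all function names are assumptions chosen for illustration, and the Latent Dirichlet Allocation step is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "in", "of", "to", "via", "and",
             "allows", "attacker", "attackers", "remote"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Turn each description into an L2-normalized TF-IDF vector."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency (an assumed weighting scheme).
        vec = [tf[t] * (math.log((1 + n) / (1 + doc_freq[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means with deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the vector farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its closest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up descriptions written in the style of vulnerability reports.
descriptions = [
    "SQL injection in the login form allows remote attackers to execute arbitrary SQL commands",
    "SQL injection via the id parameter allows attackers to execute arbitrary SQL commands",
    "Buffer overflow in the image parser allows remote attackers to cause memory corruption",
    "Heap buffer overflow in the font parser allows attackers to cause memory corruption",
]

labels = kmeans(tfidf_vectors(descriptions), k=2)
for label, text in zip(labels, descriptions):
    print(label, text)
```

On a dataset of the size mentioned in the abstract, a library implementation with sparse vectors would be the practical choice; the hand-written version here only exposes the assignment and update steps that the clustering relies on.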

Source

https://ieeexplore.ieee.org/abstract/document/9850314