A Machine Learning Approach to Dataset Imputation for Software Vulnerabilities

Authors

Shahin Rostami, Agnieszka Kleszcz, Daniel Dimanov and Vasilios Katos

Published in

Dziech A., Mees W., Czyżewski A. (eds) Multimedia Communications, Services and Security. MCSS 2020. Communications in Computer and Information Science, vol 1284. Springer, Cham

Keywords

cybersecurity, vulnerability, MITRE ATT&CK, machine learning, dataset imputation

Open Access

Abstract

This paper proposes a supervised machine learning approach for the imputation of missing categorical values in a dataset where the majority of samples are incomplete. Twelve models have been designed that can predict nine of the twelve Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) tactic categories using only the Common Attack Pattern Enumeration and Classification (CAPEC). The proposed method has been evaluated on a test dataset consisting of 867 unseen samples, with the classification accuracy ranging from 99.88% to 100%. These models were employed to generate a more complete dataset with no missing ATT&CK tactic features.
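The general idea of per-category imputation can be illustrated with a minimal sketch: train one predictor per target column on the rows where that column is known, then fill the gaps in the remaining rows. The abstract does not state which model type the authors used, so a simple majority-vote lookup keyed on the CAPEC value stands in for it here. The row layout, the `CAPEC-A`/`CAPEC-B` identifiers, the `tactic_x`/`tactic_y` column names, and both helper functions are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Toy rows with hypothetical CAPEC ids and binary tactic labels.
# None marks a missing value that should be imputed.
rows = [
    {"capec": "CAPEC-A", "tactic_x": 1, "tactic_y": 0},
    {"capec": "CAPEC-A", "tactic_x": 1, "tactic_y": 0},
    {"capec": "CAPEC-B", "tactic_x": 0, "tactic_y": 1},
    {"capec": "CAPEC-A", "tactic_x": None, "tactic_y": None},
    {"capec": "CAPEC-B", "tactic_x": None, "tactic_y": 1},
]


def fit_lookup(rows, feature, target):
    """Fit one stand-in model: the most common label seen per feature value."""
    counts = defaultdict(Counter)
    for row in rows:
        if row[target] is not None:  # train only on rows where the label is known
            counts[row[feature]][row[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def impute(rows, feature, targets):
    """Return copies of rows with missing targets filled by the per-target models."""
    models = {t: fit_lookup(rows, feature, t) for t in targets}
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        for t in targets:
            if new_row[t] is None:
                # Stays None if this feature value was never seen with a label.
                new_row[t] = models[t].get(new_row[feature])
        filled.append(new_row)
    return filled


completed = impute(rows, "capec", ["tactic_x", "tactic_y"])
```

In a real setting the lookup would be replaced by a trained classifier and evaluated on held-out samples, as the abstract describes for its 867-sample test set.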

Source

https://link.springer.com/chapter/10.1007/978-3-030-59000-0_3