TY - GEN
T1 - Identifying malware genera using the Jensen-Shannon distance between system call traces
AU - Seideman, Jeremy D.
AU - Khan, Bilal
AU - Vargas, Antonio Cesar
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/12/29
Y1 - 2014/12/29
N2 - The study of malware often involves some form of grouping or clustering in order to indicate malware samples that are closely related. There are many ways that this can be performed, depending on the type of data that is recorded to represent the malware and the eventual goal of the grouping. While the concept of a malware family has been explored in depth, we introduce the concept of the malware genus, a grouping of malware that consists of very closely related samples determined by the relationships between samples within the malware population. Determining the boundaries of the malware genus is dependent upon the way that the malware samples are compared and the overall relationship between samples, with special attention paid to the parent-child relationship. Biologists have several criteria that are used to judge the usefulness of a genus when creating a taxonomy of organisms; we sought to design a classification that would be as useful in the world of malware research as it is in biology. We present two case studies in which we analyze a set of malware, using the Jensen-Shannon Distance between system call traces to measure distance between samples. The case studies show the genera that we create adhere to all of the criteria used when creating taxa of biological organisms.
UR - http://www.scopus.com/inward/record.url?scp=84922516061&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84922516061&partnerID=8YFLogxK
DO - 10.1109/MALWARE.2014.6999409
M3 - Conference contribution
AN - SCOPUS:84922516061
T3 - Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014
SP - 1
EP - 7
BT - Proceedings of the 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th IEEE International Conference on Malicious and Unwanted Software, MALCON 2014
Y2 - 28 October 2014 through 30 October 2014
ER -