W. M. Rodrigues, F. N. Walmsley, G. D. C. Cavalcanti and R. M. O. Cruz, “Security Relevant Methods of Android’s API Classification: A Machine Learning Empirical Evaluation,” in IEEE Transactions on Computers, doi: 10.1109/TC.2023.3291998.
The Android operating system provides functions and methods that handle sensitive data in order to secure users’ data. The Android security literature extracts binary features from a method and classifies the method into one of the Security Relevant Method classes, adding information about how the method handles sensitive data. However, binary features hinder the performance of some classifiers because of the high collision rate between instances. Although previous works have explored Security Relevant Method classification, an extensive study of machine learning algorithms for this problem has not been conducted. This work fills this gap by analyzing monolithic classifiers, Multiple Classifier Systems, and embedding algorithms that transform binary features into real-valued features, aiming to ease the classifier’s task by minimizing the ambiguity caused by collisions. Our analyses show that META-DES, using a pool of Decision Trees trained with the Random Forest algorithm, statistically has the best results. We also find that, in general, distance-based classifiers are at a disadvantage on binary features. Moreover, embedding techniques such as deep metric learning with triplet loss can reduce geometrical instance ambiguity, improving the performance of the weakest learning algorithms. However, their usage was detrimental to the performance of more robust techniques, such as dynamic ensemble models, which are better suited to handling difficult cases. The dataset and code used for the experiments are available in the following repository: https://github.com/walbermr/android-srm-ml-evaluation.
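To make the notion of collision between instances concrete, the following minimal sketch measures it on a toy dataset. It assumes that a collision means two instances share an identical binary feature vector while carrying different class labels; this definition, the function name `collision_rate`, the toy feature vectors, and the class names are illustrative assumptions and are not taken from the paper or its repository.

```python
from collections import defaultdict


def collision_rate(features, labels):
    """Return the fraction of instances whose feature vector is shared
    with at least one instance of a different class.

    Assumption: a "collision" is an identical binary feature vector
    appearing under more than one label.
    """
    # Map each distinct feature vector to the set of labels seen with it.
    labels_by_vector = defaultdict(set)
    for vector, label in zip(features, labels):
        labels_by_vector[tuple(vector)].add(label)

    # An instance is ambiguous if its vector occurs under several labels.
    ambiguous = sum(
        1 for vector in features if len(labels_by_vector[tuple(vector)]) > 1
    )
    return ambiguous / len(features)


# Hypothetical toy data: three binary features per method.
X = [
    (1, 0, 1),  # same vector as the next row, different label
    (1, 0, 1),
    (0, 1, 0),
    (0, 1, 1),
]
y = ["class_a", "class_b", "class_c", "class_b"]  # hypothetical labels

print(collision_rate(X, y))
```

In this toy set the first two instances are indistinguishable to any classifier that sees only the binary features, which is the kind of ambiguity a real-valued embedding is meant to reduce.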
Walber M. Rodrigues, Centro de Informática, Universidade Federal de Pernambuco, Recife, Brazil
Felipe N. Walmsley, Centro de Informática, Universidade Federal de Pernambuco, Recife, Brazil
George D. C. Cavalcanti, Centro de Informática, Universidade Federal de Pernambuco, Recife, Brazil
Rafael M. O. Cruz, Department of Software Engineering, École de Technologie Supérieure, Montreal, Quebec, Canada