Authors: Chao Liu (Chinese Academy of Sciences), Chenzhe Lou (Chinese Academy of Sciences), Min Yu (Chinese Academy of Sciences), SM Yiu (University of Hong Kong), KP Chow (University of Hong Kong), Gang Li (Deakin University), and Jianguo Jiang (Chinese Academy of Sciences)



PDF malware remains a major attack vector. Distinguishing malicious PDFs among massive volumes of PDF files poses a challenge to forensic investigation. Machine learning has become a mainstream technology for malicious PDF document detection, either to assist analysts in a forensic investigation or to prevent a system from being attacked. However, adversarial attacks against malicious document classifiers have emerged. Adversarial examples crafted through precise manipulation can be easily misclassified, posing a major threat to many detectors based on machine learning techniques. Although various analysis and detection techniques are available for specific attacks, the challenge from adversarial attacks is not yet fully resolved. A major reason is that most detection methods are tailor-made for existing adversarial examples only. In this paper, based on the observation that most adversarial examples are designed against specific models, we propose a novel approach that generates a group of mutated cross-model classifiers such that adversarial examples cannot easily pass all of them. Using a Prediction Inversion Rate (PIR), we can effectively distinguish adversarial examples from benign documents. Our mutated group of classifiers strengthens prediction-inconsistency detection across multiple models and, because of the mutation, eliminates the effect of transferability (a technique that makes the same adversarial example work across multiple models). Our experiments show that our approach outperforms existing state-of-the-art detection methods.
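The abstract does not give the PIR formula, but the idea can be sketched as follows: assume PIR is the fraction of mutated classifiers whose label on an input disagrees with the original classifier's label. Because adversarial examples sit close to decision boundaries, small model mutations flip their labels far more often than for benign inputs. The classifier stubs, the PIR definition, and the threshold below are illustrative assumptions, not the authors' published method.

```python
from typing import Callable, List, Sequence

Classifier = Callable[[Sequence[float]], int]

def prediction_inversion_rate(original: Classifier,
                              mutants: List[Classifier],
                              x: Sequence[float]) -> float:
    # Assumed PIR: share of mutated classifiers that invert the
    # original classifier's prediction on the same input x.
    base = original(x)
    inversions = sum(1 for m in mutants if m(x) != base)
    return inversions / len(mutants)

def is_adversarial(pir: float, threshold: float = 0.3) -> bool:
    # Hypothetical decision rule: a high inversion rate suggests the
    # input was crafted against one specific model.
    return pir >= threshold

# Toy demonstration with stub classifiers (thresholds on one feature).
original = lambda x: int(x[0] > 0.5)
mutants = [lambda x, t=t: int(x[0] > t) for t in (0.45, 0.50, 0.55, 0.60)]

benign = [0.9]       # far from every boundary: no mutant inverts
borderline = [0.52]  # near the boundaries: half the mutants invert

print(prediction_inversion_rate(original, mutants, benign))      # 0.0
print(prediction_inversion_rate(original, mutants, borderline))  # 0.5
```

In this toy run the benign input yields PIR 0.0 while the borderline input yields 0.5, so only the latter would be flagged under the assumed threshold.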
