PyRHOH: A meta-learning analysis framework for determining the impact of compilation on malicious JavaScript identification
Conference paper   Open access   Peer reviewed

Eli Fulkerson, Eric Yocam, Varghese Vaidyan, Mahesh Kamepalli, Yong Wang and Gurcan Comert
Machine Learning with Applications, Vol. 21, 100724
Elsevier
09/2025

Abstract

Keywords: Bytecode; Compilation; JavaScript; LSTM; Malicious code detection; Obfuscation; PyTorch
Automated identification of malicious JavaScript is a core problem within modern malware analysis. Code obfuscation is a common tactic used to evade detection, and it hinders both manual and automated detection methods, including neural network techniques. For these methods to classify malware effectively, it is beneficial to reduce the effects of obfuscation and to optimize the configuration and structure of the neural network so that it is well suited to the task. To overcome these challenges, we present a new framework: “PyRHOH” (“Python Repeatable Hyperparameter Optimization Harness”), a meta-learning framework that implements Bayesian optimization. The automated exploration and maximization of candidate hyperparameters using a Bayesian method adds structure and rigor to the selection of neural network hyperparameters, providing the assurance that an implemented design is optimal. In this study, we used the PyRHOH framework to determine optimal recurrent neural network architectures for the differentiation of malicious and benign JavaScript samples. We then used these neural networks to measure the degree to which compilation of raw JavaScript samples into bytecode via Google’s V8 JavaScript compiler affected classification accuracy. For in-the-wild samples, compilation increased the detection rate from 76.88% to 95.84%. Among uniformly obfuscated samples, compilation increased the detection rate from an average of 76.76% to an average of 91.24%. This shows that pre-processing JavaScript into compiled bytecode has a clear positive impact on neural network categorization.
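The two reported improvements can be put on a common footing with a little arithmetic. The sketch below is illustrative only and is not part of PyRHOH: the function name `compilation_gain` and the dictionary keys are hypothetical, and treating 100 minus the detection rate as the share of missed samples is an assumption, since the abstract does not define the metric further. Only the four percentages come from the abstract.

```python
# Detection rates (percent) quoted in the abstract: (raw JavaScript, compiled bytecode).
RATES = {
    "in-the-wild": (76.88, 95.84),
    "uniformly obfuscated (average)": (76.76, 91.24),
}


def compilation_gain(raw: float, compiled: float) -> dict:
    """Summarise the change in detection rate; both inputs are percentages.

    Assumption: 100 - detection rate is read as the share of missed samples.
    """
    missed_raw = 100.0 - raw
    missed_compiled = 100.0 - compiled
    return {
        # Absolute improvement in percentage points.
        "gain_points": round(compiled - raw, 2),
        "missed_raw": round(missed_raw, 2),
        "missed_compiled": round(missed_compiled, 2),
        # Relative reduction in the missed share, in percent.
        "missed_reduction_pct": round(
            100.0 * (missed_raw - missed_compiled) / missed_raw, 1
        ),
    }


SUMMARY = {name: compilation_gain(*pair) for name, pair in RATES.items()}

if __name__ == "__main__":
    for name, stats in SUMMARY.items():
        print(name, stats)
```

Under that reading, compilation adds 18.96 percentage points for in-the-wild samples and 14.48 points on average for uniformly obfuscated samples.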
Published (Version of record) Open

Details
