Researchers demonstrate that malware can be hidden inside AI models

This photo has a job application for Boston University hidden within it. The technique introduced by Wang, Liu, and Cui could hide data inside an image classifier rather than just an image.

Researchers Zhi Wang, Chaoge Liu, and Xiang Cui published a paper last Monday demonstrating a new technique for slipping malware past automated detection tools—in this case, by hiding it inside a neural network.

The three embedded 36.9MiB of malware into a 178MiB AlexNet model without significantly altering the function of the model itself. The malware-embedded model classified images with near-identical accuracy, within 1% of the malware-free model. (This is possible because the number of layers and total neurons in a convolutional neural network is fixed prior to training—which means that, much like in human brains, many of the neurons in a trained model end up being either largely or entirely dormant.)
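To put those figures in proportion, the arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the per-parameter figure assumes the entire model file consists of 32-bit floating-point parameters, which the article does not state.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def payload_fraction(payload_mib: float, model_mib: float) -> float:
    """Return the share of the model file taken up by the hidden payload."""
    return (payload_mib * MIB) / (model_mib * MIB)


# Figures reported for the AlexNet experiment.
fraction = payload_fraction(36.9, 178.0)
print(f"{fraction:.1%}")  # 20.7%

# Assumption: every parameter is a 4-byte float32 and the file holds
# nothing else, so this is only a rough average.
avg_bytes_per_param = fraction * 4
print(f"{avg_bytes_per_param:.2f}")  # 0.83
```

Under those assumptions, roughly a fifth of the file is payload, or a little under one byte hidden per four-byte parameter on average.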

Just as importantly, squirreling the malware away into the model broke it up in ways that prevented detection by standard antivirus engines. VirusTotal, a service that “inspects items with over 70 antivirus scanners and URL/domain blocklisting services, in addition to a myriad of tools to extract signals from the studied content,” did not raise any suspicions about the malware-embedded model.
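Why breaking up a payload defeats simple pattern matching can be illustrated with a toy example. This is a minimal sketch, not how VirusTotal's engines work: the "scanner" is a plain substring search, the marker is a harmless stand-in for a byte signature, and the names `scatter` and `gather` are hypothetical.

```python
# Harmless stand-in for a byte pattern a scanner might look for.
SIGNATURE = b"HARMLESS-TEST-MARKER"


def contains_signature(blob: bytes, signature: bytes = SIGNATURE) -> bool:
    """Toy scanner: report whether the signature appears contiguously."""
    return signature in blob


def scatter(payload: bytes, stride: int = 4) -> bytes:
    """Spread payload bytes out, one per stride-sized chunk of zero filler."""
    out = bytearray(len(payload) * stride)
    out[::stride] = payload  # every stride-th byte carries one payload byte
    return bytes(out)


def gather(blob: bytes, stride: int = 4) -> bytes:
    """Reverse scatter() by collecting every stride-th byte."""
    return blob[::stride]


scattered = scatter(SIGNATURE)
print(contains_signature(SIGNATURE))          # True
print(contains_signature(scattered))          # False
print(gather(scattered) == SIGNATURE)         # True
```

The bytes are all still present in the scattered blob, but no contiguous run matches the pattern until something gathers them back together.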

The researchers’ technique chooses the best layer to work with in an already-trained model and then embeds the malware into that layer. In an existing trained model, such as a widely available image classifier, the embedding may reduce accuracy more than an attacker would like because the model does not have enough dormant or mostly dormant neurons.

If the accuracy of a malware-embedded model is insufficient, the attacker may choose instead to begin with an untrained model, add a lot of extra neurons, and then train the model on the same data set that the original model used. This should produce a larger model with equivalent accuracy, and the extra neurons provide more room to hide nasty stuff inside.

The good news is that we’re effectively just talking about steganography—the new technique is a way to hide malware, not execute it. In order to actually run the malware, it must be extracted from the poisoned model by another malicious program and then reassembled into its working form. The bad news is that neural network models are considerably larger than typical photographic images, offering attackers the ability to hide far more illicit data inside them without detection.
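The hide-then-extract round trip can be sketched with NumPy on a harmless text payload. This is an illustration of steganography in floating-point data generally, not the researchers' encoding, which the article does not detail. It assumes little-endian 32-bit float weights and overwrites only the lowest-order byte of each value; `embed` and `extract` are hypothetical names.

```python
import numpy as np


def embed(weights: np.ndarray, payload: bytes) -> np.ndarray:
    """Return a copy of weights with payload bytes in the low-order bytes."""
    flat = np.ascontiguousarray(weights, dtype="<f4").ravel().copy()
    if len(payload) > flat.size:
        raise ValueError("payload is larger than the number of weights")
    # View each float32 as 4 raw bytes; column 0 is the least significant
    # byte of the mantissa in little-endian layout.
    raw = flat.view(np.uint8).reshape(-1, 4)
    raw[: len(payload), 0] = np.frombuffer(payload, dtype=np.uint8)
    return flat.reshape(weights.shape)


def extract(weights: np.ndarray, length: int) -> bytes:
    """Read back the first `length` low-order bytes."""
    flat = np.ascontiguousarray(weights, dtype="<f4").ravel()
    raw = flat.view(np.uint8).reshape(-1, 4)
    return raw[:length, 0].tobytes()


rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for a layer
payload = b"harmless demo text"

stego = embed(weights, payload)
recovered = extract(stego, len(payload))
print(recovered == payload)
print(np.allclose(stego, weights, rtol=1e-4, atol=1e-12))
```

The data is inert while it sits in the weights: nothing here runs the payload, and a separate program has to know where to look and how many bytes to read. Changing only the low-order byte shifts each weight by a tiny relative amount, which is why this sketch leaves the values nearly unchanged.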

Cybersecurity researcher Dr. Lukasz Olejnik told Motherboard that he didn’t think the new technique offered much to an attacker. “Today, it would not be simple to detect it by antivirus software, but this is only because nobody is looking.” But the technique does represent yet another way to smuggle data past digital sentries and into a potentially less-protected interior network.

https://arstechnica.com/?p=1782634