What Is Adversarial Machine Learning?

By: Samia Oukemeni

August 3, 2021

Machine learning (ML) has advanced at a remarkable pace over the past few years. It has transformed entire fields and introduced new capabilities. Machine learning algorithms now power applications that affect society today and will continue to do so in the years to come, such as self-driving cars, speech recognition, data analytics, sentiment analysis, and robotics, to name a few.

Cybersecurity has benefited greatly from advances in ML, especially in anomaly-based detection tasks such as malware identification, spam filtering, fraudulent-transaction detection, rogue detection, network intrusion detection, and the detection of other malicious activities.

However, machine learning systems are vulnerable to non-obvious and potentially dangerous manipulation, which occurs when an adversary can modify their input data. These attacks are called adversarial machine learning (AML) attacks. An AML attack is a technique that uses malicious input to fool or mislead a machine learning model. By feeding crafted data to the algorithm, an attacker can cause the machine learning system to fail.

What Is Machine Learning?

Machine learning can be defined as a branch of artificial intelligence (AI) and data science. It imitates the way humans learn so that machines can learn and improve from experience without human intervention.

ML approaches fall into the following main categories:

  • Supervised Learning is used when the machine is provided with a large amount of labeled data to train on and generate a model. This model is later used to classify new input.

For example, Decision Trees, Nearest Neighbours, Support Vector Machines, and Neural Networks.

  • Unsupervised Learning, as its name indicates, does not require labeled examples from the user to learn; instead, the machine identifies patterns in the data that are not obvious to the human eye.

For example, k-means for clustering problems and the Apriori algorithm for association rule learning problems.

  • Reinforcement Learning is when the algorithm learns from its interactions with the environment. This is the type of ML closest to human behavior.

For example, Q-Learning, Temporal Difference learning, and Deep Q-Networks.

When ML Systems Fail

To understand the issue of AML, let us first look at how ML algorithms are used, taking a popular example. In supervised learning, we train a system using labeled data. For example, to build a spam detector, we label a set of emails as spam or legitimate. The ML algorithm then produces a classifier that takes unlabeled email messages as input and classifies them as legitimate or spam. In this setting, adversaries can craft adversarial examples: instead of sending “typical” spam, they might send a “tricky” spam email designed to make the classifier misbehave (see Figure 1).

Figure 1: Adversarial ML
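
The spam scenario can be sketched in a few lines of Python. This is only a toy illustration: the training emails, the word-count (naive Bayes style) scoring, and the names train, spam_score, and classify are all assumptions made for this example, not part of any real spam filter. It shows how a "tricky" spam message, padded with words that are common in legitimate mail, can slip past a classifier that catches the "typical" version.

```python
import math
from collections import Counter

# Toy labeled training set, invented for illustration only.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("project report attached", "legitimate"),
    ("lunch on monday with the team", "legitimate"),
]

def train(samples):
    """Count word occurrences per label (a tiny naive Bayes style model)."""
    counts = {"spam": Counter(), "legitimate": Counter()}
    docs = Counter()
    for text, label in samples:
        counts[label].update(text.split())
        docs[label] += 1
    vocab = set(counts["spam"]) | set(counts["legitimate"])
    return counts, docs, vocab

def spam_score(model, text):
    """Log-odds that `text` is spam; a positive score means spam."""
    counts, docs, vocab = model
    spam_total = sum(counts["spam"].values()) + len(vocab)
    legit_total = sum(counts["legitimate"].values()) + len(vocab)
    score = math.log(docs["spam"] / docs["legitimate"])
    for word in text.split():
        if word not in vocab:
            continue  # ignore words never seen during training
        p_spam = (counts["spam"][word] + 1) / spam_total  # Laplace smoothing
        p_legit = (counts["legitimate"][word] + 1) / legit_total
        score += math.log(p_spam / p_legit)
    return score

def classify(model, text):
    return "spam" if spam_score(model, text) > 0 else "legitimate"

model = train(TRAIN)
typical_spam = "free money now"
# "Tricky" spam: the same message padded with words common in legitimate mail.
tricky_spam = typical_spam + " monday meeting agenda project report team"

print(classify(model, typical_spam))
print(classify(model, tricky_spam))
```

With this toy data, the typical message is labeled spam, while the padded message is labeled legitimate even though its spam content is unchanged.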

Attacking a Machine Learning System

Machine learning models can be attacked in many ways, such as hijacking their use, bypassing their functionality, or revealing their secrets. An attacker can strike either during the training phase of the model or during the testing phase.

  1. Training Phase: The attacker attempts to influence or corrupt the model directly by making changes to the data used for training. There are three main attack strategies, depending on the attacker's access to the training data:
  • Data Injection: Without knowledge of the model or direct access to the existing training data, the attacker inserts counterfeit samples into the data set.
  • Data Modification: The attacker has full access to the training data and modifies it before it is used to train the target model.
  • Logic Corruption: The attacker targets the learning algorithm itself. These attacks are difficult to counter, and they have the greatest impact because the algorithm itself is modified.
  2. Testing Phase: In this case, the attacker does not tamper with the model but forces it to produce incorrect outputs. Attacks in this phase differ in how much information the attacker has about the model. There are two cases: black-box and white-box attacks.
  • Black-box attacks: The attacker does not have access to the model's internals, but can query it with new data and observe its predictions.
  • White-box attacks: The attacker has access to the entire model, has information about the algorithm used in training, and can access the training data.
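
The difference between the two testing-phase cases can be illustrated with a small sketch. Everything here is hypothetical: the victim is an invented linear classifier, and white_box_attack and black_box_attack are names chosen for this example. The white-box attacker reads the weights and steps each feature in the direction that moves the score across the decision boundary; the black-box attacker only calls predict and tries random perturbations until one flips the label.

```python
import random

# Hypothetical victim model: a fixed linear classifier (weights invented here).
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.5

def predict(x):
    """Return 1 if the linear score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1 if score > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def white_box_attack(x, eps):
    """White-box: use the known weights to pick the perturbation direction."""
    direction = -1 if predict(x) == 1 else 1
    return [xi + direction * eps * sign(w) for xi, w in zip(x, WEIGHTS)]

def black_box_attack(x, eps, max_queries=1000, seed=0):
    """Black-box: only query predict(), trying random +/-eps perturbations."""
    rng = random.Random(seed)
    original = predict(x)
    for queries in range(1, max_queries + 1):
        candidate = [xi + rng.choice((-eps, eps)) for xi in x]
        if predict(candidate) != original:
            return candidate, queries
    return None, max_queries  # no label flip found within the query budget

x = [1.0, 0.5, 1.0]
adv_white = white_box_attack(x, eps=0.5)
adv_black, queries_used = black_box_attack(x, eps=0.5)

print("original label:", predict(x))
print("white-box adversarial label:", predict(adv_white))
print("black-box adversarial label:", predict(adv_black), "after", queries_used, "queries")
```

The white-box attack succeeds in a single step because it knows the weights, whereas the black-box attack has to spend queries on trial and error.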

AML Attack Scenarios

During an AML attack, an adversary can manipulate either the collection of the data or its processing in order to compromise the target model. There are three main attack scenarios:

  • Evasion Attack: An adversary tries to evade the system by adjusting malicious samples during the testing phase.
  • Poisoning Attack: Also known as contamination of the training data. An adversary tries to poison the training data by injecting “poisonous” samples to compromise the whole learning process.
  • Exploratory Attack: The attacker tries to collect as much knowledge as possible about the learning algorithm and the patterns in the training data.
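
As a rough illustration of a poisoning attack, the sketch below trains a nearest-centroid classifier twice: once on clean data and once after an attacker injects a few mislabeled samples. The data points, the labels, and the helper names centroid, train, and predict are assumptions made for this example only.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(data):
    """Nearest-centroid 'training': store one centroid per label."""
    by_label = {}
    for features, label in data:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], x))

# Clean training data, invented for illustration.
clean_data = [
    ((1, 1), "benign"), ((1, 2), "benign"), ((2, 1), "benign"), ((2, 2), "benign"),
    ((6, 6), "malicious"), ((6, 7), "malicious"), ((7, 6), "malicious"), ((7, 7), "malicious"),
]

# Poison: samples that sit deep in "malicious" territory but carry a "benign" label.
poison = [((9, 9), "benign")] * 4

target = (5, 5)  # a sample the attacker wants to be accepted as benign

clean_model = train(clean_data)
poisoned_model = train(clean_data + poison)

print("clean model:", predict(clean_model, target))
print("poisoned model:", predict(poisoned_model, target))
```

The injected points drag the "benign" centroid toward the target, so the poisoned model accepts a sample that the clean model rejects.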

Hardening Machine Learning Systems

The good news is that, although adversarial machine learning attacks are numerous, new and effective ML techniques are being developed to counter them. The following strategies are used to defend against AML and harden systems.

  • Robustness: The defender generates multiple adversarial examples and augments the training data with these perturbed samples while training the target model, an approach often called adversarial training.
  • Feature Squeezing: The defender reduces the complexity of the data representation so that adversarial perturbations disappear because of the lower sensitivity.
  • Defense-GAN: The defender leverages the power of Generative Adversarial Networks (GANs) to reduce the effectiveness of adversarial perturbations. GANs are a clever way to train a generative model by framing the problem as a supervised learning problem with two sub-models: the generator model is trained to generate new examples, and the discriminator model attempts to classify examples as real or generated. The two models are trained together in an adversarial zero-sum game until the discriminator model is fooled about half of the time, which means that the generator model produces plausible examples.
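
Of the strategies above, feature squeezing is the simplest to sketch. The example below is a minimal illustration under stated assumptions: features lie in [0, 1], the victim is an invented classifier that thresholds the mean feature value, and squeeze, predict, and is_suspicious are hypothetical names. Squeezing quantizes every feature to a few coarse levels, so a small perturbation is rounded away; a disagreement between the predictions on the raw and the squeezed input can also be used to flag a suspicious sample.

```python
THRESHOLD = 0.51  # decision threshold of the hypothetical model

def predict(x):
    """Hypothetical model: label 1 if the mean feature value exceeds THRESHOLD."""
    return 1 if sum(x) / len(x) > THRESHOLD else 0

def squeeze(x, levels=5):
    """Quantize each feature in [0, 1] to one of `levels` evenly spaced values."""
    step = 1.0 / (levels - 1)
    return [round(min(max(xi, 0.0), 1.0) / step) * step for xi in x]

def is_suspicious(x):
    """Flag inputs whose label changes once small details are squeezed out."""
    return predict(x) != predict(squeeze(x))

clean = [0.22, 0.72, 0.22, 0.72]
# Adversarial copy: a small, uniform nudge that pushes the mean over the threshold.
adversarial = [xi + 0.05 for xi in clean]

print("clean:", predict(clean), "suspicious:", is_suspicious(clean))
print("adversarial:", predict(adversarial), "suspicious:", is_suspicious(adversarial))
print("adversarial after squeezing:", predict(squeeze(adversarial)))
```

In this toy setting the perturbed input flips the raw model's label, but after squeezing it maps to the same coarse representation as the clean input, so the label is restored and the sample is flagged.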


Despite their high accuracy and effectiveness, machine learning algorithms have been found vulnerable to perturbations that can have disastrous consequences. Given the widespread use of ML in real-world and sensitive applications, we need machine learning algorithms that perform reliably even in an adversarial environment. While a variety of defenses against AML have been researched, more work is needed before machine learning can reach its full potential in improving cybersecurity.
