By: Pierluigi Riti
September 16, 2020
Introduction to Deepfake
As the term suggests, a deepfake is a form of synthetic media in which a person in an existing video or picture is replaced with someone else's likeness. Creating one has become very easy: for only a couple of dollars per hour, a GPU can be rented to train the network and build a deepfake. Deepfakes have become a huge problem in recent years. A 2018 BBC article shows how deepfake porn can have serious implications. This raises the question: how is a deepfake made? In this article, we try to answer that basic question and explore the complexity of crafting a deepfake.
The Risk of Deepfake
To better understand the impact of deepfakes, we can take a look at this CNN article: https://edition.cnn.com/interactive/2019/01/business/pentagons-race-against-deepfakes/.
The article shows the power and the danger of deepfake images and, at the same time, how difficult it is to identify a deepfake by sight. In recent years, deepfake techniques have been used to create fake videos of politicians. A recent case was the video released by a Belgian political group: https://tube.rebellion.global/videos/watch/2ad12b6b-bb53-473c-ad74-14eef02874b5?title=0&warningTitle=0. The video is a fake, and it shows the destructive potential of this technology. At the heart of a deepfake video is Machine Learning, in particular the increasingly popular Generative Adversarial Network (GAN), introduced in 2014 by Ian Goodfellow as part of his Ph.D. study. According to a study by the startup https://sensity.ai/ at the end of 2019, there were over 14 thousand deepfake videos on the internet, 96% of them pornographic.
What is a GAN?
GANs are one of the most interesting evolutions of Machine Learning. A conventional classification network can only label an existing object, such as an image, whereas a GAN can generate new content. A GAN uses two Neural Networks pitted against each other: one is called the "Generator," the other the "Discriminator." GANs can be used for unsupervised and semi-supervised learning. The paper "Generative Adversarial Networks: An Overview" offers an interesting analogy for how a GAN works: we can think of the Generator as a forger and the Discriminator as an expert. The Generator creates a replica of the original and then asks the Discriminator to judge whether the image is real or fake. One important point to understand about GANs is that the Generator doesn't have access to the original image; only the Discriminator does. The Discriminator produces an error signal when it detects a difference between the original and the fake. The Generator uses that error to correct the generated image. This feedback allows the Generator to "learn" from its mistakes and produce a better-quality image.
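The forger-and-expert loop can be sketched in a few lines of plain Python. This is only a toy under stated assumptions, not how a real deepfake network is built: the "images" are single numbers, the Generator is a linear function `G(z) = a*z + b`, the Discriminator is a logistic unit `D(x) = sigmoid(w*x + c)`, and the real data distribution, learning rate, and the function name `train_toy_gan` are all illustrative choices. The point is the flow of information: only the Discriminator sees real samples, and the Generator is updated solely from the Discriminator's gradient.

```python
import math
import random

def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate Discriminator and Generator updates on 1-D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # Generator G(z) = a*z + b (the "forger")
    w, c = 0.1, 0.0   # Discriminator D(x) = sigmoid(w*x + c) (the "expert")
    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)   # assumed "real" data; never shown to G
        z = rng.gauss(0.0, 1.0)      # random noise fed to the Generator
        fake = a * z + b

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w += lr * ((1.0 - d_real) * real - d_fake * fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(fake), using only the
        # error signal that flows back through the Discriminator.
        d_fake = sigmoid(w * fake + c)
        grad_fake = (1.0 - d_fake) * w   # d log D(fake) / d fake
        a += lr * grad_fake * z
        b += lr * grad_fake
    return a, b, w, c

if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print("generator:", a, b, "discriminator:", w, c)
```

Real GANs replace both linear models with deep convolutional networks and rely on a framework's automatic differentiation, but the alternating two-step update is the same idea.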
Deepfake Generation and Detection
As deepfakes become a serious problem, many scientists and engineers are looking for ways to detect them. Two very interesting articles on this subject are "CNN-generated images are surprisingly easy to spot... for now" and "Disrupting Deepfakes: Adversarial Attacks Against Conditional Image Translation Networks and Facial Manipulation Systems".
The two articles present different techniques and procedures for deciding whether an image is real or fake. They also describe the different architectures used to build a deepfake system; the most commonly used are ProGAN, StyleGAN, BigGAN, CycleGAN, StarGAN, and GauGAN. All of these networks need to be trained, and the main goal of detection is to find the specific fingerprints that reveal a fake image. The difficulty lies in training a network for "general" purposes, because a network is typically trained to use and synthesize one specific set of images. For example, a network trained to recognize and generate fake images of cars can perform poorly when we try to use the same GAN to generate a deepfake of a face. In addition, most networks are trained to modify or add one specific part of the image; for example, they can only add a smile to a face.
Any network architecture used to generate a deepfake image can have weaknesses, and these can be used to determine whether the image is real. For example, an image created with StyleGAN often contains a blob artifact, which is a sign of an artificial image. This issue was fixed in StyleGAN2, but the images still have some flaws. Most of these cannot be spotted by the naked eye but can be found by analyzing the image at the pixel level.
Making a perfect image is not always easy, or even possible. When two images are mixed, the lighting in one can differ from the other, so the software needs to adjust the luminosity to make the two images similar. To generate a deepfake, the GAN starts with an original image and merges it with another image. The Generator needs to learn how to match the luminosity of the two images as the software blends one into the other. This technique can leave artifacts, in particular in the luminosity of the image. A different approach is to identify the landmarks of the face in the image and then fit the new face onto the old one; here too, the software needs to adjust the luminosity and other values to generate the new image. Once the new face is positioned, a Gaussian blur is applied to the new image to smooth the seams where the two images meet.
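The luminosity adjustment and the Gaussian smoothing of the seam can be illustrated with a small, dependency-free sketch. It makes several simplifying assumptions: images are grayscale lists of rows with values from 0 to 255, the face region is given as a non-empty 0/1 `mask`, luminosity is matched by a plain mean-brightness shift, and the function names (`match_luminosity`, `blend`, and so on) are hypothetical. Production tools typically use an image library and more sophisticated color transfer.

```python
import math

def gaussian_kernel(radius, sigma):
    # 1-D Gaussian weights, normalised so they sum to 1.
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [wt / total for wt in weights]

def blur_rows(img, kernel):
    # Convolve every row with the kernel, clamping indices at the borders.
    r = len(kernel) // 2
    width = len(img[0])
    return [[sum(kernel[k + r] * row[min(max(x + k, 0), width - 1)]
                 for k in range(-r, r + 1))
             for x in range(width)] for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, radius=2, sigma=1.0):
    # A 2-D Gaussian blur is separable: blur the rows, then the columns.
    kernel = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def match_luminosity(src, dst, mask):
    # Shift src so its mean brightness inside the mask equals dst's.
    # Assumes the mask contains at least one non-zero pixel.
    pts = [(y, x) for y, row in enumerate(mask)
           for x, m in enumerate(row) if m > 0]
    offset = sum(dst[y][x] - src[y][x] for y, x in pts) / len(pts)
    return [[min(255.0, max(0.0, v + offset)) for v in row] for row in src]

def blend(src, dst, mask):
    # Paste src over dst, using the blurred mask as a per-pixel alpha so the
    # contact points between the two images fade smoothly into each other.
    src = match_luminosity(src, dst, mask)
    alpha = gaussian_blur(mask)
    return [[alpha[y][x] * src[y][x] + (1.0 - alpha[y][x]) * dst[y][x]
             for x in range(len(dst[0]))] for y in range(len(dst))]
```

Blurring the mask rather than the whole face keeps the interior of the pasted region sharp, but the feathered border is still a softened band of pixels, which is one reason such seams can show up in a forensic analysis.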
The images generated by GANs are not, for now, perfect. There are some basic "mistakes" that can be easily identified during a forensic analysis. The first problem we can encounter is blurriness: the generated image is unnaturally blurry because it had to be blurred to smooth the border. Another cause is training the network on low-resolution images. Another problem can be skin tone; it appears when we try to mix faces with different skin tones, and it can be easily spotted. Other issues are mostly connected with wrong landmarks or shadowing. For example, we can end up with an image that shows doubled eyebrows or a double chin. All of these issues can be easily spotted with image analysis.
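As one illustration of how unnatural blur might be flagged automatically, the sketch below computes the variance of a simple Laplacian filter, a common sharpness measure in image processing: sharp images have strong edge responses and therefore a high variance, while blurred ones score low. This is a generic heuristic, not the method of the articles cited above. The grayscale list-of-rows format, the function names, and especially the `threshold` value are assumptions that would need tuning on real data.

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over the interior pixels."""
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
        - 4 * img[y][x]
        for y in range(1, h - 1) for x in range(1, w - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def looks_blurry(img, threshold=100.0):
    # The threshold is a hypothetical value chosen for illustration only.
    return laplacian_variance(img) < threshold

if __name__ == "__main__":
    # A perfectly flat patch has no edges at all; a checkerboard is all edges.
    flat = [[128] * 8 for _ in range(8)]
    checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
    print(looks_blurry(flat), looks_blurry(checker))
```

In practice the measure would be computed on the face region and compared with the rest of the frame, since a face that is noticeably softer than its surroundings is more suspicious than a uniformly soft image.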
Deepfake images and videos are becoming a serious concern, especially when they are used for political influence. Governments worldwide are starting to create task forces to fight deepfakes and, of course, the fake news connected with the fake images. Deepfake images are also becoming an area of concern for cybersecurity and forensic experts, who need to determine correctly whether an image is genuine during an investigation. Creating a deepfake video today is quite simple: it is possible to rent a GPU for a couple of dollars per hour and create a simple deepfake video. At the moment, GANs and their architectures have some weaknesses; this gives engineers and scientists the opportunity to create software that analyzes images and verifies their authenticity.