Human beings tend to believe only what they can see with their own eyes or hear with their own ears. This, of course, is overridden in certain aspects of life – one prominent example being religion. In all other facets of life, seeing is quite literally believing. It is, therefore, an unsettling thought that what you see and believe could be altered or challenged. Imagine not having the confidence to trust your own eyes and ears. This is the problem facing today’s media consumers, for whom discerning what’s real from what’s fake is becoming increasingly difficult. Enter deepfakes…
Deepfakes are a far cry from traditional methods of photo or video manipulation – a decent computer and Photoshop skills don’t make the cut. This is purely machine learning and artificial intelligence at work. Gone are the days when AI was just a buzzword used to sell tech products. It now conjures up realistic faces and very believable videos – except that none of it is actually true.
History of Deepfakes
Editing or doctoring photographs and footage, mostly for nefarious purposes, has been going on for quite a while. Deepfakes, however, employ ML and AI to near-perfect the result, producing fakes potentially indistinguishable from the real thing to the human eye. By definition, deepfakes – a word coined in 2017 on the subreddit of a similarly named user – combine and superimpose existing images and video onto other image and video sources using a machine learning technique called a Generative Adversarial Network, or GAN. It is a technique for human image synthesis based on AI.
The advent of deepfakes is often associated with Reddit, where it was mainly centered around pornography. This entailed superimposing various celebrities’ faces onto pornographic videos – the use that first popularised deepfakes. One such victim was Wonder Woman’s Gal Gadot. Samantha Cole’s pointedly worded article on Vice highlighted the terrifying implications of the algorithm: the video portrayed Gal Gadot by swapping her face onto another actress’s body. Cole noted that the ML algorithm was built from open-source code that anyone with an interest and a good knowledge of deep learning could assemble. The creator went by the username ‘deepfakes’, after whom the technique was named. Deepfakes told Cole they had used TensorFlow, Google’s freely available ML framework, to create the video. At first glance, the images look believable; only on closer inspection do oddities on the face become apparent during motion. This, however, was a very believable first attempt – and they have only gotten better.
There were also more humorous deepfakes during that period, which imposed Nicolas Cage’s face on memes and cast him as the lead in popular films, making him an instant internet sensation. This was the tiniest of positives from an otherwise jarring phenomenon, and the warning bells started ringing from the onset.
The Technology Behind Deepfakes
Artificial intelligence has come a long way from its earliest depictions in sci-fi movies to the tech marvel it is today. Just as Skynet was a looming threat in the Terminator movies, AI poses a similar danger in this age if mishandled.
Speaker of the US House of Representatives Nancy Pelosi fell victim to a video that made her appear drunk by slurring her speech. The video was shared millions of times across various social media platforms, and a debate ensued about the impact of fake media, and deepfakes in particular. In this case, however, no deepfake was involved: the video was simply slowed down and its pitch adjusted to give off a drunk vibe – old-school video doctoring.
Deepfakes, in their sophistication, combine two algorithms that make up the GAN mentioned earlier: the generator and the discriminator. The premise is simple but brilliant. The discriminator is trained on a data set where it performs categorization (real or fake). It learns the attributes of anything categorized as fake – say, sharper edges around doctored areas or contrast differences. As new media is run through the algorithm, it scores each item on how likely it is to be fake according to what it has learned; a larger data set yields better categorization. On the flip side, the generator works the other way: it tries to generate media that will not be classified as fake and runs its output through its companion algorithm. This quickly turns into both a competition and a self-improving cycle, where the discriminator gets better at spotting fakes, and the generator must constantly improve at producing undetectable ones.
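The adversarial cycle can be sketched in miniature. The toy below is an illustrative assumption, not code from any real deepfake system: the “real data” is just a 1-D Gaussian, the generator is a linear map of noise, and the discriminator is logistic regression – yet the generator/discriminator tug-of-war described above is the same.

```python
import numpy as np

# Toy 1-D GAN illustrating the generator-vs-discriminator cycle.
# All modelling choices (linear generator, logistic discriminator,
# target distribution N(3, 1)) are illustrative assumptions.

rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Generator: maps noise z ~ N(0, 1) to fake samples x = w*z + b.
w, b = 1.0, 0.0
# Discriminator: D(x) = sigmoid(a*x + c), the probability that x is real.
a, c = 0.1, 0.0

lr, batch = 0.02, 64
real_mean = 3.0  # the "real" data is drawn from N(3, 1)

for step in range(2000):
    # --- Discriminator update: learn to tell real from fake ---
    x_real = rng.normal(real_mean, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = w * z + b
    d_real = sigmoid(a * x_real + c)
    d_fake = sigmoid(a * x_fake + c)
    # Gradient ascent on log D(real) + log(1 - D(fake))
    a += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # --- Generator update: fool the freshly updated discriminator ---
    z = rng.normal(0.0, 1.0, batch)
    x_fake = w * z + b
    d_fake = sigmoid(a * x_fake + c)
    # Gradient ascent on log D(fake) (the "non-saturating" loss)
    upstream = (1 - d_fake) * a
    w += lr * np.mean(upstream * z)
    b += lr * np.mean(upstream)

fake_mean = b  # E[w*z + b] = b, since E[z] = 0
print(f"generated mean is near {fake_mean:.2f} (real mean is {real_mean})")
```

After training, the generated mean drifts toward the real mean of 3: the generator has learned to produce samples the discriminator can no longer confidently reject.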
The cycle can be seen as a trial-and-error loop driven by deep learning. While the ‘well-trained’ AI does most of the heavy lifting, anyone wishing to create a deepfake still has a part to play. The type of video determines the data sets required for a good result. A video with unaltered audio needs just a target video (preferably clear and clean) and two data sets: one of the original face and one of the new face. For a more convincing output that includes voice manipulation, one would need a target video plus data sets of the target’s face performing various actions, like smiling and talking. One would also need the target’s voice, either recorded or generated by an AI such as Lyrebird, where typed words are turned into speech. Lastly, one would need AI to lip-sync and tie the whole video together. The result would be a compelling video of someone saying something they never actually said in real life.
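The workflow just described can be outlined as a pipeline. Every function below is a stub standing in for a real tool (face extraction, GAN training, voice cloning, lip-sync) – the names and signatures are hypothetical, not any real library’s API – but the ordering of the stages matches the steps above.

```python
# Hypothetical outline of the face-swap workflow described above.
# All function names and signatures are illustrative stubs.

def extract_faces(video, label):
    """Stand-in for cropping face frames out of a clip."""
    return [f"{label}_face_{i}" for i in range(3)]

def train_swap_model(source_faces, target_faces):
    """Stand-in for training the GAN on the two face data sets."""
    return {"source": source_faces, "target": target_faces}

def swap_faces(model, target_video):
    """Stand-in for rendering the swapped-face video."""
    return f"swapped({target_video})"

def synthesize_speech(text):
    """Stand-in for a voice-cloning tool such as Lyrebird."""
    return f"audio({text!r})"

def lip_sync(video, audio):
    """Stand-in for the AI that ties video and fabricated audio together."""
    return f"lipsynced({video}, {audio})"

def make_deepfake(target_video, source_clip, script=None):
    src = extract_faces(source_clip, "source")
    tgt = extract_faces(target_video, "target")
    model = train_swap_model(src, tgt)
    video = swap_faces(model, target_video)
    if script is not None:  # voice manipulation is only needed for new audio
        video = lip_sync(video, synthesize_speech(script))
    return video

result = make_deepfake("speech.mp4", "interview.mp4", script="words never said")
```

When no `script` is supplied, the pipeline stops after the face swap – matching the simpler unaltered-audio case described above.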
How to Spot a Deepfake
An article on Spectrum details how researchers have demonstrated an algorithm to detect ‘imperceptibly altered’ media, with promising initial tests. However, there are questions about the relevance of such a tool today. Overreliance on a detection algorithm could work against its intended purpose: if a trusted tool were fooled, the result would be even more dangerous for anyone who falls victim. It is also worth noting that a detection system will only be as good as the data sets it was trained on. And with GANs, detection ultimately works against itself, as the generator will simply improve its output, making fakes even harder to detect and eventually fooling the discriminator.
The proposed algorithm is based on recurrent neural networks and works by splitting images into small patches and examining them pixel by pixel. Any discrepancy leads to the media being labelled a potential deepfake. While computers have this capability, the human eye cannot inspect media at the pixel level. The eye is therefore left to look for halo effects around faces or other small but discernible distortions and imperfections.
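The patch-level idea can be illustrated in a few lines. This is a deliberately simplified sketch, not the trained recurrent network described above: it only mimics the “split into patches, compare pixel by pixel” step by checking a suspect image against a reference and flagging any patch that deviates beyond a threshold.

```python
import numpy as np

# Simplified illustration of patch-level discrepancy detection.
# The real detector is a trained RNN; this sketch only mimics the
# split-into-patches, compare-pixel-by-pixel step.

def suspicious_patches(image, reference, patch=4, tol=10):
    """Return (row, col) grid indices of patches with pixel discrepancies."""
    flagged = []
    h, w = image.shape
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            a = image[r:r + patch, c:c + patch].astype(int)
            b = reference[r:r + patch, c:c + patch].astype(int)
            if np.abs(a - b).max() > tol:
                flagged.append((r // patch, c // patch))
    return flagged

# A pristine 8x8 grayscale image and a copy with one doctored patch.
clean = np.full((8, 8), 128, dtype=np.uint8)
doctored = clean.copy()
doctored[4:8, 0:4] += 40  # simulate a locally altered region

print(suspicious_patches(doctored, clean))  # → [(1, 0)]
```

A computer flags the doctored patch instantly at the pixel level – exactly the kind of discrepancy the human eye cannot resolve.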
Impacts and Implications of Deepfakes
Deepfakes, while funny when used for memes, can be devastating and image-tarnishing when used for wrongful implication or victimization. Sadly, the latter has been more prevalent. Mark Zuckerberg, Gal Gadot, Vladimir Putin and Barack Obama have all appeared in deepfake videos doing or saying things that were completely fabricated. While a trained, informed, and tech-savvy viewer may quickly see a deepfake for what it is, millions of other people will not, and will take whatever they see at face value. The implications can divide society along religious lines, promote racism and hatred toward certain sexualities, and even start political warfare – if not actual war between countries.
The viral video of Barack Obama was created by Jordan Peele in association with BuzzFeed in 2018 as a sort of PSA about the dangers of the technology. His fabricated remarks about Donald Trump mirror the many deepfakes that have since emerged of politicians and other public figures saying and doing things that never took place. The damage, in some cases, is irreversible. Legislative bodies have therefore been forced to take a stand on the matter, and the platforms where such media is distributed have often been called into question over their position on the issue. Facebook, Google, and Twitter have all been asked to review their individual policies on handling manipulated media. While these platforms may be willing to scrub manipulated media, identifying the media in question is the true headache.
On the positive side, this technology has the potential to revolutionize the media and entertainment industry by providing a powerful tool for creators. Scenes that would typically cost millions of dollars to create could be achieved faster with the help of AI. In the tech field, the technology can vastly improve the now-popular voice assistants like Siri, Alexa, and Google Assistant, making them truly interactive. Deepfakes can also be used to recreate old photos and to upscale and enhance images – the applications are numerous.
The problem lies in the regulation and application of the technology. In a few years it will be so far advanced that, depending on the use, the outcomes will be either absolutely magical or tragic.