How much should we be worried about deepfakes? What sort of threat do they pose to digital identity verification and the biometric technology we so depend on, and are there ways to combat this threat?
The deepfake threat
Deepfakes are manipulated videos or other digital representations produced by sophisticated artificial intelligence (AI), yielding fabricated images and sounds that appear real. While video deepfakes are arguably the most common, audio deepfakes are also becoming more widespread.
You've probably seen some of the best-known deepfakes in the public domain right now, especially those that manipulate existing footage of Obama and Tom Cruise. However, while this technology may seem playful on the surface, we should not overlook its potential dark side.
As we saw a few years ago, criminals used this technology to imitate the voice of a general manager and demand a fraudulent transfer of €220,000. This is just one example, but high-quality fraudulent deepfakes are now used much more regularly, and the technology keeps improving, especially as images, online videos and social media provide ever more source material to draw on.
Impact on biometric identities
Now think about this threat against the backdrop of the growing popularity of biometric technology and digital identity verification. Some government agencies today use voice recognition as proof of identity, while banks are using voice and facial recognition to register new users and facilitate online banking.
For example, HSBC recently revealed that telephone banking fraud has fallen by 50% since it introduced a biometric security system that authenticates customers by voice. The bank believes that this additional layer of security prevented £249 million of British customers' money from falling into the hands of criminals in the past year.
However, as the deepfake scam described above shows, cybercriminals have started using the technology to commit fraud, and there are now concerns that it can and will be used to create fake biometric identifiers that bypass biometric-based fraud prevention solutions.
So an obvious question is whether deepfakes are powerful enough to trick the biometric solutions that institutions such as banks and governments have become so dependent on.
The current limits of deepfake technology
The answer is: not currently, but we should still take steps to protect ourselves. I know that is not a very satisfactory answer, but it is probably the dose of reality that this debate needs.
First of all, we need to think about how biometric authentication works. Take the example of voice biometrics: a good fake voice (even just a good impersonator) can be enough to fool a human. However, voice biometric software is much better at identifying differences that the human ear cannot detect or chooses to ignore, meaning voice biometric identification can help prevent fraud if identity is verified by comparison against the enrolled voice. Even so-called deepfakes create a poor copy of someone's voice when analyzed digitally; they make some pretty convincing cameos, especially when combined with video, but again these are poor knockoffs at the digital level.
Beyond that, the ability of deepfakes to bypass biometrics-based solutions will ultimately depend on the type of liveness detection built into the solution. Liveness detection identifies whether the user is a real person, and the most basic forms require the user to blink, move their eyes, open their mouth, or nod their head.
However, these simple forms of liveness detection can be faked with deepfakes, as was recently seen in China, where cybercriminals bought high-quality facial images on the black market and used an app to manipulate the images and create deepfake videos in which the faces appeared to blink, nod, or open their mouths.
They then used a specially modified phone to hijack the mobile camera feed typically used to perform facial recognition checks. This allowed them to trick the tax billing system into accepting the pre-made deepfake videos, which were good enough to beat the liveness detection check even though no one was standing in front of the camera.
Fortunately, there is currently no known deepfake-based system that can generate a synthetic response that resembles the user and speaks random words or performs random movements correctly, with exact audiovisual synchronization, within the limited time available. Even if it were possible to build such a deepfake, it would require a tremendous amount of work for each attempt, making large-scale fraud impractical.
However, that does not mean the technology will not mature, and it leads us to the solution that will allow us to future-proof our biometric identities: multiple factors.
Fighting deepfakes: multiple factors
In any situation that uses a biometric solution, especially when it is used to prove identity, several factors should be used together. A combination of, say, voice, face, and a PIN is highly secure because, while it may be possible to tamper with any one factor, tampering with all three in the same attempt is practically impossible. Therefore, to secure our biometric identities against the deepfake threat, we must have the agility to evolve and add more or different factors as threats change and become more sophisticated or more widely available.
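The arithmetic behind combining factors can be sketched as follows. This is only an illustration: the per-factor spoof rates are hypothetical numbers chosen to show the calculation, and the sketch assumes the factors fail independently, which a real attacker may not respect.

```python
import math

def joint_spoof_probability(per_factor_probabilities):
    """Chance that every factor is defeated in one attempt, assuming independence."""
    return math.prod(per_factor_probabilities)

# Hypothetical per-attempt spoof rates, not measurements of any real system.
factors = {"voice": 0.05, "face": 0.05, "pin": 0.0001}

if __name__ == "__main__":
    for name, p in factors.items():
        print(f"{name} alone: {p}")
    print("all three together:", joint_spoof_probability(factors.values()))
```

Under these assumed figures, the combined probability is several orders of magnitude smaller than that of the weakest single factor, which is the intuition behind layering factors.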
An additional factor that is very difficult to simulate is time (for example, having to provide an answer to a dynamic question that is unique at the time it is asked). This may involve speaking a unique server-side generated word or number that cannot be predicted, as well as making a specific movement or facial expression on request at the time of verification.
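A minimal sketch of such a time-bound challenge might look like the following. The word list, the 10-second window, and the function names are all assumptions made for illustration. The sketch also assumes that a speech recognizer has already turned the caller's audio into text, and that voice and face matching happen in a separate step.

```python
import secrets
import time

# Hypothetical vocabulary; a real deployment would use a far larger list.
WORDS = ["amber", "bridge", "candle", "delta", "ember", "falcon", "garnet", "harbor"]
CHALLENGE_TTL_SECONDS = 10  # assumed response window

def issue_challenge(num_words=3, num_digits=4, now=None):
    """Create an unpredictable phrase to be spoken, plus its expiry time."""
    issued_at = time.time() if now is None else now
    words = [secrets.choice(WORDS) for _ in range(num_words)]
    digits = "".join(secrets.choice("0123456789") for _ in range(num_digits))
    return {
        "phrase": " ".join(words + [digits]),
        "expires_at": issued_at + CHALLENGE_TTL_SECONDS,
    }

def response_is_valid(challenge, spoken_text, now=None):
    """Accept only the exact phrase, and only before the challenge expires."""
    current = time.time() if now is None else now
    if current > challenge["expires_at"]:
        return False
    normalized = spoken_text.strip().lower()
    # Constant-time comparison of the transcript against the issued phrase.
    return secrets.compare_digest(normalized.encode(), challenge["phrase"].encode())
```

Because the phrase comes from a cryptographically secure random source on the server and expires quickly, a recording prepared in advance cannot contain it.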
An action-based factor (in addition to voice and facial biometrics) of “what you do” and – more importantly – “what you are told to do” is incredibly difficult to fake. It is unlikely that a deepfake attack that passes the biometric check will be able to replicate a required action, as such predetermination is not possible with the processing power available today. But what would that look like in action?
Imagine that you are a bank and a customer calls to transfer a large sum of money, a situation in which it is vital to authenticate whoever is on the other end of the phone and link them to the transaction. By requiring the customer to read out a unique alphanumeric sequence, generated at that moment from the particular transaction and its associated metadata, the result would be a combination of liveness detection, biometric voice verification, and proof linking that person to a specific time and event.
Importantly, even if there were a high-quality impersonation attempt via a deepfake, the unique server-side generated statement means that a pre-prepared deepfake would not be useful, nor could it adapt quickly enough.
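One way a transaction-bound sequence like this could be derived is sketched below. It is an assumption rather than a description of any particular bank's system: it uses an HMAC over the transaction details and a fresh random nonce, and the field names, key handling, and 8-character length are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets

def transaction_challenge(server_key, transaction, nonce, length=8):
    """Derive a short code bound to one transaction and one random nonce."""
    # Canonical JSON so the same transaction always yields the same bytes.
    payload = json.dumps(transaction, sort_keys=True, separators=(",", ":")).encode()
    digest = hmac.new(server_key, nonce + payload, hashlib.sha256).digest()
    # Base32 uses only A-Z and 2-7, which is easy to read aloud.
    return base64.b32encode(digest).decode("ascii")[:length]

if __name__ == "__main__":
    key = secrets.token_bytes(32)    # secret held only by the server
    nonce = secrets.token_bytes(16)  # fresh for every verification attempt
    tx = {"amount": "25000.00", "currency": "GBP", "payee": "hypothetical-payee"}
    print("Please read aloud:", transaction_challenge(key, tx, nonce))
```

The server can recompute the code from the same key, nonce, and transaction to check what the caller said. Changing the amount or payee changes the code, so audio captured for one transaction should not validate another.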