AI and Social Engineering
Update
About four months after this article was published, an eerily similar attack happened in Hong Kong, as reported by CNN.
Imagine the following scenario. You are the financial controller of a company and receive a late-night call to join a video chat. Upon joining, you are met by the CFO, CEO, and legal counsel, along with a few unfamiliar individuals. Your CEO asks you to transfer 10 million dollars to an unknown account. As you observe the call, you see a discussion unfolding around the acquisition of a large company, with an intense debate between several of the participants. Out of respect for the serious nature of the meeting, you make the transfer without interrupting. An hour later, you receive a call from your CEO asking why 10 million dollars were transferred out of the account. You explain what happened, and they tell you that they were out at dinner for the last three hours and never invited you to any video call. This seemingly outlandish scenario is a fast-approaching reality thanks to the generative AI models that have evolved over the last several years.
A looming question that has been on my mind since at least 2020 is how AI technology will impact social engineering attacks, and today we have reached a critical threshold where the threat is imminent. Social engineering is among the most prevalent types of cyber attack, so any technological advancement that makes this family of attacks more effective has wide-reaching repercussions. AI that generates lifelike text, audio, images, and video is a powerful tool that can be used positively, but the opportunity to abuse it is ever-present. How does AI make social engineering attacks more of a concern?
For starters, let’s focus on email phishing, which is a good illustration of the effects of AI on this type of attack. Phishing is one of the most common ways a security breach starts and, despite major efforts to fight it, remains one of the most widespread types of cyber attack. AI models make it much easier for attackers to craft convincing messages: it is now trivial to produce text that is free of spelling errors, sounds like it was written by a native speaker, and even emulates the writing style of individuals whose writing is publicly available. All of this is just a prompt away with any of the currently available large language models (LLMs). The victim’s ability to discern phishing emails is weakened, because AI helps attackers eliminate the telltale indicators that something is off: awkward phrasing, spelling errors, and other giveaways that a message is not authentic.
Things get worse when one realizes that AI technology can generate far more than text; there are models for generating audio, images, and video as well. This means attackers can leverage the technology to execute much more advanced social engineering attacks than were possible in the past. An attacker can generate voice audio with as little as a few minutes of recordings of an individual speaking, which is enough to train a model to synthesize speech that sounds like them. Similarly, given video or images of an individual, one can use AI models to generate video and images of that person doing arbitrary things. There have already been multiple recorded cases of attackers using AI-generated content to extort victims for ransom, and this type of attack is still in its infancy. In one recent case, the attacker used AI-generated audio that sounded like the victim’s daughter and played it over a phone call, claiming to have kidnapped her and demanding a ransom for her release.
One limiting factor that currently constrains this type of attack is the ability to generate audio and video in real time, but the technology is advancing quickly, and it’s only a matter of time until an attacker can join a video chat platform and, instead of streaming real camera and microphone feeds, stream AI-generated media to the person on the other end. Even now, a crafty attacker can pre-generate voice recordings, and even video recordings, for use in a social engineering attack. Once real-time synthesis of audio and video becomes broadly available, the severity and success rate of social engineering attacks are likely to increase drastically.
We are already starting to see attacks in which malicious actors generate audio and video to trick someone, usually into transferring funds to the attacker. Try to imagine all the situations where we depend on the way someone looks or sounds to verify their identity. We are not prepared for the impending wave of social engineering attacks in which the attacker can create compelling media. Compounding the problem, the growing sophistication of these attacks comes not only from media generation but also from the attacker’s ability to ask an AI model how to refine their strategy. An attacker can use data gathered from past attacks, both failed and successful, to train a model that improves its own efficacy.
Advanced attackers will likely build semi-autonomous agents which carry out attacks on their behalf: automating actions that would otherwise be performed by people, orchestrating attacks across multiple touch points, multiple individuals, and different channels, and looking up information mid-interaction to produce better responses. In other cases, they may hyper-focus on a single target and execute a much more targeted, hands-on attack. It’s likely that these attacks will quickly reach such a level of maturity and sophistication that their success rate will skyrocket. Hopefully we will be able to react and preemptively develop defense mechanisms against them.
What Can We Do About It?
It’s easy to see how AI will increase the threat of social engineering attacks, but what can be done to fight back? There is a lot of work to be done in this area, but below are some suggestions on how to frame the risk.
It is becoming increasingly difficult both to verify someone’s identity and to verify the authenticity of information in general. The highest-risk systems, which would benefit most from bolstering resistance to AI-based social engineering attacks, are those controlling technologies that can have global repercussions or cause large financial loss: nuclear weaponry, nuclear power plants, financial institutions, biological laboratories, and any other organization or system whose controls rely on human identity as a mechanism for decision making. One could enumerate endless hypothetical scenarios in which a creative individual leverages AI to impersonate an authorized person and compromise a system, but that exercise is left to the reader.
Regarding mitigating controls, it is no longer sufficient, and arguably never was, to rely on human judgment to verify whether an individual is who they claim to be based on a phone call, video interaction, or email. Instead, a more robust and reliable mechanism needs to be leveraged. One such mechanism is cryptographic signatures, along with other cryptography-based schemes that let someone prove who they are by demonstrating control of a “secret” that only they have access to. Of course, this method is only reliable to the extent that the cryptographic material used for verification is adequately secured, but that is becoming more of a solved problem with the proliferation of Trusted Platform Modules (TPMs), which are available in many end-user devices such as computers and phones, as well as smart cards, which can act as personal Hardware Security Modules (HSMs). More recently, passkeys, the latest iteration of FIDO-based protocols, have been implemented broadly and made available via password managers. This is a helpful step toward making asymmetric cryptography more readily available as an authentication mechanism that is integrated with existing computer systems and can hopefully be leveraged in more scenarios more easily. Additionally, PGP has been available for a long time now, so for those who need a more reliable way to verify identities today, it is a great option.
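To make the idea concrete, here is a minimal sketch of signature-based challenge-response verification in Python, using Ed25519 via the widely used cryptography package. The flow and names are illustrative assumptions rather than any particular product’s protocol; in practice the private key would live inside a TPM, smart card, or passkey provider rather than in application memory.

```python
# Minimal sketch of challenge-response identity verification.
# Requires: pip install cryptography
# Illustrative only: key handling is simplified for clarity.
import os

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# One-time setup: the person being verified generates a key pair and
# shares the public key out of band (e.g., in person or via a trusted
# directory). In a real deployment the private key would stay inside
# a TPM, smart card, or passkey provider.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

# Verification time: the verifier issues a fresh random challenge so
# that a previously recorded signature cannot be replayed.
challenge = os.urandom(32)

# The claimed individual signs the challenge with their private key.
signature = private_key.sign(challenge)

# The verifier checks the signature against the known public key.
try:
    public_key.verify(signature, challenge)
    print("Signature valid: the signer controls the private key.")
except InvalidSignature:
    print("Signature invalid: identity not verified.")
```

The key property is that a fresh challenge is signed on every verification, so a replayed recording, AI-generated or otherwise, proves nothing; only the holder of the private key can produce a valid signature, regardless of how convincing a voice or face on the other end appears.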
One challenge is that the user experience and the general understanding of cryptographic systems make widespread adoption hard, and even though the problem is clear, it will take time to convince organizations that this type of attack is severe enough to be worth defending against. It is likely we will see many disastrous attacks at scale before measures commensurate with the risk are put in place.