Money laundering is one of the most notorious techniques for turning black money into white money. Although financial institutions follow laws and regulations designed to prevent it, conventional methods struggle to stop money laundering in a digital age where every transaction is recorded by financial software.
As a result, financial institutions are adopting new technologies such as Machine Learning (ML) to combat money laundering. But what if fixing one significant issue creates another, such as an attack on the data that multiple financial institutions have pooled to build a capable ML model?
At the 6th International Workshop on Big Data Analytics for Cyber Crime Investigation and Prevention, in Osaka, Japan, we presented our solution for AML (Anti-Money Laundering) operations using AI/ML running inside a Trusted Execution Environment (TEE). More details on the workshop agenda and our paper can be found here.
What is Money Laundering?
The purpose of money laundering is to convert illegal funds into a source of wealth that is recognized as legal. It is a crime often undertaken by terrorists, drug dealers, corrupt politicians, and other organized criminals, and it allows them to use illegally obtained funds in the legal economy. Money laundering is a significant part of the problem of organized crime. It comprises the following stages:
- Placement: Placing the black money in lawful institutions such that it appears genuine and legal.
- Layering: Disconnecting the illegal funds from the illegal activity that generated them using fake transactions, shell companies, transfers between different jurisdictions and offshore accounts.
- Integration: After traveling all around, the illegal funds finally reach the criminal and can be used again in unfair practices.
According to United Nations estimates, the value of money laundered worldwide is between 2% and 5% of the world’s GDP. That is $800 billion to $2 trillion laundered annually. So, yes, this is worrisome.
Anti-money laundering (or financial AML) relates to the web of laws, rules, and procedures aimed at uncovering efforts to mask illicit funds as legitimate income.
Money laundering is a huge problem for banks and other financial institutions today. Institutions that fail to prevent it often pay a hefty price in the form of reduced revenues, customer loss, heavy penalties, and loss of reputation.
Most financial institutions follow a system of rules and procedures aimed at acquiring information about their clients and their activities. However, money launderers have devised ways to hide their activities that a standard rule-based system may not be able to detect.
This results in suspicious transactions going undetected or unreported, which risks non-compliance with AML laws and may lead to a loss of money and reputation. It is therefore of utmost importance for banks to establish a reliable set of controls that lets them spot suspicious financial activities and transactions even when launderers are trying their best to bypass compliance. One of the smartest and most effective ways to do this is by using Artificial Intelligence (AI) and Machine Learning (ML).
Why not the Traditional process of Anti-Money Laundering (AML)?
As shown in Figure 1, the traditional algorithmic process scans transactions for matches against predetermined criteria based on pre-defined rules. Typical rules include whether transaction activity matches previously paired activities, or whether the user “might be a bot.” These rules fall into a few categories, such as anomalies in behavior, transaction patterns, hidden relationships, and high-risk entities.
Once the scan is complete, alerts are generated based on the rule set, and the institution's investigation team starts reviewing them manually. After review, each alert is either closed as a “false positive” (i.e., a legal transaction flagged as fraudulent) or reported as suspicious.
Figure 1: Traditional Anti-Money Laundering decision making process.
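The rule-based scan described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Transaction` fields, the two rules, the threshold, and the country codes are all hypothetical and do not come from any real AML rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    account: str
    amount: float
    country: str


# Hypothetical rule parameters, chosen only for illustration.
LARGE_AMOUNT_THRESHOLD = 10_000.0
HIGH_RISK_COUNTRIES = {"XX", "YY"}


def large_amount(tx: Transaction) -> bool:
    """Flag any transaction at or above a fixed amount."""
    return tx.amount >= LARGE_AMOUNT_THRESHOLD


def high_risk_country(tx: Transaction) -> bool:
    """Flag any transaction involving a listed jurisdiction."""
    return tx.country in HIGH_RISK_COUNTRIES


# The hard-coded rule set: rule name -> predicate.
RULES = {
    "large_amount": large_amount,
    "high_risk_country": high_risk_country,
}


def scan(transactions):
    """Return (tx_id, [names of matched rules]) for every flagged transaction."""
    alerts = []
    for tx in transactions:
        hits = [name for name, rule in RULES.items() if rule(tx)]
        if hits:
            alerts.append((tx.tx_id, hits))
    return alerts


transactions = [
    Transaction("t1", "acc-1", 120.0, "DE"),
    Transaction("t2", "acc-2", 25_000.0, "DE"),  # e.g. a legitimate house deposit
    Transaction("t3", "acc-3", 900.0, "XX"),
]

alerts = scan(transactions)
for tx_id, hits in alerts:
    print(tx_id, hits)
```

Every alert produced this way still has to be reviewed by a person. In this toy data set, `t2` could well be a perfectly legal payment, which is exactly the kind of false positive a fixed threshold produces.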
The major drawback of the traditional approach is its dependence on hard-coded rules to mark a transaction as fraudulent, which makes it likely that many of the alerts it generates are false positives. To analyze these alerts, financial institutions need a substantial, highly trained, and expensive workforce to run money laundering detection operations. Every alert must be carefully studied, as any non-compliance can lead to large financial penalties and reputational damage.
Because fraudsters are rapidly developing new methods and strategies for money laundering, there is an urgent need to continually assess transactions and external factors to discover new scenarios. Any delay in updating the detection method can leave the system unable to notice suspicious transactions, resulting in non-compliance with regulator-mandated criteria.
Even when new scenarios are identified, capturing them requires rigorous human labor, and there is still a risk that certain events will be ignored or incompletely captured owing to a lack of comprehensive data analysis.
Despite the banks' best efforts, there are still a lot of questionable transactions that go unnoticed. Therefore, banks are moving away from the old strategy and toward sophisticated methods incorporating artificial intelligence and machine learning (AI/ML) enabled technology for transaction monitoring.
Why do we need Confidential Computing in AML?
Imagine, while aggregating the transactional data to build an accurate and efficient machine learning model from the combined data of different financial institutions, an attacker tries to steal the extremely sensitive transactional data of people around the world.
That would be a disaster, and banks that were already struggling to deal with money laundering would also come under scrutiny for not adhering to the EU General Data Protection Regulation (GDPR), one of the world’s most stringent privacy and security laws.
Now, you must be thinking, how can the data get stolen? Banks have highly protected systems, and the data is encrypted. However, encrypting data at rest in a database on a server and in transit across public or private networks is not sufficient to protect sensitive information.
The real problem arises when previously encrypted sensitive data must be decrypted for processing. The application has to see the data to analyze it, and in unencrypted form the data becomes a potentially vulnerable target for an attacker.
Confidential Computing provides an extra layer of security for sensitive data while in use. The technology isolates the sensitive data and code during runtime. Data during processing is protected by RAM encryption and hardware-based technologies.
In confidential computing, processing happens inside a secure enclave, a dedicated portion of the CPU allocated for memory encryption and protection, using a key unique to the CPU and the program executing in the enclave.
Various CPU manufacturers and cloud service providers offer their own versions of secure enclaves, including Intel® SGX and Intel® TDX, AMD SEV, AWS Nitro Enclaves, and Arm TrustZone.
When data is processed inside a Trusted Execution Environment (TEE) powered by secure enclaves, it stays secure even at runtime. The data is completely isolated and protected to a level that prevents unauthorized access, even if the hypervisor is compromised or an attacker has somehow gained root access privileges.
Confidential AML Workflows
Fortanix implemented a representative AML workflow in Intel® SGX secure enclaves, using DC4s_v3 virtual machines offered by Microsoft Azure, to build and evaluate an AML model with the aim of protecting the data at rest, in motion, and in use.
The data from the various financial institutions is pulled using a remote SQL client inside the TEE and then encrypted using the AES (Advanced Encryption Standard) algorithm so that it can be stored in third-party storage services such as an AWS S3 bucket.
This encrypted data is then decrypted inside the TEE, where it undergoes preprocessing and is passed to the machine learning model, which also runs in Intel® SGX secure enclaves. The generated model, model architecture, and hyperparameters are then moved to Fortanix Data Security Manager™, which is also protected using Intel® SGX and functions as a hardware security module (HSM), securing the data using AES encryption.
The HSM is a tamper-resistant, physically hardened device with FIPS 140-2 Level 3 certification that provides robust data security by using encryption keys, verifying users through digital signatures, and encrypting and decrypting all data that comes in and out of the module. In this way, the model information and parameters can only be accessed by the application running inside the TEE via trusted CA (Certification Authority) authentication.
This will help to prevent model extraction attacks as the model will be sealed off from the host system. At the same time, the encryption mechanism makes it more difficult for attackers to reverse engineer the models.
Fig 2: AML Workflow
Model parameters and application secrets are served from Fortanix Data Security Manager only after certificate-based authentication of the TEE at runtime. They are then sent to a secure server, where only trusted actors (the financial institutions) can access the AI/ML detection system to perform inference over data and flag money laundering transactions.
Confidential Computing is a great solution to complex problems, such as money laundering, that financial institutions face. This approach enables banks and other financial services organizations to collaborate while keeping sensitive data protected within a TEE in accordance with data privacy and compliance requirements.
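The idea of releasing secrets only to an authenticated TEE can be illustrated with a toy Python sketch. It is not the Fortanix Data Security Manager API and not a real attestation protocol: an HMAC over a hypothetical enclave "measurement" stands in for the certificate check, and all keys, names, and values are made up for the example.

```python
import hashlib
import hmac

# All values below are hypothetical stand-ins for illustration only.
ATTESTATION_KEY = b"demo-attestation-key"    # stands in for a trusted CA
MODEL_KEY = b"demo-model-decryption-key"     # the secret being protected

# Measurements (code hashes) of applications the key manager trusts.
TRUSTED_MEASUREMENTS = {
    hashlib.sha256(b"aml-inference-app-v1").hexdigest(),
}


def sign_report(measurement: str) -> str:
    """Toy 'attestation report' signature: an HMAC over the measurement."""
    return hmac.new(ATTESTATION_KEY, measurement.encode(), hashlib.sha256).hexdigest()


def release_model_key(measurement: str, signature: str):
    """Return the model key only for a validly signed, trusted measurement."""
    expected = sign_report(measurement)
    if not hmac.compare_digest(expected, signature):
        return None  # the report was not signed by the trusted authority
    if measurement not in TRUSTED_MEASUREMENTS:
        return None  # signed, but not an application we trust
    return MODEL_KEY


good = hashlib.sha256(b"aml-inference-app-v1").hexdigest()
bad = hashlib.sha256(b"tampered-app").hexdigest()

print(release_model_key(good, sign_report(good)) is not None)  # trusted app
print(release_model_key(bad, sign_report(bad)) is not None)    # unknown app
print(release_model_key(good, "forged") is not None)           # bad signature
```

The point of the sketch is the ordering: the secret never leaves the key manager unless the caller both proves its identity and matches an expected application, so a modified or unknown application gets nothing.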