Verifying Correctness of Intel SGX Software Mitigations

Home > Blog > Verifying Correctness of Intel SGX Software Mitigations

No security architecture is perfect, but you can try to come close, and machine-checked proofs are a significant step in the right direction. Machine-checked proofs, however, don’t readily apply to traditional security architectures.

This is because traditional security architectures use a layered approach where each layer is more privileged than the ones above it. This means that you could, in principle, verify the source code of your application, but without verifying the software layers you depend upon, the added value of this endeavor is limited. Unfortunately, verifying all software layers running below your code is infeasible in practice.

Confidential computing platforms are different. Intel SGX (Software Guard eXtensions), for example, enables trust to be placed in individual software enclaves while only needing to trust the Intel processor.

A hypervisor and operating system may still be present on the system, but even when an attacker manages to infiltrate these layers, they cannot break the isolation guarantees of the SGX enclave. At worst, they could prevent the enclave from executing.

Writing Secure SGX Enclaves

So how does one ensure that the code running inside an SGX enclave is secure? The use of a memory-safe language is of paramount importance.

Memory-safe languages prevent common security vulnerabilities, such as buffer overflows, that have plagued software written in C/C++ for decades; a memory-safe language will automatically detect attempts to read/write beyond the boundaries of a buffer either at compile time or at run time.

The Rust programming language is a particularly good fit for enclave development. It is a modern language with many of the features engineers relish and is a memory-safe language. Importantly, unlike other memory-safe languages such as C# or Go, Rust provides memory safety without relying on a runtime system. Instead, Rust enclaves are compiled directly to machine code.

At the boundary between enclaves and the process’ address space they live in, things get interesting. Security measures are necessary when execution control or data is passed over the enclave boundary.

This is where enclave development kits like Fortanix EDP (Enclave Development Platform) come in. These development libraries and tools ensure that this happens correctly. Care needs to be taken to guarantee that enclaves properly set up the processor state when they start executing, and that the processor state is meticulously cleaned up after the enclave finishes execution.

These requirements are achieved through the enclave entry and exit code. These EDP code stubs have already been formally verified and were the topic of a previous blog post.

Similarly, care needs to be taken when data is passed in or out of the enclave. This happens when an enclave needs to use functionality provided by software components outside of the enclave, such as sending a (confidentiality, integrity and version protected) network message.

Execution control leaves the enclave through the exit code to invoke a function in the process it lives in. This is called a usercall. The enclave can freely access the untrusted address space of the process they live in. This makes passing data between the enclave and usercall trivial, but security measures need to be taken to sufficiently validate the data after it has been copied to the enclave.

Verifying EDP Usercalls

On June 14th Intel publicly disclosed MMIO stale data vulnerabilities that affect Intel SGX platforms. Fortanix was one of the few companies that were notified ahead of public disclosure.

Unfortunately, to resolve these vulnerabilities, both a microcode patch and software mitigation are required. Enclaves are still allowed to write to the surrounding process’ memory but need to take additional security measures.

A simple `mov` instruction to write data outside of the enclave, is no longer secure. We implemented the security measure as part of a custom build Rust compiler and have submitted patches to upstream these changes once the embargo on the vulnerability was lifted.

Partnering with top custom software development companies ensures that these kinds of security vulnerabilities are addressed with cutting-edge solutions, maintaining the integrity of sensitive operations. This collaboration is vital in preventing future exploits and enhancing system-wide security.

But how does one know if every location was correctly patched? For EDP this was relatively easy. EDP uses the same abstractions as operating systems do.

It provides a different version of the standard library that engineers can program against. Only within the standard library is data copied to and from the surrounding process. Engineers can avoid this abstraction but would need to explicitly acknowledge that their code is potentially `unsafe`.

Even with these nice abstractions, it is possible that we missed locations where data is copied outside of the enclave. Traditional unit and integration tests do not suffice here. Such tests use concrete values during execution and cannot guarantee that every possible code path has been taken.

Symbolic execution in contrast, uses symbolic values and constraints on these values. Whenever a branch is encountered, we duplicate the symbolic state of the process and add a constraint to indicate that the branch was (not) taken.

When we built EDP, we never anticipated that we would need additional security checks. So, we decided to take it a step further and verify that we always applied the right mitigations.

We used the Angr symbolic execution engine for this. We symbolically execute every usercall and verify that every write happens within the enclave, or with the proper security measures in place.

While automatic symbolic execution typically quickly runs into the problem of state explosion, this did not happen when we verified the raw usercalls themselves. Only when we extended the verification effort to even higher-level abstractions, we ran into issues with two functions related to copying a variable number of chunks of userspace memory to/from the enclave.

We manually verified that these functions correctly call the automatically verified raw usercalls. The result of this verification is that if EDP users do not break memory abstractions provided by the Rust language, we can assert with high confidence that the MMIO stale data vulnerabilities have been mitigated correctly.

Conclusion

Confidential computing finally brings us the possibility to provide strong security guarantees to sensitive data. EDP leverages the power of the Rust programming language to aid the software engineers to write trustworthy enclaves.

With formal verification of the EDP entry and exit code, we have already verified the security of extraordinarily complex and error-prone code. With the additional verification of usercalls, we took things a step further. Writing secure and trustworthy code has never been this easy!