Why Your Most Sensitive Data Can’t Use the Best Frontier Model Yet

The same conversation has been playing out across regulated enterprises for the past couple of years. A bank, insurer, or hospital is excited about a specific AI use case. Their data scientists have benchmarked a proprietary model that beats everything else they’ve tried. Procurement is moving. Then security gets involved, and someone asks the question that ends most of these conversations: Where does the data actually go when this model is running?

The technical answer doesn’t go over well. The best frontier models today are SaaS-only, so the data has to travel to the model, and once it gets there, it sits in plaintext in the server's memory, where the cloud admin, the hypervisor, and anyone reasonably motivated can read it.

Fortanix is working closely with Dell Technologies to address these challenges. The rest of this post walks through what was built, why the architecture is shaped the way it is, and where it lands for a security or infrastructure leader trying to make a real decision.

What is actually broken?

Most readers will assume the issue here is just “data exposure during inference.” That’s part of it, but on its own, it would be a manageable problem. What makes it hard to solve is that there are three things going wrong at the same time, and any one of them is enough to stall a project.

The first is the data-in-use gap. Storage encryption protects data on disk. TLS protects it on the wire. While a workload is actually running, neither helps. The data has to be plaintext in CPU and GPU memory for inference to happen. So do the model weights. So do the activations flowing through every layer. The host operating system can read all of this. So can the hypervisor. Memory scraping, cold boot attacks, and hypervisor escapes are all working exploits. For a bank running fraud detection on live transaction data, or a clinical workload on patient records, that exposure is enough to fail security review on its own.

The second problem is the IP issue with frontier AI model providers, which doesn’t get talked about as much but kills as many deals. The frontier models that enterprises actually want took years and a lot of money to train and build. If the weights leak, the vendor no longer has a business. So they hold the weights inside their own infrastructure, which means the only way to use them is to send data to their hosted endpoint, which puts the customer right back at the residency problem they were trying to avoid. The customer needs the model on-prem, but the vendor won’t put it there, and the deal dies.

The last problem is regulation. The EU AI Act, DORA, and most of the residency frameworks have similar requirements. Telling an auditor your data was protected is no longer enough on its own. They want something they can pull and verify themselves.

These three things compound. Even if you fix the data-in-use exposure, you still don’t have the model. Even if you get the model on-prem, you still need the audit evidence. Solving one without the others doesn’t unblock the project.

Dell AI Factory with NVIDIA and Fortanix Confidential AI

The infrastructure foundation comes from Dell. With Dell AI Factory, Dell takes a comprehensive approach aligned with customer use cases and objectives. The solution combines the necessary infrastructure components (Dell PowerEdge accelerated compute, Dell PowerSwitch scalable networking, and Dell PowerScale low-latency storage) with the required operating environments, AI and MLOps tools, and frameworks, all validated as a full-stack solution and delivered via Dell Automation Platform. The result is a complete, production-ready platform that serves as a foundation for a wide range of use cases. The solution supports NVIDIA GPUs across the Hopper (H100, H200) and Blackwell (B200, RTX Pro 6000) product lines. On the CPU side, it runs on Intel or AMD, depending on the platform.

The reason this hardware generation matters, and why none of this works without it, is that both the CPU and the GPU can now enforce a Trusted Execution Environment (TEE). A TEE is an encrypted region of memory that the host operating system can’t see into, and the hypervisor can’t either. Even with root access, the contents stay opaque. NVIDIA brought this capability to the GPU with Hopper and Blackwell. Before that, you could protect data on the CPU side and watch it get exposed at the moment it crossed onto the GPU, which made the whole exercise pointless for AI workloads.

Fortanix sits on top of this infrastructure as the confidential compute control plane. The Fortanix Confidential AI Solution consists of two products with specific jobs.

The Fortanix Confidential Computing Manager (CCM) handles attestation verification and policy enforcement. Before any workload runs, CCM verifies that the CPU TEE is in a known-good state, the GPU TEE is in a known-good state, the firmware versions match what’s been approved, and the binary inside the enclave is the one that’s been signed off on. It checks all of this together, as a single chain of trust covering both the CPU and the GPU, called Composite Attestation.

Fortanix Data Security Manager (DSM) is a FIPS 140-2 Level 3 HSM-certified KMS that stores the keys for model weights and sensitive data. The non-obvious thing DSM does is that it won’t release a key until CCM tells it the platform has passed attestation. So, if anything is wrong with the environment (bad firmware, an unsigned binary, hardware reporting something off), the keys never come out, and nothing decrypts. The workload just sits there, inert, until somebody investigates.
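
The gating relationship between attestation and key release can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Fortanix CCM or DSM API: `Evidence`, `Policy`, `verify_composite`, and `KeyVault` are hypothetical names, and the firmware and digest strings are placeholders. Real evidence would arrive as hardware-signed quotes whose signatures are verified before any of these comparisons happen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Claims reported by the platform (hypothetical shape, not a real quote format)."""
    cpu_tee: str          # e.g. "tdx" or "sev-snp"
    gpu_cc_mode: bool     # GPU reports Confidential Computing mode on
    firmware: str         # firmware version string
    binary_digest: str    # measurement of the workload binary


@dataclass(frozen=True)
class Policy:
    """What the operator has approved (all values are illustrative)."""
    allowed_cpu_tees: frozenset
    allowed_firmware: frozenset
    allowed_binaries: frozenset


def verify_composite(ev: Evidence, policy: Policy) -> list:
    """Check CPU, GPU, firmware, and binary together; return every failure found."""
    failures = []
    if ev.cpu_tee not in policy.allowed_cpu_tees:
        failures.append("cpu_tee")
    if not ev.gpu_cc_mode:
        failures.append("gpu_cc_mode")
    if ev.firmware not in policy.allowed_firmware:
        failures.append("firmware")
    if ev.binary_digest not in policy.allowed_binaries:
        failures.append("binary_digest")
    return failures


class KeyVault:
    """Stand-in for a key manager that fails closed without a clean attestation."""

    def __init__(self, keys: dict, policy: Policy):
        self._keys = keys
        self._policy = policy

    def release(self, key_id: str, ev: Evidence) -> bytes:
        failures = verify_composite(ev, self._policy)
        if failures:
            # Any single failed check blocks release, so nothing decrypts.
            raise PermissionError("attestation failed: " + ", ".join(failures))
        return self._keys[key_id]


policy = Policy(
    allowed_cpu_tees=frozenset({"tdx", "sev-snp"}),
    allowed_firmware=frozenset({"fw-approved-1"}),
    allowed_binaries=frozenset({"digest-of-signed-binary"}),
)
vault = KeyVault({"model-weights": b"\x00" * 32}, policy)
good = Evidence("tdx", True, "fw-approved-1", "digest-of-signed-binary")
bad = Evidence("tdx", True, "fw-unknown", "digest-of-signed-binary")
```

Calling `vault.release("model-weights", good)` returns the key, while the same call with `bad` raises `PermissionError` naming the firmware check. The point of the design is that the key store never makes its own trust decision; it only acts on the result of the composite check.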

Both CCM and DSM are available as SaaS or on-prem deployments to meet sovereignty requirements.

[Figure: Overview of Fortanix Confidential AI]

What an inference request actually looks like

The order of operations is what makes the security model work, so it’s worth tracing through.

A confidential VM is provisioned on the Dell AI Factory infrastructure. With Intel TDX or AMD SEV-SNP active on the CPU, the NVIDIA Hopper or Blackwell GPU comes up in Confidential Computing mode. The Fortanix CCM component then pushes the measurement information: the CPU TEE state, the mode the GPU is operating in, the firmware versions, and the identity of the binary inside the enclave. CCM checks all of that against the policy in place.

If any check fails, that’s the end of it. No keys, no decryption, no inference. If everything passes, CCM signals Fortanix DSM (via an Application Certificate). DSM releases the keys, but only into the verified enclave. The workload uses the keys to decrypt the model weights from there, and inference runs as it normally would. Encrypted requests come in, get decrypted inside the enclave, run through the model with all activations protected in GPU memory, and the response goes back out. The host operating system has no useful visibility for the duration. Same for the hypervisor. Same for whoever administers the cluster.
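
The data path described above can be modeled end to end with the standard library. The sketch below is a toy under stated assumptions: `seal` and `unseal` use a SHA-256 counter-mode keystream with an HMAC tag purely as a stand-in for a real authenticated cipher such as AES-GCM, `Enclave` is a hypothetical class rather than any vendor API, `session_key` stands in for a secure channel that terminates inside the enclave, and the "model" is a dot product over a few bytes.

```python
import hashlib
import hmac
import secrets


def _subkeys(key: bytes):
    """Derive separate encryption and MAC keys from one secret."""
    return hashlib.sha256(b"enc" + key).digest(), hashlib.sha256(b"mac" + key).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter-mode keystream. Illustration only, not production crypto."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate. Layout: nonce (16) || ciphertext || tag (32)."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(16)
    ks = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, ks))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def unseal(key: bytes, blob: bytes) -> bytes:
    """Verify the tag first, then decrypt. Raises ValueError on tampering."""
    enc_key, mac_key = _subkeys(key)
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    ks = _keystream(enc_key, nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, ks))


class Enclave:
    """Models the workload inside the TEE: keys and plaintext exist only in here."""

    def __init__(self):
        self._weights = None
        self._session_key = None

    def load(self, released_key: bytes, sealed_weights: bytes, session_key: bytes):
        # In the flow above, this happens only after attestation has passed.
        self._weights = unseal(released_key, sealed_weights)
        self._session_key = session_key

    def infer(self, sealed_request: bytes) -> bytes:
        if self._weights is None:
            raise RuntimeError("no keys released; workload is inert")
        request = unseal(self._session_key, sealed_request)
        # Placeholder model: a real deployment would run GPU inference here.
        score = sum(w * x for w, x in zip(self._weights, request)) % 100
        return seal(self._session_key, str(score).encode())


model_key = secrets.token_bytes(32)     # held by the key manager until attestation passes
session_key = secrets.token_bytes(32)   # stands in for a channel terminated in the enclave
sealed_weights = seal(model_key, bytes([3, 1, 4, 1, 5]))

enclave = Enclave()
enclave.load(model_key, sealed_weights, session_key)
sealed_response = enclave.infer(seal(session_key, bytes([2, 7, 1, 8, 2])))
response = unseal(session_key, sealed_response)
```

Everything outside `Enclave` only ever handles sealed bytes. An `Enclave` that never had `load` called refuses to serve, which mirrors the inert state described earlier when keys are withheld.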

A scenario that’s probably closer to home than it looks

A tier-one bank wants to deploy a fraud detection model across its retail and commercial transaction flows. The model comes from a specialist vendor whose business is essentially this one model, trained on industry-wide fraud patterns over many years, and it’s noticeably better than what the bank’s internal team can build on their own data alone. To use it well, the bank needs to run inference against real-time transaction data. PII, account numbers, merchant info, behavioral signals, all of it has to be in scope.

Sending that data to the vendor’s hosted endpoint isn’t going to happen. The bank’s residency rules disqualify it. The privacy team rules it out. So if the deal happens at all, the model is running on bank infrastructure.

The vendor has a matching problem. If their weights leak, a competitor can stand up an equivalent product in a few weeks, and the vendor is out of business.

Fortanix Confidential AI on Dell AI Factory with NVIDIA is what closes the deal. The vendor packages the model as an encrypted artifact and ships it. The bank deploys it in a confidential VM on a Dell PowerEdge server with a confidential CPU and GPU, in the bank’s own data center. CCM does composite attestation. DSM releases the decryption keys into the TEE only after attestation passes. Weights decrypt inside the enclave. Transaction data flows in encrypted, gets scored inside the enclave, and fraud signals come back out. Bank data never appears in plaintext outside the secure enclave. Vendor weights never appear in plaintext anywhere outside the TEE.
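
One practical detail in a handoff like this is how the bank confirms that the encrypted artifact it received is the one the vendor shipped. The sketch below shows one way that could look, assuming the vendor sends a small manifest alongside the artifact. The manifest format, `build_manifest`, `verify_artifact`, and the example names are all hypothetical and are not taken from any Fortanix or Dell tooling; in practice the manifest itself would also need to be signed by the vendor.

```python
import hashlib
import hmac
import json


def build_manifest(artifact: bytes, model_name: str, key_id: str) -> str:
    """Vendor side: describe the encrypted artifact without revealing its contents."""
    return json.dumps(
        {
            "model": model_name,
            "key_id": key_id,  # which key the enclave must request after attestation
            "sha256": hashlib.sha256(artifact).hexdigest(),
            "size": len(artifact),
        },
        sort_keys=True,
    )


def verify_artifact(artifact: bytes, manifest_json: str) -> bool:
    """Bank side: confirm the received bytes match what the vendor described."""
    manifest = json.loads(manifest_json)
    digest = hashlib.sha256(artifact).hexdigest()
    same_size = len(artifact) == manifest["size"]
    same_digest = hmac.compare_digest(digest, manifest["sha256"])
    return same_size and same_digest


# Placeholder bytes standing in for the vendor's encrypted model file.
encrypted_artifact = bytes(range(64))
manifest = build_manifest(encrypted_artifact, "fraud-model-example", "example-key-id")
```

`verify_artifact(encrypted_artifact, manifest)` returns `True` for the untouched bytes and `False` if even one byte is added or changed. The check operates on ciphertext only, so the bank can run it outside the enclave without ever seeing the weights.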

The same setup applies to many other deployments that have been stuck for the same reasons. Clinical AI on patient records. GenAI assistants on classified workloads. Pharma collaborations where two parties each contribute IP and, by design, neither can see the other’s portion. The shape of the problem is always the same: two parties, both with something to protect from the other, and trust has to come from cryptography rather than from a contract clause.

What does this change mean for the business?

A few things shift once this architecture is in place.

Frontier models that weren’t available on-prem before become deployable, giving enterprises a clear path to adopting best-in-class AI with confidence. AI labs that have held back from on-prem distribution because they couldn’t protect their weights now have a place to ship them, which lets them reach new markets and grow their business. Verified TEEs provide the IP protection they need, making the models available to enterprises that need on-prem deployment for compliance reasons.

The insider risk model changes. Even an administrator with root access on the host can’t observe what’s happening inside the enclave. Inputs, outputs, and intermediate state all stay inside the TEE. This isn’t another layer of access control on top of existing controls. It’s a different threat model.

Where this leaves a security or infrastructure leader

Confidential AI matters now because it's the only way to run a specific class of inference at all. The kind where the data can't move, and the model can't ship in the clear. Until recently, that combination meant the project didn't happen. Now it can.

If you have a use case stuck in this pattern (regulated data, a proprietary model, a vendor that won't ship), the question is whether it looks different when the guarantees come from silicon instead of a contract clause. In most cases, it does.

Dell and Fortanix have published a joint technical white paper covering the architecture and a healthcare reference deployment.
