Question 1

What is attestation in AI?

Accepted Answer

Attestation in AI is the process of cryptographically verifying that an AI workload is running in a genuine, unmodified and secure execution environment before sensitive data or encryption keys are released to it. It's what allows enterprises, model owners and infrastructure operators to establish verifiable trust in a computing environment, independent of organizational policy or contractual assurance. It’s based on hardware-signed evidence that can be independently verified.

In practice, attestation works by having the hardware itself generate a cryptographically signed statement that describes the state of the execution environment: the processor identity, firmware versions, the software stack loaded into the Trusted Execution Environment (TEE), and a measurement of the code that's running or about to run. That report is signed using a key embedded in the silicon at manufacture, which means only genuine hardware from the vendor can produce a valid report.
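As a rough illustration of the paragraph above, here is a minimal Python sketch of a relying party that releases a key only after a report checks out. The class name, field names and the release_key function are hypothetical, not any vendor's report format. Verifying the signature against the vendor's certificate chain is assumed to have happened elsewhere and is represented by a boolean.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttestationReport:
    """Hypothetical, simplified view of what a report describes."""
    processor_id: str
    firmware_version: str
    tee_software_digest: str   # hex digest of the software stack loaded into the TEE
    workload_measurement: str  # hex digest of the code that is running or about to run
    signature_valid: bool      # outcome of a vendor signature check done elsewhere (assumption)


def release_key(report: AttestationReport,
                expected_measurement: str,
                key: bytes) -> Optional[bytes]:
    """Return the key only if the report is signed and the measurement matches."""
    if not report.signature_valid:
        return None
    # Constant-time comparison of the two ASCII hex digests.
    if not hmac.compare_digest(report.workload_measurement, expected_measurement):
        return None
    return key
```

The point of the gate is ordering: nothing sensitive is handed over until both the signature and the measurement have been checked.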

For AI workloads, attestation-based security serves two critical functions. First, it gives enterprises cryptographic proof that sensitive data fed into an AI model will be processed inside a genuine, hardware-isolated environment (not a simulated or compromised one). Second, it gives model owners proof that their proprietary model weights and architecture will be protected from extraction during inference. Without attestation, both of these trust claims rest on policies and assumptions. With it, they have hardware-rooted cryptographic evidence.

Question 2

What is composite attestation?

Accepted Answer

Composite attestation is the verification of both the CPU and GPU TEEs within a single chain of trust that spans the complete execution stack where AI inference occurs.

Standard attestation, as implemented in CPU-only confidential computing environments like Intel TDX or AMD SEV-SNP, verifies the integrity of the CPU execution environment. This is meaningful and important, but for AI workloads, it only covers part of the picture. AI inference is primarily a GPU operation: model weights, input data, intermediate computations, and outputs all reside in GPU memory during inference. CPU attestation says nothing about what's happening there.

Composite GPU attestation binds the CPU and GPU attestation reports into a single verified chain, in which the CPU TEE's measurement includes a cryptographic reference to the GPU's attested state, and the entire composite is verified as a unit before any workload proceeds. The result is a single attestation report that covers the full hardware stack where AI inference occurs (CPU and GPU together) rather than just the orchestration layer on the CPU side.
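One way to picture the "cryptographic reference" described above is a digest of the GPU evidence placed in a user-data field of the CPU report. The Python sketch below assumes exactly that. The choice of SHA-384, the field layout and all function names are illustrative assumptions, not a description of any specific product's binding scheme.

```python
import hashlib
import hmac


def gpu_evidence_digest(gpu_evidence: bytes) -> bytes:
    """Digest of the raw GPU evidence (SHA-384 chosen here as an assumption)."""
    return hashlib.sha384(gpu_evidence).digest()


def verify_composite(cpu_report_data: bytes,
                     gpu_evidence: bytes,
                     cpu_verified: bool,
                     gpu_verified: bool) -> bool:
    """Accept only if both sides verified AND the CPU report references this GPU evidence.

    cpu_verified / gpu_verified stand in for the individual verification
    results, which are assumed to be produced elsewhere.
    """
    bound = hmac.compare_digest(cpu_report_data, gpu_evidence_digest(gpu_evidence))
    return cpu_verified and gpu_verified and bound
```

The binding check is what makes the result composite: two individually valid reports that do not reference each other are rejected.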

For security architects and compliance engineers designing AI infrastructure, composite attestation is the appropriate baseline for any deployment involving sensitive data or proprietary model IP. CPU-only attestation leaves the most sensitive part of the execution environment outside the verified boundary; composite attestation brings it inside.

Question 3

How does GPU attestation work?

Accepted Answer

GPU attestation works much like CPU attestation: the hardware generates a signed, cryptographically verifiable report specific to the GPU's execution environment.

When an NVIDIA confidential computing-capable GPU (Hopper, Blackwell, or Vera Rubin architecture) is initialized for a confidential workload, the GPU generates an attestation report that captures its hardware identity, firmware version, driver state and the security configuration of its memory isolation mechanisms. This report is signed with a key unique to that specific piece of genuine NVIDIA hardware.

That signed report is submitted to a verifier (either NVIDIA's Remote Attestation Service (NRAS) or a local verification service for air-gapped environments), which checks it against measurements stored in NVIDIA's Reference Integrity Manifest (RIM) service. If the GPU's firmware and configuration match what’s expected, the verification succeeds. A signed attestation token is then returned to confirm that the GPU is genuine, unmodified, and operating in a secure confidential computing mode.
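The comparison step at the heart of that verification can be sketched as follows. This is not the NRAS or RIM API. It assumes, purely for illustration, that both the reported and the reference measurements are available as mappings from a measurement index to a hex digest, and the function names are hypothetical.

```python
from typing import Dict, List


def mismatched_indices(reported: Dict[int, str],
                       reference: Dict[int, str]) -> List[int]:
    """Return the reference indices whose reported value is missing or different."""
    return sorted(
        index for index, golden in reference.items()
        if reported.get(index) != golden
    )


def gpu_matches_reference(reported: Dict[int, str],
                          reference: Dict[int, str]) -> bool:
    """True only when every reference measurement is reproduced exactly."""
    return not mismatched_indices(reported, reference)
```

Returning the list of mismatched indices, rather than only a yes/no answer, makes it easier to see which firmware or configuration component drifted from the expected state.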

A composite GPU attestation flow takes it a step further by coordinating with CPU TEE attestation, combining the results into a single chain of trust.

Question 4

Can attestation prove which model is serving my requests?

Accepted Answer

Short answer: yes. It’s one of the more significant capabilities of attestation-based security for AI deployments. When an AI model is packaged for deployment in a confidential computing environment, its identity is a cryptographic measurement of the model binary, weights and configuration. That measurement is included in the workload measurement captured during attestation, so when the report confirms that a TEE is running, it also confirms that a specific, identified model is running inside that TEE.

For enterprises using third-party AI services, this is make-or-break. Without attestation, you have to trust that the service provider is actually running the model they claim to be running. With attestation, you can verify it: the attestation report contains the model’s measurement, which you can compare against the measurement for the model you agreed to use.

For model owners deploying proprietary models into enterprise environments, the same principle works in reverse: attestation confirms that the model being executed is the genuine, unmodified version they distributed and that it hasn’t been tampered with between distribution and deployment.
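To make the idea of a model measurement concrete, here is a minimal Python sketch that folds a set of model files (binary, weights, configuration) into one digest and compares it with an agreed value. The hashing scheme (SHA-384 over file names and per-file digests, in sorted order) and the function names are assumptions chosen for illustration. Real packaging tools define their own measurement format.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Iterable


def measure_model(paths: Iterable[Path]) -> str:
    """Combine the model's files into a single hex digest, independent of input order."""
    outer = hashlib.sha384()
    for path in sorted(paths, key=lambda p: p.name):
        inner = hashlib.sha384()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large weight files are not loaded at once.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                inner.update(chunk)
        outer.update(path.name.encode("utf-8") + b"\0" + inner.digest())
    return outer.hexdigest()


def is_agreed_model(reported_measurement: str, agreed_measurement: str) -> bool:
    """Compare the measurement from the report with the one you agreed to use."""
    return hmac.compare_digest(reported_measurement, agreed_measurement)
```

Any change to a weight file or to the configuration yields a different digest, which is why a matching measurement identifies one specific model build.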

Question 5

What does hardware-signed proof mean?

Accepted Answer

Hardware-signed proof is the cryptographic attestation report that’s signed by a key embedded in the processor silicon at the time of manufacture.

The origin of the signing key is fundamental to the security guarantee. Software can be compromised, modified or spoofed by a sufficiently capable attacker, but a key embedded in silicon can’t be extracted or replicated without physically destroying the chip. When an attestation report is signed by that hardware key, the signature proves two things simultaneously: that the report was generated by genuine hardware from the specified manufacturer and that it hasn’t been modified since it was generated.

For hardware attestation of an AI workload, this means you don’t have to trust any software layer, administrator or service provider: the entire chain of trust starts in the silicon itself. In this sense, hardware-signed proof can be considered the foundation for zero-trust AI infrastructure, in contrast to software-only attestation mechanisms, which could in principle be circumvented.

Question 6

What is the difference between NRAS and composite attestation?

Accepted Answer

NRAS (NVIDIA Remote Attestation Service) is the verification service that validates GPU attestation reports. It's one component of the composite attestation process, not a synonym for it.

When a confidential computing-capable NVIDIA GPU generates an attestation report, that report must be verified against “known-good measurements”: the expected firmware versions, driver states, and security configurations for genuine NVIDIA hardware. NRAS performs that verification. If the GPU's state matches the expected measurements, NRAS returns a signed attestation token confirming the GPU is genuine and unmodified.

Composite attestation is the broader architecture that incorporates NRAS GPU verification as one component alongside CPU TEE attestation. While NRAS verifies the GPU side of the execution environment in isolation, composite attestation unifies the GPU attestation result with the CPU attestation, which verifies the orchestration environment. Composite attestation combines both into a single statement that covers the complete AI execution stack.

Question 7

Can attestation data be falsified?

Accepted Answer

In practice, no. Attestation reports are signed by hardware keys embedded in the processor silicon. Forging a valid attestation report would require either possessing the private signing key (which never leaves the hardware) or finding a cryptographic weakness in the signing algorithm itself. Neither is a realistic scenario in current hardware attestation implementations using standard cryptographic algorithms.

What adversaries could do, in theory, is attempt to reuse a previously valid attestation report, submitting a genuine report from a legitimate device to gain trust for a different, potentially compromised device. But sound attestation protocols protect against this: the relying party includes a cryptographic nonce (a one-time random number) in the attestation request, and the hardware must incorporate it into the signed report. A replayed report from a prior session wouldn’t contain the current nonce and would be rejected.
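The replay protection described above can be sketched from the relying party's side as follows. The class and method names are hypothetical, and the sketch covers only the freshness check. It assumes the nonce has already been extracted from a report whose signature was verified separately.

```python
import secrets
from typing import Set


class NonceChallenger:
    """Issues one-time nonces and accepts each of them at most once."""

    def __init__(self) -> None:
        self._outstanding: Set[bytes] = set()

    def new_nonce(self) -> bytes:
        """Create a fresh 32-byte random nonce to send with the attestation request."""
        nonce = secrets.token_bytes(32)
        self._outstanding.add(nonce)
        return nonce

    def accept(self, report_nonce: bytes) -> bool:
        """Accept a report's nonce only if it was issued here and not yet used."""
        if report_nonce in self._outstanding:
            self._outstanding.remove(report_nonce)
            return True
        return False
```

A report replayed from an earlier session carries a nonce that is no longer outstanding, so accept returns False for it.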

The more practical concern for most organizations is ensuring that the verification process is implemented correctly from the beginning and that expected measurements are accurately maintained. An attestation system is only as trustworthy as the reference values it's comparing against. If the expected workload measurements haven't been properly defined and maintained, a valid attestation against incorrect reference values creates false assurance. Establishing and maintaining accurate measurement baselines is critical to any attestation-based security program.

Question 8

What does successful attestation actually prove?

Accepted Answer

Successful attestation proves, with cryptographic certainty, specific claims about an execution environment at a specific point in time.

It proves that the hardware generating the attestation report is a real device from the specified manufacturer, using a signing key that was physically embedded in silicon and hasn’t been extracted or replicated.

It proves that the hardware's firmware and security configuration match the expected values, meaning the firmware hasn't been modified, the TEE is operating in the expected security mode, and nothing has drifted from the known-good baseline.

It proves that the software loaded into the TEE matches its expected measurement, meaning the intended code is actually running unmodified inside the verified hardware environment.

Composite attestation proves all of the above simultaneously for both the CPU and GPU execution environments, covering the entire AI inference stack.
