The Challenge: Zero Trust in a Distributed Environment
Confidential Computing is fast becoming the standard approach for securing high-performance compute environments, especially for LLM and AI workloads.
Standard encryption protects data at rest (storage) and in transit (network). However, the moment data is loaded into memory for processing, it is typically vulnerable. For our client, this gap was unacceptable. They needed a solution where the CPU, RAM, GPU, VRAM, NVLink, and PCIe bus communications were cryptographically protected, preventing unauthorized observation even from the host OS or a malicious administrator.
We identified NVIDIA Confidential Computing as the appropriate solution. By leveraging hardware-based Trusted Execution Environments (TEEs), we could ensure that the memory, CPU state, and GPU execution remained isolated from the host.
Phase 1: Proof of Concept with AMD SEV-SNP
Because of the complexity of the stack, we needed to validate each layer before committing to production hardware. Our journey began with deep technical sessions with NVIDIA solution architects to review our approach to GPU attestation and the encryption layers.
We started by configuring a Confidential Virtual Machine (CVM) on an AMD-based server using AMD SEV-SNP with KVM. This was not a "plug-and-play" operation; it required significant updates and patching of the Linux kernel on Ubuntu to a specific version that supported the necessary confidential computing features.
This phase confirmed that we could successfully configure a CVM and verify GPU attestation against NVIDIA’s documentation, giving us the green light to move to production hardware.
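As an illustration of the kind of host-level check this phase involved, here is a minimal Go sketch that reports whether the KVM module has SEV-SNP enabled. It is not the tooling used on the project. The sysfs path is an assumption based on how kernels with SEV-SNP host support commonly expose the kvm_amd module parameter, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// snpParamPath is an assumption: kernels built with SEV-SNP host support
// typically expose the kvm_amd module parameter at this location.
// Adjust it for your kernel build.
const snpParamPath = "/sys/module/kvm_amd/parameters/sev_snp"

// paramEnabled interprets the contents of a kernel module parameter file.
// Boolean parameters read back as "Y" or "N"; "1" is accepted as well
// for integer-typed parameters.
func paramEnabled(raw string) bool {
	switch strings.TrimSpace(raw) {
	case "Y", "y", "1":
		return true
	default:
		return false
	}
}

// snpEnabled reads the parameter file at path and reports whether it
// indicates that SEV-SNP is enabled. A missing file is returned as an
// error so the caller can tell "disabled" apart from "unsupported kernel".
func snpEnabled(path string) (bool, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return false, fmt.Errorf("reading %s: %w", path, err)
	}
	return paramEnabled(string(data)), nil
}

func main() {
	ok, err := snpEnabled(snpParamPath)
	if err != nil {
		fmt.Println("SEV-SNP status unknown:", err)
		return
	}
	fmt.Println("SEV-SNP enabled on host:", ok)
}
```

A check like this only confirms that the host kernel is configured for SEV-SNP. It does not replace attestation, which is what actually proves the state of the CVM and the GPU to a remote party.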
Phase 2: Scaling to Supermicro H100 Clusters
The production environment was significantly more powerful: four Supermicro GPU SuperServer SYS-821GE-TNHR units. These are beasts of computation, designed for LLM training and inference, each equipped with eight NVIDIA H100 GPUs in the SXM form factor, interconnected via NVLink.
Enabling Confidential Computing on this specific architecture presented unique hurdles. We encountered boot issues when enabling Intel TDX (Trust Domain Extensions).
Razor worked closely with engineers from both Intel and Supermicro to troubleshoot the problem. We discovered the issue lay in the firmware; by installing the correct firmware versions for both the BIOS and the GPUs, we successfully enabled the host server for Confidential Computing.
Automating Trust with Go
Validating a complex hardware stack manually is neither scalable nor secure. To streamline this, Razor developed a custom Go-based application to perform system-level checks on all components required for Confidential Computing.
This tool acts as the gatekeeper for the distributed cluster:
- Validation: It generates a detailed host validation report and securely transmits it to the client’s control servers.
- Deployment: If, and only if, validation succeeds, the system automatically downloads and launches a pre-built Confidential Virtual Machine.
- Attestation: Inside the CVM, a secure service provides attestation endpoints. This allows the client to verify the integrity of the CVM itself, while GPU attestation is performed against NVIDIA services to confirm the trusted state of each H100 GPU.
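The validate-then-deploy gate described above can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not Razor's actual application: the types (Check, Result, Report), the functions (Validate, Gate), and the check names are all hypothetical, and the real checks, report transport, and CVM launch are replaced by stubs.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrValidationFailed is returned by Gate when the host report did not pass.
var ErrValidationFailed = errors.New("host validation failed: CVM not launched")

// Check is one host-level validation step (hypothetical structure).
type Check struct {
	Name string
	Run  func() error
}

// Result records the outcome of a single check.
type Result struct {
	Name   string
	Passed bool
	Detail string
}

// Report is the host validation report that would be sent to the control servers.
type Report struct {
	Results []Result
	Passed  bool
}

// Validate runs every check without short-circuiting, so the report lists
// all failures. An empty check list does not pass (fail closed).
func Validate(checks []Check) Report {
	report := Report{Passed: len(checks) > 0}
	for _, c := range checks {
		res := Result{Name: c.Name, Passed: true}
		if err := c.Run(); err != nil {
			res.Passed = false
			res.Detail = err.Error()
			report.Passed = false
		}
		report.Results = append(report.Results, res)
	}
	return report
}

// Gate calls launch if, and only if, the report passed.
func Gate(report Report, launch func() error) error {
	if !report.Passed {
		return ErrValidationFailed
	}
	return launch()
}

func main() {
	// Stub checks for illustration; real checks would inspect firmware,
	// kernel, and GPU state.
	checks := []Check{
		{Name: "cpu-tee-enabled", Run: func() error { return nil }},
		{Name: "gpu-cc-mode", Run: func() error { return errors.New("CC mode is off") }},
	}
	report := Validate(checks)
	for _, r := range report.Results {
		fmt.Printf("%-16s passed=%t %s\n", r.Name, r.Passed, r.Detail)
	}
	err := Gate(report, func() error {
		fmt.Println("launching CVM")
		return nil
	})
	fmt.Println("gate:", err)
}
```

Running every check rather than stopping at the first failure keeps the report complete, which helps when diagnosing a host remotely. Failing closed on an empty check list means a misconfigured validator cannot accidentally approve a launch.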
The Result: Verified, Encrypted AI
By rigorously implementing and testing these layers, the client gained the ability to run workloads inside encrypted Confidential Virtual Machines backed by verified GPU attestation.
This architecture is designed to prevent host operators and external actors from accessing or tampering with customer data. The client can now deploy AI workloads to their distributed cluster knowing that the environment is cryptographically isolated and verified before a single byte of data is processed.
Next Steps
Are you looking to implement Confidential Computing for your AI infrastructure? Get in touch to see how we can help secure your compute stack.