General Technical Review: Confidential Containers - Incubation #2051

Draft
halcyondude wants to merge 2 commits into cncf:main from halcyondude:my-coco-incubation-tech-review

@halcyondude halcyondude commented Feb 25, 2026

This PR contains the General Technical Review for the Confidential Containers project. It follows the template (general-technical-questions.md) and covers the Day 0 and Day 1 questions for Incubation.

"human-friendly" reading link:

https://github.com/halcyondude/toc/blob/my-coco-incubation-tech-review/projects/confidential-containers/tech-review/2026-02-24-gtr-coco-incubation.md

There are a few questions remaining (marked with TODO) where input from project maintainers would be appreciated.

Marking as a draft PR to solicit feedback from the TOC and Project Reviews Community.

Further resources:

Feedback heartily welcomed!

Related-to: #1504
Resolves: #2032

Signed-off-by: Matt Young <halcyondude@gmail.com>
@halcyondude halcyondude requested a review from GenPage February 25, 2026 03:37
@halcyondude halcyondude self-assigned this Feb 25, 2026
@halcyondude halcyondude added the review/tech Project Tech Review label Feb 25, 2026
@github-project-automation github-project-automation bot moved this to New - Pending Review in Project Reviews Feb 25, 2026
Signed-off-by: Matt Young <halcyondude@gmail.com>

@fitzthum fitzthum left a comment

Made a few notes. Looks good generally.


* Describe how the project is handling certificate rotation and mitigates any issues with certificates.

**TODO (Maintainers):** Please describe the mechanisms for rotating the internal TLS/mTLS certificates used between Trustee, the CDH, and the Attestation Agent.

As mentioned below, the Trustee operator uses cert-manager for this. Users may also have their own approach or infrastructure, depending on how Trustee is tied into their network. Also, keep in mind that the KBS protocol is designed to be secure even without HTTPS.

While we're here, there are several other places where we can talk about rotation. For example, the attestation token is short-lived (usually 5 minutes), and the guest will automatically try to re-attest when it expires. For individual resources stored in the KBS, rotation is out of scope for Trustee and should be driven by the owner of those resources. For hardware evidence, revocation is platform-specific. Refer to the cert chain / collateral documentation for the various hardware platforms.
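
To illustrate the token behavior described above, here is a minimal Python sketch of a cache that re-attests once a short-lived token expires. This is not CoCo or Trustee code: `TokenCache`, the `attest` callback, and the 300-second default are hypothetical names and values chosen for illustration, with the default taken from the "usually 5 minutes" figure.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # seconds, measured on the same clock as `now`


class TokenCache:
    """Hypothetical helper: hands out a cached token and re-attests on expiry."""

    def __init__(
        self,
        attest: Callable[[], str],          # assumed callback that performs attestation
        ttl_seconds: float = 300.0,         # assumed lifetime ("usually 5 minutes")
        now: Callable[[], float] = time.monotonic,
    ) -> None:
        self._attest = attest
        self._ttl = ttl_seconds
        self._now = now
        self._token: Optional[Token] = None

    def get(self) -> str:
        current = self._now()
        # Re-attest when there is no token yet or the cached one has expired.
        if self._token is None or current >= self._token.expires_at:
            self._token = Token(self._attest(), current + self._ttl)
        return self._token.value
```

A caller would invoke `cache.get()` before each request that needs the token. The clock is injectable so the expiry logic can be exercised without waiting for real time to pass.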

Contributor Author

@fitzthum would you please point me to the docs you're referring to?

Which part? For the hw evidence, you can see something like the AMD VCEK spec, which describes using CRLs for checking the AMD cert chain. Other platforms have their own mechanisms. This sort of flow mainly sits below the CoCo project.

The Trustee operator's use of cert-manager is described in this blog post: https://confidentialcontainers.org/blog/2026/02/11/deploy-trustee-in-kubernetes/

|[JDCloud](https://www.jdcloud.com)|JoyScale |Beta |End-User / Service Provider | JoyScale leverages CoCo to protect AI data privacy for the company's business and its end users. (For details: huoqifeng1@jd.com)|
|[Kubermatic](https://www.kubermatic.com/)| KubeOne | Beta | Service Provider / Consultancy | Running confidential containers on bare-metal KubeOne clusters. |

**TODO (Maintainers):** Please provide a brief summary or links to any additional adopter interviews, user surveys, or formal UX research (if any) conducted during the Sandbox phase.

Some additional adopters were shared with the TOC. These are not listed here due to privacy concerns or pending internal approvals.

Contributor Author

Thanks, and understood regarding adopters. Who's a good person to follow up re: surveys, UX research, etc?

@fitzthum fitzthum Mar 3, 2026


* How can a rollout or rollback fail? Describe any impact to already running workloads.

**TODO (Maintainers):** Describe any specific failure modes during upgrades/downgrades. For instance, do existing VMs keep running if the host-level `kata-shim` or `containerd` drops connection? Are there state-migration issues with Trustee CRDs during a rollback?

As you mention, the runtime can be upgraded via Kata. Trustee upgrades are WIP, but they are in the domain of the Trustee operator and the upcoming Trustee Helm chart. @fidencio and @bpradipt can probably give more details here.

Contributor Author

Ok, so similar to the main project, Trustee is planning to deprecate its operator in favor of a Helm chart? I personally think this is great news, replacing a layer of operational complexity with a simpler solution. If you could provide more details or links to resources I'll update this accordingly. Thanks!


It's not quite as straightforward as replacing the Trustee operator with the Helm chart, although this is one potential outcome. The Helm chart will be designed a bit differently from the Trustee operator so that it is better suited to running Trustee inside of confidential containers itself. This may take the place of the Trustee operator, but since the operator is currently used in some sophisticated production environments, we're not going to rush on that. It's possible that we will end up with two options that have different applications, although this isn't ideal for maintenance. We will see what makes the most sense down the road.

Anyway, there is some discussion about this here, as well as a PR to add a Helm chart to Trustee.


* Describe how the project is following and implementing [secure software supply chain best practices](https://project.linuxfoundation.org/hubfs/CNCF\_SSCP\_v1.pdf)

The project has achieved SLSA Build Level 2 (see [blog](https://confidentialcontainers.org/blog/2025/02/17/confidential-containerscoco-and-supply-chain-levels-for-software-artifacts-slsa)), automatically generating signed provenance in `in-toto` format via GitHub Actions for components like `kata-containers`, `guest-components`, and `cloud-api-adaptor`.

It's also worth noting that supply chain security is itself a use case in CoCo's orbit. Artifacts and reference values are very important in confidential computing. Ultimately, we would like to build Confidential Containers itself inside of confidential containers.

Contributor Author

@halcyondude halcyondude Mar 3, 2026

Having reviewed some of the CI workflows and associated docs, I think it would be worthwhile (as a suggestion) for the project to post a follow-up to the blog from around this time last year (https://confidentialcontainers.org/blog/2025/02/17/confidential-containerscoco-and-supply-chain-levels-for-software-artifacts-slsa). It could cover what the project has done in this domain in the past year leading up to its Incubation application. It could also serve as a valuable case study (and "working example") of release processes for other projects by describing how CoCo has hardened its build and release pipelines, how it produces artifacts, its SLSA work, etc.

There's an initiative in TAG Operational Resilience focused on curating resources and examples of what's above that I'm sure would welcome connecting with the project (#1849, attn: @krol3).

Yeah, a sequel would be cool. I'm not sure how much our work would generalize to other projects, given that it is tied to some confidential computing ideas. Also, not sure about timeline, but let me tag some relevant people. @bpradipt @mythi @Xynnn007


* Describe the project’s resource requirements, including CPU, Network and Memory.

Worker nodes require virtualization support and a recommended minimum of 8GB RAM and 4 CPUs to accommodate the hypervisor/Kata overhead.

Note that your worker node should also have confidential computing support unless you are using the dev/test runtime.

