From a533e4fb2a1b271fdb98eea5272840226c43ee51 Mon Sep 17 00:00:00 2001
From: Chalindu Kodikara
Date: Mon, 18 May 2026 20:39:08 +0530
Subject: [PATCH] feat: add instructions for multi-cluster setup

Signed-off-by: Chalindu Kodikara
---
 observability-tracing-aws-xray/README.md | 287 +++++++++++++++++++----
 1 file changed, 243 insertions(+), 44 deletions(-)

diff --git a/observability-tracing-aws-xray/README.md b/observability-tracing-aws-xray/README.md
index b9d894a..2a74e07 100644
--- a/observability-tracing-aws-xray/README.md
+++ b/observability-tracing-aws-xray/README.md
@@ -1,14 +1,13 @@
 # Observability Tracing Module for AWS X-Ray
 
-The **Observability Tracing Module for AWS X-Ray** collects application
-traces via an **OpenTelemetry Collector** and stores them in **AWS X-Ray**.
-A Go adapter service implements the **OpenChoreo Tracing Adapter API** to
-query traces back from X-Ray for the OpenChoreo Observer.
+| | |
+| ------------- | --------- |
+| Code coverage | [![Codecov](https://codecov.io/gh/openchoreo/community-modules/branch/main/graph/badge.svg?component=observability_tracing_aws_xray)](https://codecov.io/gh/openchoreo/community-modules) |
 
 This module supports both:
 
-- **EKS clusters** using **EKS Pod Identity** or IRSA, recommended for production.
-- **Non-EKS Kubernetes clusters** such as **k3d**, **kind**, or Kubernetes
+- **EKS clusters** using **EKS Pod Identity**.
+- **Non-EKS Kubernetes clusters** such as k3d, kind, or Kubernetes
   running outside AWS, using static AWS credentials.
 
 ## Table of contents
@@ -67,7 +66,7 @@ Choose the deployment topology first, then choose the AWS authentication model f
 | --- | --- | --- | --- |
 | Single cluster | The OpenChoreo cluster where the observability plane and workloads run together. | Deploys the adapter and OpenTelemetry collector. | Defaults. |
 | Observability plane cluster | The cluster where the OpenChoreo observability plane is installed. | Deploys only the X-Ray Tracing Adapter. | `opentelemetry-collector.enabled=false` |
-| Data-plane / workflow-plane cluster | Each cluster that runs OpenChoreo workloads. | Deploys only the OpenTelemetry collector. | `adapter.enabled=false` |
+| Data-plane cluster | Each cluster that runs OpenChoreo workloads. | Deploys only the OpenTelemetry collector. | `adapter.enabled=false` |
 
 For one OpenChoreo installation, keep these values identical across all participating clusters:
 
@@ -292,9 +291,101 @@ Use the following trust policy for both roles when using EKS Pod Identity:
 }
 ```
 
+To create the roles and attach policies using the AWS CLI:
+
+```bash
+export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
+
+POD_IDENTITY_TRUST_POLICY='{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Effect": "Allow",
+      "Principal": {
+        "Service": "pods.eks.amazonaws.com"
+      },
+      "Action": [
+        "sts:AssumeRole",
+        "sts:TagSession"
+      ]
+    }
+  ]
+}'
+
+# Create the adapter role
+aws iam create-role \
+  --role-name OpenChoreoXRayTracingRoleForAdapter \
+  --assume-role-policy-document "$POD_IDENTITY_TRUST_POLICY"
+
+# Create the adapter policy
+aws iam create-policy \
+  --policy-name OpenChoreoXRayTracingAdapterPolicy \
+  --policy-document "$(cat <
+
+aws eks create-pod-identity-association \
+  --cluster-name "$EKS_CLUSTER_NAME" \
+  --namespace "$NS" \
+  --service-account tracing-adapter-aws-xray \
+  --role-arn "$ADAPTER_ROLE_ARN"
+
+aws eks create-pod-identity-association \
+  --cluster-name "$EKS_CLUSTER_NAME" \
+  --namespace "$NS" \
+  --service-account opentelemetry-collector \
+  --role-arn "$COLLECTOR_ROLE_ARN"
+```
+
+**Observability plane cluster** — adapter only:
+
+```bash
+export EKS_CLUSTER_NAME=
+
+aws eks create-pod-identity-association \
+  --cluster-name "$EKS_CLUSTER_NAME" \
+  --namespace "$NS" \
+  --service-account tracing-adapter-aws-xray \
+  --role-arn "$ADAPTER_ROLE_ARN"
+```
+
+**Each data-plane cluster** — OpenTelemetry collector only. Repeat for each cluster:
+
+```bash
+export EKS_CLUSTER_NAME=
+
+aws eks create-pod-identity-association \
+  --cluster-name "$EKS_CLUSTER_NAME" \
+  --namespace "$NS" \
+  --service-account opentelemetry-collector \
+  --role-arn "$COLLECTOR_ROLE_ARN"
+```
+
+### Step 5 — Restart workloads on each cluster
 
 EKS Pod Identity injects credentials only at pod creation time.
-Recreate the workloads so new pods receive Pod Identity credentials:
+Pods that started before the association existed have no credentials, so you may see errors such as:
+
+- `AccessDeniedException` from `assumed-role/` in OpenTelemetry collector logs.
+- `NoCredentialProviders` or `failed to retrieve credentials` in adapter logs.
+
+Recreate the workloads so new pods receive Pod Identity credentials. Run the commands that match your topology on each cluster.
+
+**Single-cluster topology:**
+
+```bash
+kubectl -n "$NS" rollout restart deployment/tracing-adapter-aws-xray
+kubectl -n "$NS" rollout restart deployment/opentelemetry-collector
+```
+
+**Observability plane cluster** — restart only the adapter:
 
 ```bash
-kubectl -n "$NS" rollout restart deploy/tracing-adapter-aws-xray
-kubectl -n "$NS" rollout restart deploy/opentelemetry-collector
+kubectl -n "$NS" rollout restart deployment/tracing-adapter-aws-xray
 ```
 
-If the collector Deployment name differs because of your Helm release name,
-inspect it first:
+**Each data-plane cluster** — restart only the OpenTelemetry collector:
 
 ```bash
-kubectl -n "$NS" get deploy
+kubectl -n "$NS" rollout restart deployment/opentelemetry-collector
 ```
 
-Verify that Pod Identity was injected into a new adapter pod:
+#### Verify Pod Identity injection
+
+Verify that Pod Identity was injected by checking a pod that runs in your topology.
+
+On clusters that run the **adapter** (single-cluster or observability plane):
 
 ```bash
 kubectl -n "$NS" get pod -l app=tracing-adapter-aws-xray -o name | head -1 \
@@ -412,8 +566,15 @@ kubectl -n "$NS" get pod -l app=tracing-adapter-aws-xray -o name | head -1 \
   | grep -E "AWS_CONTAINER|eks-pod-identity-token"
 ```
 
-If these values are missing, check that the namespace and ServiceAccount names
-in the Pod Identity associations exactly match the table above.
+On clusters that run the **OpenTelemetry collector** (single-cluster or data-plane):
+
+```bash
+kubectl -n "$NS" get pod -l app.kubernetes.io/name=opentelemetry-collector -o name | head -1 \
+  | xargs -I {} kubectl -n "$NS" get {} -o yaml \
+  | grep -E "AWS_CONTAINER|eks-pod-identity-token"
+```
+
+If these values are missing, check that the namespace and ServiceAccount names in the Pod Identity associations exactly match the table above.
 
 ## Installation on non-EKS clusters with static credentials
 
@@ -444,6 +605,57 @@ Create an IAM user and attach the custom
 
 Create access keys for this IAM user and export them as shown above.
 
+To create the IAM user and attach policies using the AWS CLI:
+
+```bash
+export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
+
+# Create the IAM user
+aws iam create-user --user-name OpenChoreoXRayTracingUser
+
+# Create the combined policy
+aws iam create-policy \
+  --policy-name OpenChoreoXRayTracingCombinedPolicy \
+  --policy-document "$(cat <
**Note:** The Helm chart versions specified in the installation commands above are for the latest module version compatible with the development version of OpenChoreo. Refer to the compatibility table below to determine the appropriate module version for your OpenChoreo installation.
+
+| Module Version | OpenChoreo Version |
+|----------------|--------------------|
+| v0.2.x | v1.1.x |