CI: Replace nightly image build with S3 aiter wheel installation #303

Open
gyohuangxin wants to merge 28 commits into main from ci/use-s3-aiter-wheel

Conversation

@gyohuangxin
Member

Summary

  • Remove build_atom_image job to eliminate nightly image pull and rebuild overhead
  • Use rocm/pytorch:latest as the base image instead of rocm/atom-dev:latest
  • Install the latest amd-aiter wheel from S3 (s3://framework-whls-nightlies/whl-staging/gfx942-gfx950/) at runtime instead of building from source
  • Install ATOM and other dependencies (lm-eval, hf_transfer, pybind11) directly inside the container
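
The "latest wheel" selection described in the third bullet can be sketched as a small shell function. This is a minimal sketch, assuming the `aws s3 ls` listing has the usual `DATE TIME SIZE KEY` columns and that wheel names contain no spaces. The function name, the sample listing, and its file names are invented for illustration.

```shell
# Pick the most recently modified amd_aiter wheel from `aws s3 ls`-style
# output read on stdin. Assumes each line is: DATE TIME SIZE KEY.
latest_wheel_from_listing() {
  grep 'amd_aiter.*\.whl$' \
    | sort -k1,2 \
    | tail -n 1 \
    | awk '{print $4}'
}

# Hypothetical sample listing (file names are made up for illustration).
sample_listing='2026-03-09 02:10:11   1048576 amd_aiter-0.1.0-cp312-cp312-linux_x86_64.whl
2026-03-10 02:11:45   1048577 amd_aiter-0.1.1-cp312-cp312-linux_x86_64.whl
2026-03-10 01:00:00       512 README.txt'

printf '%s\n' "$sample_listing" | latest_wheel_from_listing
```

The sort key is the upload timestamp, so this returns the most recently uploaded wheel rather than the highest version number. The two can differ if an older version is re-uploaded.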

Test plan

  • Verify the S3 wheel download step finds and installs the latest amd-aiter wheel correctly
  • Verify ATOM and all dependencies install successfully in the container
  • Run the full ATOM test matrix to confirm inference and accuracy tests pass

- Remove build_atom_image job that pulled nightly image and rebuilt
- Use rocm/pytorch:latest as base image directly
- Install latest amd-aiter wheel from S3 bucket at runtime
- Install ATOM and dependencies inside the container
- Remove fork-specific Dockerfile build logic and Docker Login step
Copilot AI review requested due to automatic review settings March 11, 2026 07:19
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the ATOM CI workflow to stop building/pushing a custom nightly Docker image and instead run tests from a base ROCm PyTorch image while installing amd-aiter, ATOM, and Python dependencies at container runtime.

Changes:

  • Removes the build_atom_image job and related fork/non-fork image handling.
  • Switches the test container base image to rocm/pytorch:latest.
  • Adds steps to download/install the latest amd-aiter wheel from S3 and to install ATOM + dependencies inside the running container.

Comment on lines +219 to +220
pip install awscli 2>/dev/null || true

Copilot AI Mar 11, 2026

pip install awscli 2>/dev/null || true can leave the container without the aws CLI (e.g., transient pip failure) and then the next aws s3 ... will fail with a less actionable command not found. Consider failing the step if awscli installation fails and/or add an explicit command -v aws check with a clear error message.

Suggested change
- pip install awscli 2>/dev/null || true
+ if ! command -v aws >/dev/null 2>&1; then
+   echo '=== Installing awscli ==='
+   if ! pip install awscli; then
+     echo 'ERROR: Failed to install awscli; cannot download amd-aiter wheel from S3'
+     exit 1
+   fi
+ fi
+ if ! command -v aws >/dev/null 2>&1; then
+   echo 'ERROR: aws CLI is not available after installation attempt; cannot download amd-aiter wheel from S3'
+   exit 1
+ fi

Comment on lines +221 to +241
echo '=== Finding latest amd-aiter wheel from S3 ==='
LATEST_WHL=\$(aws s3 ls ${{ env.AITER_S3_BUCKET }}/ --no-sign-request \
| grep 'amd_aiter.*\.whl' \
| sort -k1,2 \
| tail -1 \
| awk '{print \$4}')

if [ -z \"\$LATEST_WHL\" ]; then
echo 'ERROR: No amd-aiter wheel found in S3 bucket'
exit 1
fi

echo \"Latest wheel: \$LATEST_WHL\"
echo '=== Downloading wheel ==='
aws s3 cp ${{ env.AITER_S3_BUCKET }}/\$LATEST_WHL /tmp/\$LATEST_WHL --no-sign-request

echo '=== Uninstalling existing amd-aiter ==='
pip uninstall -y amd-aiter || true

echo '=== Installing amd-aiter from wheel ==='
pip install /tmp/\$LATEST_WHL
Copilot AI Mar 11, 2026

Selecting and installing the "latest" wheel from S3 makes CI runs non-reproducible and can introduce flakiness (a newer wheel could land between reruns of the same commit). Consider pinning the wheel (e.g., via a version/commit marker file in S3, an explicit env override, or a deterministic naming convention) and logging the resolved wheel’s checksum/version so failures can be reproduced.

Suggested change
- echo '=== Finding latest amd-aiter wheel from S3 ==='
- LATEST_WHL=\$(aws s3 ls ${{ env.AITER_S3_BUCKET }}/ --no-sign-request \
-   | grep 'amd_aiter.*\.whl' \
-   | sort -k1,2 \
-   | tail -1 \
-   | awk '{print \$4}')
- if [ -z \"\$LATEST_WHL\" ]; then
-   echo 'ERROR: No amd-aiter wheel found in S3 bucket'
-   exit 1
- fi
- echo \"Latest wheel: \$LATEST_WHL\"
- echo '=== Downloading wheel ==='
- aws s3 cp ${{ env.AITER_S3_BUCKET }}/\$LATEST_WHL /tmp/\$LATEST_WHL --no-sign-request
- echo '=== Uninstalling existing amd-aiter ==='
- pip uninstall -y amd-aiter || true
- echo '=== Installing amd-aiter from wheel ==='
- pip install /tmp/\$LATEST_WHL
+ if [ -n \"\${AITER_WHEEL_NAME:-}\" ]; then
+   echo '=== Using pinned amd-aiter wheel from AITER_WHEEL_NAME ==='
+   SELECTED_WHL=\"\$AITER_WHEEL_NAME\"
+ else
+   echo '=== Finding latest amd-aiter wheel from S3 ==='
+   SELECTED_WHL=\$(aws s3 ls ${{ env.AITER_S3_BUCKET }}/ --no-sign-request \
+     | grep 'amd_aiter.*\.whl' \
+     | sort -k1,2 \
+     | tail -1 \
+     | awk '{print \$4}')
+ fi
+ if [ -z \"\$SELECTED_WHL\" ]; then
+   echo 'ERROR: No amd-aiter wheel found in S3 bucket'
+   exit 1
+ fi
+ echo \"Selected wheel: \$SELECTED_WHL\"
+ echo '=== Downloading wheel ==='
+ aws s3 cp ${{ env.AITER_S3_BUCKET }}/\$SELECTED_WHL /tmp/\$SELECTED_WHL --no-sign-request
+ echo '=== Wheel SHA256 checksum ==='
+ sha256sum /tmp/\$SELECTED_WHL || echo 'WARNING: sha256sum command failed'
+ echo '=== Uninstalling existing amd-aiter ==='
+ pip uninstall -y amd-aiter || true
+ echo '=== Installing amd-aiter from wheel ==='
+ pip install /tmp/\$SELECTED_WHL

Comment on lines +233 to +242
echo \"Latest wheel: \$LATEST_WHL\"
echo '=== Downloading wheel ==='
aws s3 cp ${{ env.AITER_S3_BUCKET }}/\$LATEST_WHL /tmp/\$LATEST_WHL --no-sign-request

echo '=== Uninstalling existing amd-aiter ==='
pip uninstall -y amd-aiter || true

echo '=== Installing amd-aiter from wheel ==='
pip install /tmp/\$LATEST_WHL

Copilot AI Mar 11, 2026

The workflow downloads and installs a wheel from a public S3 location with --no-sign-request but does not perform any integrity verification before pip install. To reduce supply-chain risk, consider fetching a corresponding checksum/signature (e.g., .sha256) and verifying it before installation, or using a signed URL / authenticated access.
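
The integrity check suggested above could look roughly like the following. This is a minimal sketch, not the workflow's actual code: `verify_sha256` is a hypothetical helper, and where the expected digest comes from (for example a `.sha256` file published next to the wheel) is an assumption the PR does not confirm. The demonstration uses a throwaway file rather than a real wheel.

```shell
# Verify a file against an expected SHA-256 digest before installing it.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" != "$expected" ]; then
    echo "ERROR: checksum mismatch for $file" >&2
    echo "  expected: $expected" >&2
    echo "  actual:   $actual" >&2
    return 1
  fi
  echo "OK: $file matches expected SHA-256"
}

# Self-contained demonstration with a scratch file instead of a real wheel.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
good=$(sha256sum "$tmp" | awk '{print $1}')
verify_sha256 "$tmp" "$good"
verify_sha256 "$tmp" "0000" || echo "mismatch detected as expected"
rm -f "$tmp"
```

In the workflow, the call would sit between the download and `pip install`, with a failed check ending the step.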

…runner

- Add aws-actions/configure-aws-credentials step with IAM role
- Download aiter wheel on the runner (not inside container)
- Copy wheel into container via docker cp for installation
- Remove AWS credentials and S3 bucket configuration
- Download latest aiter wheel from ROCm/aiter CI artifacts via GitHub API
- Use GITHUB_TOKEN for cross-repo artifact access
Copilot AI review requested due to automatic review settings March 11, 2026 08:31
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


Comment on lines +159 to +166
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN rm -rf /app/aiter-test
RUN git clone https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true

RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .

RUN echo "=== ATOM version AFTER installation ===" && pip show atom || true
EOF
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

- name: Build Docker image for forked repo
if: (matrix.run_on_pr == true || github.event_name != 'pull_request') && github.event.pull_request.head.repo.fork
run: |
docker build --pull --network=host \
--no-cache \
-t atom_test:ci \
-f Dockerfile.mod .
ARTIFACT_JSON=$(gh api "repos/ROCm/aiter/actions/artifacts?per_page=100" \
--jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | sort_by(.created_at) | last')
Copilot AI Mar 11, 2026

The workflow uses ${{ secrets.GITHUB_TOKEN }} (scoped to this repo) to call repos/ROCm/aiter/actions/artifacts/.... GITHUB_TOKEN cannot access private resources (including Actions artifacts) in a different repository, so this step will fail on push/schedule and on PRs (especially forks). Consider switching to the S3 wheel source described in the PR, or minting a token with actions:read on ROCm/aiter (GitHub App/PAT) and adding a fork-safe fallback/skip path.

Comment on lines +157 to +163
- name: Download latest aiter wheel from CI artifacts
if: matrix.run_on_pr == true || github.event_name != 'pull_request'
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN rm -rf /app/aiter-test
RUN git clone https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true

RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .

RUN echo "=== ATOM version AFTER installation ===" && pip show atom || true
EOF
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="
Copilot AI Mar 11, 2026

PR description says the latest amd-aiter wheel is installed from S3 (s3://framework-whls-nightlies/...), but the workflow now downloads a GitHub Actions artifact from ROCm/aiter. If S3 is the intended source of truth, this step should be updated to match (or the PR description updated) to avoid confusion about provenance and required credentials.

Copilot AI review requested due to automatic review settings March 11, 2026 14:33
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


Comment on lines +165 to +168
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | sort_by(.created_at) | last')

ARTIFACT_NAME=$(echo "$ARTIFACT_JSON" | jq -r '.name')
ARTIFACT_ID=$(echo "$ARTIFACT_JSON" | jq -r '.id')
Copilot AI Mar 11, 2026

The workflow selects the most recently created aiter-whl-main artifact and installs it without any version/commit pinning or integrity verification. This makes CI non-reproducible and increases supply-chain risk if an unexpected artifact is published. Consider pinning to a specific aiter commit/SHA (or a versioned wheel path), and/or verifying a checksum/signature before installing.

Copilot AI review requested due to automatic review settings March 12, 2026 03:24
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 8 comments.


Comment on lines +167 to +173
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

# Search Aiter Test workflow runs on main branch for one that has an aiter-whl artifact
RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")
Copilot AI Mar 12, 2026

This uses the workflow run’s GITHUB_TOKEN to download artifacts from a different repository (ROCm/aiter). The token is scoped to the current repo and typically cannot access cross-repo Actions artifacts, which will commonly fail with 403/404. Use a dedicated secret (PAT or GitHub App token) that has actions:read on ROCm/aiter, or switch to the PR-described S3 wheel source to avoid cross-repo artifact auth entirely.
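
Because the script calls `curl -s` without checking the response, a 403 or 404 from the API would likely surface only later as a confusing jq error. A minimal sketch of an explicit status check follows. `check_api_status` is a hypothetical helper name and the messages are illustrative, not taken from the workflow.

```shell
# Map an HTTP status code from an API call to a pass/fail result
# with a readable error message.
check_api_status() {
  status=$1
  case "$status" in
    2??) return 0 ;;
    401|403)
      echo "ERROR: HTTP $status: token lacks access to the target repository" >&2
      return 1 ;;
    404)
      echo "ERROR: HTTP 404: resource not found or not visible to this token" >&2
      return 1 ;;
    *)
      echo "ERROR: unexpected HTTP status $status" >&2
      return 1 ;;
  esac
}

# Intended use in the workflow (not executed here; needs network and a token):
#   status=$(curl -s -o /tmp/runs.json -w '%{http_code}' -H "$AUTH_HEADER" "$URL")
#   check_api_status "$status" || exit 1
```

Capturing the status with `-w '%{http_code}'` while writing the body to a file keeps the JSON available for jq and still lets the step stop on an authorization failure.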

Comment on lines +178 to +180
ARTIFACT_JSON=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/runs/$RUN_ID/artifacts" \
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | first')
Copilot AI Mar 12, 2026

This uses the workflow run’s GITHUB_TOKEN to download artifacts from a different repository (ROCm/aiter). The token is scoped to the current repo and typically cannot access cross-repo Actions artifacts, which will commonly fail with 403/404. Use a dedicated secret (PAT or GitHub App token) that has actions:read on ROCm/aiter, or switch to the PR-described S3 wheel source to avoid cross-repo artifact auth entirely.


ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
Copilot AI Mar 12, 2026

The script selects the first run (most recent by API ordering) that contains a non-expired artifact, but does not require the run to be status=completed and conclusion=success. This can pick artifacts from failed or in-progress runs and lead to installing a broken wheel. Filter workflow runs to completed+successful before searching artifacts (or validate run conclusion before accepting the artifact).

Suggested change
- for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
+ for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[] | select(.status=="completed" and .conclusion=="success") | .id'); do

Comment on lines +185 to +188
echo "Found artifact in run $RUN_ID: $ARTIFACT_NAME (ID: $ARTIFACT_ID)"
break
fi
done
Copilot AI Mar 12, 2026

The script selects the first run (most recent by API ordering) that contains a non-expired artifact, but does not require the run to be status=completed and conclusion=success. This can pick artifacts from failed or in-progress runs and lead to installing a broken wheel. Filter workflow runs to completed+successful before searching artifacts (or validate run conclusion before accepting the artifact).


ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
Copilot AI Mar 12, 2026

This step assumes jq and unzip are present on the runner host. Previously, jq was installed inside the Docker image, but now parsing/downloading happens before the container is started. Add an explicit dependency-install step for the runner (or rewrite parsing to avoid jq, e.g., using python -c), otherwise this will fail on self-hosted runners that don’t preinstall these tools.
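
One way to make the missing-tool case fail early with a clear message is a preflight check at the top of the step. This is a minimal sketch; `require_tools` is a hypothetical helper name and is not part of the workflow under review.

```shell
# Fail early, with a clear message, if any required command is missing.
# Usage: require_tools <command>...
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "ERROR: required tool '$tool' is not installed on this runner" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

In the download step this would be called as `require_tools curl jq unzip || exit 1` before the first API request, so every missing tool is reported in one pass.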

curl -s -L -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/artifacts/$ARTIFACT_ID/zip" \
-o /tmp/aiter-whl.zip
unzip -o /tmp/aiter-whl.zip -d /tmp/aiter-whl
Copilot AI Mar 12, 2026

This step assumes jq and unzip are present on the runner host. Previously, jq was installed inside the Docker image, but now parsing/downloading happens before the container is started. Add an explicit dependency-install step for the runner (or rewrite parsing to avoid jq, e.g., using python -c), otherwise this will fail on self-hosted runners that don’t preinstall these tools.

Comment on lines +203 to +209
AITER_WHL=$(ls /tmp/aiter-whl/amd_aiter*.whl 2>/dev/null | head -1)
if [ -z "$AITER_WHL" ]; then
echo "ERROR: No amd_aiter wheel found in artifact"
ls -la /tmp/aiter-whl/
exit 1
fi

Copilot AI Mar 12, 2026

If the artifact ever contains multiple matching wheels, head -1 is not guaranteed to pick the newest/desired build. Prefer selecting deterministically (e.g., version-sort then take the highest) or enforce that exactly one wheel is present and fail otherwise.

Suggested change
- AITER_WHL=$(ls /tmp/aiter-whl/amd_aiter*.whl 2>/dev/null | head -1)
- if [ -z "$AITER_WHL" ]; then
-   echo "ERROR: No amd_aiter wheel found in artifact"
-   ls -la /tmp/aiter-whl/
-   exit 1
- fi
+ AITER_WHL_CANDIDATES=$(ls -1 /tmp/aiter-whl/amd_aiter*.whl 2>/dev/null | sort -V || true)
+ if [ -z "$AITER_WHL_CANDIDATES" ]; then
+   echo "ERROR: No amd_aiter wheel found in artifact"
+   ls -la /tmp/aiter-whl/
+   exit 1
+ fi
+ AITER_WHL=$(echo "$AITER_WHL_CANDIDATES" | tail -n 1)
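
The suggestion above takes the version-sort route. The other option named in the comment, enforcing that exactly one wheel is present, could be sketched as follows. `single_wheel` is a hypothetical helper, and the sketch assumes wheel file names contain no newlines.

```shell
# Print the single amd_aiter wheel in a directory; fail if there are
# zero or several. Usage: single_wheel <directory>
single_wheel() {
  dir=$1
  count=0
  found=
  for f in "$dir"/amd_aiter*.whl; do
    [ -e "$f" ] || continue   # the glob matched nothing
    count=$((count + 1))
    found=$f
  done
  if [ "$count" -ne 1 ]; then
    echo "ERROR: expected exactly one amd_aiter wheel in $dir, found $count" >&2
    return 1
  fi
  printf '%s\n' "$found"
}
```

Failing on multiple wheels trades convenience for predictability: an artifact whose layout changes stops the job instead of silently installing an arbitrary build.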

fi

echo "Downloaded wheel: $AITER_WHL"
echo "AITER_WHL_PATH=$AITER_WHL" >> $GITHUB_ENV
Copilot AI Mar 12, 2026

$GITHUB_ENV should be quoted when redirecting to avoid issues with unexpected whitespace/shell expansion. Use a quoted redirect target (and consider using printf for robustness).

Suggested change
- echo "AITER_WHL_PATH=$AITER_WHL" >> $GITHUB_ENV
+ echo "AITER_WHL_PATH=$AITER_WHL" >> "$GITHUB_ENV"
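
The `printf` variant mentioned in the comment could be wrapped in a small helper. This is a minimal sketch for single-line values only; `set_github_env` is a hypothetical name, and the path in the demonstration is made up.

```shell
# Append NAME=VALUE to the file named by $GITHUB_ENV, with the redirect
# target quoted. printf avoids echo's implementation-specific handling
# of backslashes and leading dashes.
set_github_env() {
  name=$1
  value=$2
  printf '%s=%s\n' "$name" "$value" >> "$GITHUB_ENV"
}

# Local demonstration: use a scratch file when not running inside Actions.
GITHUB_ENV=${GITHUB_ENV:-$(mktemp)}
set_github_env AITER_WHL_PATH "/tmp/aiter-whl/example.whl"
```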

Copilot AI review requested due to automatic review settings March 12, 2026 05:56
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


Comment on lines +273 to +274
WHL_NAME=$(basename "${{ env.AITER_WHL_PATH }}")
docker cp "${{ env.AITER_WHL_PATH }}" "$CONTAINER_NAME:/tmp/$WHL_NAME"
Copilot AI Mar 12, 2026

AITER_WHL_PATH is set via $GITHUB_ENV, but later referenced using the ${{ env.AITER_WHL_PATH }} expression context. Values written to $GITHUB_ENV are available as shell environment variables in subsequent steps (e.g., $AITER_WHL_PATH), but are not reliably available through the env expression context, which can make basename/docker cp run with an empty path. Use the runtime shell env var (or step outputs) instead of ${{ env.* }} here.

Suggested change
- WHL_NAME=$(basename "${{ env.AITER_WHL_PATH }}")
- docker cp "${{ env.AITER_WHL_PATH }}" "$CONTAINER_NAME:/tmp/$WHL_NAME"
+ WHL_NAME=$(basename "$AITER_WHL_PATH")
+ docker cp "$AITER_WHL_PATH" "$CONTAINER_NAME:/tmp/$WHL_NAME"

Comment on lines +167 to +200
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

# Search Aiter Test workflow runs on main branch for one that has an aiter-whl artifact
RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
ARTIFACT_JSON=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/runs/$RUN_ID/artifacts" \
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | first')

if [ "$ARTIFACT_JSON" != "null" ] && [ -n "$ARTIFACT_JSON" ]; then
ARTIFACT_ID=$(echo "$ARTIFACT_JSON" | jq -r '.id')
ARTIFACT_NAME=$(echo "$ARTIFACT_JSON" | jq -r '.name')
echo "Found artifact in run $RUN_ID: $ARTIFACT_NAME (ID: $ARTIFACT_ID)"
break
fi
done

- name: Build Docker image for forked repo
if: (matrix.run_on_pr == true || github.event_name != 'pull_request') && github.event.pull_request.head.repo.fork
run: |
docker build --pull --network=host \
--no-cache \
-t atom_test:ci \
-f Dockerfile.mod .
if [ -z "$ARTIFACT_ID" ] || [ "$ARTIFACT_ID" = "null" ]; then
echo "ERROR: No aiter-whl-main artifact found in recent Aiter Test runs"
exit 1
fi

echo "=== Downloading artifact ==="
mkdir -p /tmp/aiter-whl
curl -s -L -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/artifacts/$ARTIFACT_ID/zip" \
-o /tmp/aiter-whl.zip
unzip -o /tmp/aiter-whl.zip -d /tmp/aiter-whl
Copilot AI Mar 12, 2026

This step attempts to download a workflow artifact from the ROCm/aiter repository using ${{ secrets.GITHUB_TOKEN }}. The workflow token is scoped to the current repository and typically cannot access private/collaborator-only resources (including Actions artifacts) in other repositories; this is likely to fail with 403s, especially on PRs from forks where secrets/PATs aren’t available. Consider switching to the PR-described S3 wheel source (or another publicly readable location), or use a dedicated PAT with explicit access and a fork-safe fallback behavior.

@gyohuangxin gyohuangxin marked this pull request as draft March 12, 2026 08:56
@gyohuangxin gyohuangxin marked this pull request as ready for review March 13, 2026 02:50
Copilot AI review requested due to automatic review settings March 13, 2026 02:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 6 comments.


Comment on lines +239 to +244
RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
Comment on lines +228 to +233
- name: Download latest aiter wheel from CI artifacts
if: matrix.run_on_pr == true || github.event_name != 'pull_request'
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN rm -rf /app/aiter-test
RUN git clone https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true
RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="


env:
ATOM_BASE_NIGTHLY_IMAGE: rocm/atom-dev:latest
ATOM_BASE_IMAGE: rocm/pytorch:latest
Comment on lines 102 to 104
  atom:
- needs: [pre-checks, build_atom_image]
+ needs: [pre-checks]
  name: ATOM Test
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
Comment on lines +236 to +237
AITER_TEST_WORKFLOW_ID=179476100

The build_atom_image job is no longer needed since we now install
aiter from CI artifact wheels and ATOM/dependencies at runtime
inside the container.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.


Comment on lines +46 to +52
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")


ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
Comment on lines +48 to +49
AITER_TEST_WORKFLOW_ID=179476100

Comment on lines +41 to +44
- name: Find and download latest aiter wheel
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN wget https://github.com/stedolan/jq/releases/download/jq-1.7/jq-linux64 -O jq
RUN chmod +x jq
RUN mv jq /usr/local/bin/jq
RUN rm -rf /app/aiter-test
RUN git clone --depth 1 https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true

RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .

RUN echo "=== ATOM version AFTER installation ===" && pip show atom || true
EOF
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="
Comment on lines 25 to 26
GITHUB_REPO_URL: ${{ github.event.pull_request.head.repo.clone_url || 'https://github.com/ROCm/ATOM.git' }}
GITHUB_COMMIT_SHA: ${{ github.event.pull_request.head.sha || github.event.head_commit.id }}
Copilot AI review requested due to automatic review settings March 16, 2026 06:18
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 7 comments.




- docker run -dt --device=/dev/kfd $DEVICE_FLAG \
+ docker run -dt --pull always --device=/dev/kfd $DEVICE_FLAG \
Comment on lines +47 to +51
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")
Comment on lines +41 to +45
- name: Find and download latest aiter wheel
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN wget https://github.com/stedolan/jq/releases/download/jq-1.7/jq-linux64 -O jq
RUN chmod +x jq
RUN mv jq /usr/local/bin/jq
RUN rm -rf /app/aiter-test
RUN git clone --depth 1 https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true

RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .

RUN echo "=== ATOM version AFTER installation ===" && pip show atom || true
EOF
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

Comment on lines +50 to +52
RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

Comment on lines +48 to +49
AITER_TEST_WORKFLOW_ID=179476100

Comment on lines 97 to 100
  atom:
- needs: [pre-checks, build_atom_image]
+ needs: [pre-checks, download_aiter_wheel]
  name: ATOM Test
  strategy:

env:
ATOM_BASE_NIGTHLY_IMAGE: rocm/atom-dev:latest
ATOM_BASE_IMAGE: rocm/pytorch:latest
Copilot AI review requested due to automatic review settings March 16, 2026 17:19
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 6 comments.



- docker run -dt --device=/dev/kfd $DEVICE_FLAG \
+ docker run -dt --pull always --device=/dev/kfd $DEVICE_FLAG \
Comment on lines 25 to 26
GITHUB_REPO_URL: ${{ github.event.pull_request.head.repo.clone_url || 'https://github.com/ROCm/ATOM.git' }}
GITHUB_COMMIT_SHA: ${{ github.event.pull_request.head.sha || github.event.head_commit.id }}
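
One thing to check in this fallback chain: `github.event.head_commit` is only populated for push events, so on other triggers (for example a scheduled or manually dispatched run) the expression can evaluate to an empty string. A minimal sketch of a guard that could run early in a step is shown below. `resolve_commit_sha` and `PR_HEAD_SHA` are hypothetical names introduced for illustration; `GITHUB_SHA` is the default variable that GitHub Actions sets on every run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the workflow): prefer the PR head SHA,
# fall back to a second value, and refuse anything that is not a full SHA.
resolve_commit_sha() {
  local pr_head_sha="${1:-}" fallback_sha="${2:-}"
  local sha="${pr_head_sha:-$fallback_sha}"
  if [[ ! "$sha" =~ ^[0-9a-f]{40}$ ]]; then
    echo "ERROR: could not resolve a full commit SHA" >&2
    return 1
  fi
  printf '%s\n' "$sha"
}

# PR_HEAD_SHA is an assumed variable that the workflow would have to export;
# GITHUB_SHA is provided by the Actions runner.
resolve_commit_sha "${PR_HEAD_SHA:-}" "${GITHUB_SHA:-}" \
  || echo "no commit SHA available in this environment"
```

Failing fast here would surface an empty SHA before a later `git checkout` receives it.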
Comment on lines +46 to +52
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

Comment on lines +41 to +48
- name: Find and download latest aiter wheel
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN wget https://github.com/stedolan/jq/releases/download/jq-1.7/jq-linux64 -O jq
RUN chmod +x jq
RUN mv jq /usr/local/bin/jq
RUN rm -rf /app/aiter-test
RUN git clone --depth 1 https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true

RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .

RUN echo "=== ATOM version AFTER installation ===" && pip show atom || true
EOF
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100
Comment on lines +46 to +49
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100


env:
ATOM_BASE_NIGTHLY_IMAGE: rocm/atom-dev:latest
ATOM_BASE_IMAGE: rocm/pytorch:latest
Copilot AI review requested due to automatic review settings March 17, 2026 07:56
@gyohuangxin gyohuangxin force-pushed the ci/use-s3-aiter-wheel branch from a5eaede to a366f00 Compare March 17, 2026 07:56
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

.github/workflows/atom-test.yaml:273

  • docker run mounts /workspace twice (both ${GITHUB_WORKSPACE:-$PWD} and ${{ github.workspace }}) and also sets -w /workspace twice. This duplication is easy to miss when editing the command and can lead to confusion about which path is authoritative; consider removing the duplicate -v/-w entries.
          docker run -dt --pull always --device=/dev/kfd $DEVICE_FLAG \
          -v "${GITHUB_WORKSPACE:-$PWD}":/workspace \
          $MODEL_MOUNT \
          -w /workspace \
          --ipc=host --group-add video \
          --shm-size=16G \
          --privileged \
          --cap-add=SYS_PTRACE \
          -e HF_TOKEN="${HF_TOKEN:-}" \
          --env-file /tmp/env_file.txt \
          --security-opt seccomp=unconfined \
          --ulimit memlock=-1 \
          --ulimit stack=67108864 \
          -e ATOM_DISABLE_MMAP=true \
          -v "${{ github.workspace }}:/workspace" \
          -w /workspace \
          --name "$CONTAINER_NAME" \
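
The duplication described in the suppressed comment could be avoided by assembling the options in a bash array, so that the workspace mount and the working directory each appear exactly once. The following is only a sketch under that assumption: `build_run_args` is a hypothetical helper, the option list is shortened to a few of the flags shown above, and it prints the arguments rather than invoking `docker`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the workflow): collect `docker run` options in
# one array so each mount and workdir flag is declared a single time.
build_run_args() {
  local workspace="$1" container_name="$2"
  local -a args=(
    -dt --pull always
    --device=/dev/kfd
    -v "${workspace}:/workspace"
    -w /workspace
    --ipc=host --group-add video
    --shm-size=16G
    --name "$container_name"
  )
  # One argument per line, so the list is easy to inspect or diff.
  printf '%s\n' "${args[@]}"
}

# "atom_test" is a placeholder container name for this demo.
build_run_args "${GITHUB_WORKSPACE:-$PWD}" "atom_test"
```

In a real step the array would be expanded as `docker run "${args[@]}" <image>`, which keeps quoting intact for paths that contain spaces.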

Comment on lines +41 to +52
- name: Find and download latest aiter wheel
run: |
cat <<EOF > Dockerfile.mod
FROM ${{ env.ATOM_BASE_NIGTHLY_IMAGE }}
RUN pip install -U lm-eval[api]
RUN pip show lm-eval || true
RUN pip install hf_transfer
RUN pip show hf_transfer || true
RUN echo "=== Aiter version BEFORE uninstall ===" && pip show amd-aiter || true
RUN pip uninstall -y amd-aiter
RUN pip install --upgrade "pybind11>=3.0.1"
RUN pip show pybind11
RUN wget https://github.com/stedolan/jq/releases/download/jq-1.7/jq-linux64 -O jq
RUN chmod +x jq
RUN mv jq /usr/local/bin/jq
RUN rm -rf /app/aiter-test
RUN git clone --depth 1 https://github.com/ROCm/aiter.git /app/aiter-test && \\
cd /app/aiter-test && \\
git checkout HEAD && \\
git submodule sync && git submodule update --init --recursive && \\
MAX_JOBS=64 PREBUILD_KERNELS=0 GPU_ARCHS=gfx950 python3 setup.py develop
RUN echo "=== Aiter version AFTER installation ===" && pip show amd-aiter || true

RUN echo "=== ATOM version BEFORE uninstall ===" && pip show atom || true
RUN pip uninstall -y atom
RUN rm -rf /app/ATOM
RUN git clone ${{ env.GITHUB_REPO_URL }} /app/ATOM && \\
cd /app/ATOM && \\
git checkout ${{ env.GITHUB_COMMIT_SHA }} && \\
pip install -e .

RUN echo "=== ATOM version AFTER installation ===" && pip show atom || true
EOF
set -euo pipefail
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

Comment on lines +46 to +77
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
ARTIFACT_JSON=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/runs/$RUN_ID/artifacts" \
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | first')

if [ "$ARTIFACT_JSON" != "null" ] && [ -n "$ARTIFACT_JSON" ]; then
ARTIFACT_ID=$(echo "$ARTIFACT_JSON" | jq -r '.id')
ARTIFACT_NAME=$(echo "$ARTIFACT_JSON" | jq -r '.name')
echo "Found artifact in run $RUN_ID: $ARTIFACT_NAME (ID: $ARTIFACT_ID)"
break
fi
done

- name: Build Docker image
if: ${{ !github.event.pull_request.head.repo.fork }}
run: |
docker build --pull --network=host \
--no-cache \
-t atom_test:ci \
-f Dockerfile.mod .
if [ -z "$ARTIFACT_ID" ] || [ "$ARTIFACT_ID" = "null" ]; then
echo "ERROR: No aiter-whl-main artifact found in recent Aiter Test runs"
exit 1
fi

- name: Push Docker image
if: ${{ !github.event.pull_request.head.repo.fork }}
run: |
IMAGE_TAG=rocm/atom-dev:pre-build-${{ env.GITHUB_COMMIT_SHA }}
docker tag atom_test:ci $IMAGE_TAG
echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
docker push $IMAGE_TAG
echo "=== Downloading artifact ==="
mkdir -p aiter-whl
curl -s -L -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/artifacts/$ARTIFACT_ID/zip" \
-o aiter-whl.zip
Comment on lines +48 to +51
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")
Comment on lines +50 to +66
RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
ARTIFACT_JSON=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/runs/$RUN_ID/artifacts" \
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | first')

if [ "$ARTIFACT_JSON" != "null" ] && [ -n "$ARTIFACT_JSON" ]; then
ARTIFACT_ID=$(echo "$ARTIFACT_JSON" | jq -r '.id')
ARTIFACT_NAME=$(echo "$ARTIFACT_JSON" | jq -r '.name')
echo "Found artifact in run $RUN_ID: $ARTIFACT_NAME (ID: $ARTIFACT_ID)"
break
fi
done
Comment on lines 25 to 26
GITHUB_REPO_URL: ${{ github.event.pull_request.head.repo.clone_url || 'https://github.com/ROCm/ATOM.git' }}
GITHUB_COMMIT_SHA: ${{ github.event.pull_request.head.sha || github.event.head_commit.id }}
Copilot AI review requested due to automatic review settings March 18, 2026 06:07
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.


Comment on lines +46 to +52
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

Comment on lines +44 to +59
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
ARTIFACT_JSON=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/runs/$RUN_ID/artifacts" \
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | first')


env:
ATOM_BASE_NIGHTLY_IMAGE: rocm/atom-dev:latest
ATOM_BASE_IMAGE: rocm/pytorch:latest


- docker run -dt --device=/dev/kfd $DEVICE_FLAG \
+ docker run -dt --pull always --device=/dev/kfd $DEVICE_FLAG \
Copilot AI review requested due to automatic review settings March 19, 2026 07:20
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.


Comment on lines +46 to +49
download_aiter_wheel:
if: ${{ needs.check-signal.result == 'success' && (!github.event.pull_request || github.event.pull_request.draft == false) }}
needs: [check-signal]
- name: Build ATOM image
- runs-on: build-only-atom
+ name: Download aiter wheel
Comment on lines +57 to +69
API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

ARTIFACT_ID=""
ARTIFACT_NAME=""
for RUN_ID in $(echo "$RUNS" | jq -r '.workflow_runs[].id'); do
ARTIFACT_JSON=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/runs/$RUN_ID/artifacts" \
| jq '[.artifacts[] | select(.name | startswith("aiter-whl-main")) | select(.expired == false)] | first')
Comment on lines +55 to +62
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")

env:
ATOM_BASE_NIGHTLY_IMAGE: rocm/atom-dev:latest
ATOM_BASE_IMAGE: rocm/pytorch:latest
Copilot AI review requested due to automatic review settings March 19, 2026 08:35
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 9 comments.


Comment on lines +55 to +62
echo "=== Finding latest aiter-whl-main artifact from ROCm/aiter ==="

API_URL="https://api.github.com"
AUTH_HEADER="Authorization: token ${{ secrets.GITHUB_TOKEN }}"
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")
Comment on lines +59 to +62
AITER_TEST_WORKFLOW_ID=179476100

RUNS=$(curl -s -H "$AUTH_HEADER" \
"$API_URL/repos/ROCm/aiter/actions/workflows/$AITER_TEST_WORKFLOW_ID/runs?per_page=100&branch=main&event=push")
with:
name: aiter-whl
path: /tmp/aiter-whl



- docker run -dt --device=/dev/kfd $DEVICE_FLAG \
+ docker run -dt --pull always --device=/dev/kfd $DEVICE_FLAG \
Comment on lines +314 to +316
echo '=== Installing amd-aiter from wheel ==='
pip install /tmp/$WHL_NAME

- name: Build ATOM image
- runs-on: build-only-atom
+ name: Download aiter wheel
+ runs-on: ubuntu-latest
Comment on lines +100 to +105

- name: Upload aiter wheel
uses: actions/upload-artifact@v4
with:
name: aiter-whl
path: aiter-whl/amd_aiter*.whl
Comment on lines +92 to +96
AITER_WHL=$(ls -t aiter-whl/amd_aiter*.whl 2>/dev/null | head -1)
if [ -z "$AITER_WHL" ]; then
echo "ERROR: No amd_aiter wheel found in artifact"
ls -la aiter-whl/
exit 1
Comment on lines +298 to +315
AITER_WHL=$(ls -t /tmp/aiter-whl/amd_aiter*.whl 2>/dev/null | head -1)
if [ -z "$AITER_WHL" ]; then
echo "ERROR: No amd_aiter wheel found"
ls -la /tmp/aiter-whl/
exit 1
fi

echo "=== Copying wheel into container ==="
WHL_NAME=$(basename "$AITER_WHL")
docker cp "$AITER_WHL" "$CONTAINER_NAME:/tmp/$WHL_NAME"

docker exec "$CONTAINER_NAME" bash -lc "
set -euo pipefail
echo '=== Uninstalling existing amd-aiter ==='
pip uninstall -y amd-aiter || true

echo '=== Installing amd-aiter from wheel ==='
pip install /tmp/$WHL_NAME
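
The wheel lookup above relies on parsing `ls -t` output. As a rough alternative, the newest matching file can be selected with a glob and bash's `-nt` (newer than) file test, which also makes the "no wheel found" case explicit. This is a sketch only: `find_latest_wheel` is a hypothetical helper, and the wheel filename in the demo is made up.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the workflow): print the most recently modified
# amd_aiter wheel in a directory, or return non-zero when there is none.
find_latest_wheel() {
  local dir="$1" newest="" candidate
  shopt -s nullglob   # an unmatched glob expands to nothing
  for candidate in "$dir"/amd_aiter*.whl; do
    if [[ -z "$newest" || "$candidate" -nt "$newest" ]]; then
      newest="$candidate"
    fi
  done
  shopt -u nullglob
  [[ -n "$newest" ]] || return 1
  printf '%s\n' "$newest"
}

# Demo against a throwaway directory; the filename is illustrative only.
demo_dir=$(mktemp -d)
touch "$demo_dir/amd_aiter-0.0.0-py3-none-any.whl"
find_latest_wheel "$demo_dir"
rm -rf "$demo_dir"
```

A caller could then write `AITER_WHL=$(find_latest_wheel /tmp/aiter-whl) || exit 1` and keep the existing `docker cp` and `pip install` steps unchanged.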