diff --git a/.github/docs/azdo-perfstar-reader-setup.md b/.github/docs/azdo-perfstar-reader-setup.md new file mode 100644 index 00000000000..1aeec463e03 --- /dev/null +++ b/.github/docs/azdo-perfstar-reader-setup.md @@ -0,0 +1,165 @@ +# Syncing PerfStar Performance Data from AzDO to GitHub + +This document describes how the `read-azdo-perfstar.yml` workflow authenticates +to Azure DevOps, downloads PerfStar performance results, and commits them to the +`perf/dashboard` branch — without any PAT or stored credentials. + +## Architecture + +``` +GitHub Actions ──► GitHub OIDC Provider ──► Azure AD (federated credential) ──► AzDO REST API + (JWT id-token) (exchange for bearer token) (Download artifacts) +``` + +1. The workflow requests an OIDC JWT from GitHub's token endpoint +2. The JWT is exchanged with Azure AD via the managed identity's federated credential +3. Azure AD returns a bearer token scoped to Azure DevOps +4. The bearer token calls the AzDO REST API to find builds and download artifacts +5. Artifacts (`CrankAssetsThinnedGOLDWIN`, `CrankAssetsThinnedGOLDLIN`) are + extracted and committed to `perf/dashboard` branch under `data/YYYY-MM-DD/` + +> **Important:** The `azure/login` GitHub Action is **blocked by org policy** +> in the `dotnet` org. The workflow uses **manual OIDC token exchange via `curl`** +> instead — no third-party action dependencies. + +## Data Layout + +``` +perf/dashboard branch +└── data/ + ├── 2026-05-11/ + │ ├── GOLDWIN/ + │ │ ├── net8-console-app-rebuild-dotnet.json + │ │ └── ... + │ └── GOLDLIN/ + │ ├── net8-console-app-rebuild-dotnet.json + │ └── ... + ├── 2026-05-12/ + │ ├── GOLDWIN/ + │ └── GOLDLIN/ + └── ... 
+``` + +## Components + +| Component | Value | +|-----------|-------| +| Managed Identity | `msbuild-azdo-reader` | +| Client ID | Stored in `AZDO_READER_CLIENT_ID` secret | +| Tenant | Microsoft (`72f988bf-86f1-41af-91ab-2d7cd011db47`) | +| Subscription | `CodeTestingAgentDev` (`bb947664-5d18-4aaa-8bbe-40dde6075462`) | +| Resource Group | `CodeTestingAgent` | +| AzDO Org/Project | `DevDiv` / `DevDiv` | +| Target Pipeline | 25429 (PerfStar-Scheduled) | +| Artifacts | `CrankAssetsThinnedGOLDWIN`, `CrankAssetsThinnedGOLDLIN` | +| Access Level | Read-only (View builds) | + +## Setup Steps (Already Completed) + +### 1. Create the Managed Identity + +```bash +az account set --subscription "CodeTestingAgentDev" +az identity create --name "msbuild-azdo-reader" --resource-group "CodeTestingAgent" --location "eastus" +``` + +### 2. Add OIDC Federated Credentials + +These allow GitHub Actions in `dotnet/msbuild` to authenticate as the identity: + +```bash +# Main branch +az identity federated-credential create \ + --name github-dotnet-msbuild-main \ + --identity-name "msbuild-azdo-reader" \ + --resource-group "CodeTestingAgent" \ + --issuer "https://token.actions.githubusercontent.com" \ + --subject "repo:dotnet/msbuild:ref:refs/heads/main" \ + --audiences "api://AzureADTokenExchange" + +# Pull requests +az identity federated-credential create \ + --name github-dotnet-msbuild-pr \ + --identity-name "msbuild-azdo-reader" \ + --resource-group "CodeTestingAgent" \ + --issuer "https://token.actions.githubusercontent.com" \ + --subject "repo:dotnet/msbuild:pull_request" \ + --audiences "api://AzureADTokenExchange" +``` + +> **Subject claim is case-sensitive.** The repo name in the subject must match +> exactly (e.g. `dotnet/msbuild`). + +> **Microsoft tenant restriction:** Only repos in GitHub Enterprise orgs +> (`dotnet`, `microsoft`, etc.) work — personal forks fail with `AADSTS7002381`. + +### 3. 
Register in AzDO + +File a Service Ticket in DevDiv (Area: `DevDiv\VSEng\DDBuild\Operations`, +type: "AzDO Administration Request") to add the MI to the DevDiv org with +read access to pipelines 25429 and 25430. + +### 4. GitHub Secrets + +| Secret | Value | +|--------|-------| +| `AZDO_READER_CLIENT_ID` | Managed identity Client ID | +| `AZDO_READER_TENANT_ID` | `72f988bf-86f1-41af-91ab-2d7cd011db47` | +| `AZDO_READER_SUBSCRIPTION_ID` | `bb947664-5d18-4aaa-8bbe-40dde6075462` | + +## How the Token Flow Works + +``` +1. Workflow declares `permissions: { id-token: write }` at workflow level +2. Step "Get OIDC Token" requests a JWT from GitHub's token endpoint + ($ACTIONS_ID_TOKEN_REQUEST_URL, audience: api://AzureADTokenExchange) +3. Step "Exchange for AzDO Token" POSTs the JWT to Azure AD as a client_assertion + (grant_type=client_credentials, scope: 499b84ac-.../.default) +4. Azure AD validates the JWT against the federated credential, returns a bearer token +5. Subsequent steps call AzDO REST APIs with the bearer token +``` + +## Usage + +The workflow runs automatically on a daily schedule (6 pm UTC) and can also be +triggered manually from the Actions tab. + +### Scheduled runs + +Every day at 6 pm UTC the workflow: + +1. Looks at the `perf/dashboard` branch to find the latest `data/YYYY-MM-DD` folder +2. Processes the **next** day (latest + 1) +3. Finds the latest **scheduled** AzDO build for that date on pipeline 25429 +4. Downloads `CrankAssetsThinnedGOLDWIN` and `CrankAssetsThinnedGOLDLIN` (top-level `.json` only) +5. Commits to `perf/dashboard` under `data/YYYY-MM-DD/GOLDWIN/` and `GOLDLIN/` +6. If the processed date is not today, dispatches itself for the next date + +### Manual dispatch (`workflow_dispatch`) + +- **start_date** *(optional)*: Date to process (`YYYY-MM-DD`). Omit to auto-detect. +- **end_date** *(optional)*: Stop processing after this date. Defaults to today. 
+ +When no `start_date` is given the workflow behaves identically to a scheduled run. + +### Build selection + +When multiple AzDO builds exist for a single day, the workflow prefers the latest +**scheduled** run. If no scheduled run is found it falls back to the latest run +of any trigger type. + +## Troubleshooting + +| Symptom | Cause | Fix | +|---------|-------|-----| +| `startup_failure` (no logs) | Third-party Action blocked by org policy | Use manual `curl` OIDC exchange, not `azure/login` | +| `AADSTS70021: No matching federated identity record` | Subject claim mismatch | Check exact casing and event type in federated credential | +| `AADSTS7002381: enterprise claim` | Personal fork | Only enterprise org repos work with Microsoft tenant | +| `403` from AzDO API | MI not added to DevDiv org, or no "View builds" permission | File DDBuild Operations ticket | +| `Failed to get OIDC token` | Missing `id-token: write` permission | Ensure permissions block is present | + +## References + +- [Use service principals and managed identities in Azure DevOps](https://learn.microsoft.com/en-us/azure/devops/integrate/get-started/authentication/service-principal-managed-identity) +- [GitHub OIDC docs](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect) +- [AzDO Pipelines REST API](https://learn.microsoft.com/en-us/rest/api/azure/devops/pipelines/runs) diff --git a/.github/workflows/read-azdo-perfstar.yml b/.github/workflows/read-azdo-perfstar.yml new file mode 100644 index 00000000000..d2dd0f9cb3d --- /dev/null +++ b/.github/workflows/read-azdo-perfstar.yml @@ -0,0 +1,318 @@ +name: Sync PerfStar Data + +on: + schedule: + - cron: '0 18 * * *' # Daily at 6 pm UTC + workflow_dispatch: + inputs: + start_date: + description: 'Start date (YYYY-MM-DD). Defaults to day after latest stored data.' + required: false + default: '' + end_date: + description: 'End date (YYYY-MM-DD). Defaults to today.' 
+ required: false + default: '' + pull_request: # Temporary — remove after testing + +permissions: + id-token: write + contents: write + actions: write + +env: + AZDO_ORG: DevDiv + AZDO_PROJECT: DevDiv + PIPELINE_ID: '25429' + DATA_BRANCH: perf/dashboard + +jobs: + sync-perfstar-data: + runs-on: ubuntu-latest + steps: + # ── 1. Determine which date to process ──────────────────────────── + - name: Determine processing date + id: date + env: + INPUT_START_DATE: ${{ inputs.start_date || '' }} + INPUT_END_DATE: ${{ inputs.end_date || '' }} + GH_TOKEN: ${{ github.token }} + run: | + set -euo pipefail + TODAY=$(date -u +%Y-%m-%d) + END_DATE="${INPUT_END_DATE:-$TODAY}" + + if [ -n "$INPUT_START_DATE" ]; then + TARGET_DATE="$INPUT_START_DATE" + echo "Using provided start date: $TARGET_DATE" + else + # Find the latest YYYY-MM-DD folder on the data branch + LATEST=$(gh api "repos/${{ github.repository }}/contents/data?ref=${{ env.DATA_BRANCH }}" \ + --jq '[.[].name | select(test("^[0-9]{4}-[0-9]{2}-[0-9]{2}$"))] | sort | last // empty' \ + 2>/dev/null || echo "") + + if [ -n "$LATEST" ]; then + TARGET_DATE=$(date -u -d "$LATEST + 1 day" +%Y-%m-%d) + echo "Latest stored data: $LATEST → processing next day: $TARGET_DATE" + else + TARGET_DATE=$(date -u -d "yesterday" +%Y-%m-%d) + echo "No existing data found → starting from: $TARGET_DATE" + fi + fi + + echo "target_date=${TARGET_DATE}" >> "$GITHUB_OUTPUT" + echo "end_date=${END_DATE}" >> "$GITHUB_OUTPUT" + echo "today=${TODAY}" >> "$GITHUB_OUTPUT" + + if [[ "$TARGET_DATE" > "$TODAY" ]]; then + echo "skip=true" >> "$GITHUB_OUTPUT" + echo "::notice::Target date $TARGET_DATE is in the future — nothing to do" + elif [[ "$TARGET_DATE" > "$END_DATE" ]]; then + echo "skip=true" >> "$GITHUB_OUTPUT" + echo "::notice::Target date $TARGET_DATE is past end date $END_DATE — nothing to do" + else + echo "skip=false" >> "$GITHUB_OUTPUT" + fi + + # ── 2. 
Skip early if data already stored ────────────────────────── + - name: Check if data already exists + if: steps.date.outputs.skip != 'true' + id: check + env: + GH_TOKEN: ${{ github.token }} + TARGET_DATE: ${{ steps.date.outputs.target_date }} + run: | + if gh api "repos/${{ github.repository }}/contents/data/${TARGET_DATE}?ref=${{ env.DATA_BRANCH }}" \ + > /dev/null 2>&1; then + echo "exists=true" >> "$GITHUB_OUTPUT" + echo "::notice::Data for ${TARGET_DATE} already exists on ${{ env.DATA_BRANCH }}" + else + echo "exists=false" >> "$GITHUB_OUTPUT" + fi + + # ── 3. Authenticate to AzDO via OIDC ────────────────────────────── + - name: Get OIDC Token + if: steps.date.outputs.skip != 'true' && steps.check.outputs.exists != 'true' + id: oidc + run: | + OIDC_TOKEN=$(curl -s -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \ + "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=api://AzureADTokenExchange" \ + | jq -r '.value') + if [ -z "$OIDC_TOKEN" ] || [ "$OIDC_TOKEN" = "null" ]; then + echo "::error::Failed to get OIDC token" + exit 1 + fi + echo "::add-mask::${OIDC_TOKEN}" + echo "oidc_token=${OIDC_TOKEN}" >> "$GITHUB_OUTPUT" + + - name: Exchange for AzDO Token + if: steps.date.outputs.skip != 'true' && steps.check.outputs.exists != 'true' + id: token + env: + OIDC_TOKEN: ${{ steps.oidc.outputs.oidc_token }} + run: | + AZURE_RESPONSE=$(curl -s -X POST \ + "https://login.microsoftonline.com/${{ secrets.AZDO_READER_TENANT_ID }}/oauth2/v2.0/token" \ + -d "grant_type=client_credentials" \ + -d "client_id=${{ secrets.AZDO_READER_CLIENT_ID }}" \ + -d "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \ + -d "client_assertion=${OIDC_TOKEN}" \ + -d "scope=499b84ac-1321-427f-aa17-267ca6975798/.default") + + AZDO_TOKEN=$(echo "$AZURE_RESPONSE" | jq -r '.access_token') + if [ -z "$AZDO_TOKEN" ] || [ "$AZDO_TOKEN" = "null" ]; then + echo "::error::Failed to get Azure AD token" + echo "$AZURE_RESPONSE" | jq '{error, error_description, error_codes}' 
2>/dev/null || true + exit 1 + fi + echo "::add-mask::${AZDO_TOKEN}" + echo "azdo_token=${AZDO_TOKEN}" >> "$GITHUB_OUTPUT" + + # ── 4. Find the right AzDO build for the target date ────────────── + - name: Find AzDO run for target date + if: steps.date.outputs.skip != 'true' && steps.check.outputs.exists != 'true' + id: find_run + env: + AZDO_TOKEN: ${{ steps.token.outputs.azdo_token }} + TARGET_DATE: ${{ steps.date.outputs.target_date }} + run: | + BASE_URL="https://dev.azure.com/${AZDO_ORG}/${AZDO_PROJECT}/_apis" + + # Use the Pipelines API (known to work from run #4) + # Request more runs to cover several days of history + RUNS=$(curl -sS -H "Authorization: Bearer ${AZDO_TOKEN}" \ + "${BASE_URL}/pipelines/${PIPELINE_ID}/runs?api-version=7.1&\$top=100" || true) + + RUN_COUNT=$(echo "$RUNS" | jq '.count // 0' 2>/dev/null || echo 0) + echo "Total pipeline runs returned: $RUN_COUNT" + + # Filter for completed runs whose createdDate falls on TARGET_DATE. + # PerfStar runs often report result=failed even when artifacts are produced, + # so we accept any completed run and let the artifact download step fail gracefully. + MATCHING=$(echo "$RUNS" | jq --arg d "$TARGET_DATE" \ + '[.value[] | select(.state == "completed") | select(.createdDate | startswith($d))]' 2>/dev/null || echo "[]") + MATCH_COUNT=$(echo "$MATCHING" | jq 'length' 2>/dev/null || echo 0) + echo "Completed runs on $TARGET_DATE: $MATCH_COUNT" + + if [ "$MATCH_COUNT" -gt 0 ]; then + # When multiple runs exist, prefer scheduled runs over manual/CI triggers. + # The Pipelines API doesn't expose 'reason', so query the Build API. 
+ IDS=$(echo "$MATCHING" | jq -r '[.[].id | tostring] | join(",")') + BUILD_DETAILS=$(curl -sS -H "Authorization: Bearer ${AZDO_TOKEN}" \ + "${BASE_URL}/build/builds?buildIds=${IDS}&api-version=7.1" 2>/dev/null || echo '{}') + + SCHEDULED_ID=$(echo "$BUILD_DETAILS" | jq -r \ + '[.value[] | select(.reason == "schedule")] | sort_by(.id) | last | .id // empty' 2>/dev/null || echo '') + + if [ -n "$SCHEDULED_ID" ]; then + BEST_RUN="$SCHEDULED_ID" + BEST_RESULT=$(echo "$MATCHING" | jq -r --argjson id "$SCHEDULED_ID" '.[] | select(.id == $id) | .result // "unknown"') + echo "Preferred scheduled run: $BEST_RUN (result=$BEST_RESULT)" + else + BEST_RUN=$(echo "$MATCHING" | jq -r 'first | .id // empty') + BEST_RESULT=$(echo "$MATCHING" | jq -r 'first | .result // "unknown"') + echo "No scheduled run found, using latest: $BEST_RUN (result=$BEST_RESULT)" + fi + echo "Selected run: $BEST_RUN (result=$BEST_RESULT)" + echo "run_id=${BEST_RUN}" >> "$GITHUB_OUTPUT" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "**Selected run: $BEST_RUN** (result=$BEST_RESULT)" >> "$GITHUB_STEP_SUMMARY" + else + echo "::warning::No completed builds found for $TARGET_DATE on pipeline ${PIPELINE_ID}" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo "**No matching build found for $TARGET_DATE**" >> "$GITHUB_STEP_SUMMARY" + echo "run_id=" >> "$GITHUB_OUTPUT" + fi + exit 0 + + # ── 5. 
Download & extract artifacts ──────────────────────────────── + - name: Download and extract artifacts + if: steps.find_run.outputs.run_id != '' + id: download + env: + AZDO_TOKEN: ${{ steps.token.outputs.azdo_token }} + BUILD_ID: ${{ steps.find_run.outputs.run_id }} + TARGET_DATE: ${{ steps.date.outputs.target_date }} + run: | + set -euo pipefail + BASE_URL="https://dev.azure.com/${AZDO_ORG}/${AZDO_PROJECT}/_apis" + + ARTIFACT_COUNT=0 + for SUFFIX in GOLDWIN GOLDLIN; do + ARTIFACT="CrankAssetsThinned${SUFFIX}" + echo "::group::Downloading ${ARTIFACT}" + + ARTIFACT_INFO=$(curl -sS -H "Authorization: Bearer ${AZDO_TOKEN}" \ + "${BASE_URL}/build/builds/${BUILD_ID}/artifacts?artifactName=${ARTIFACT}&api-version=7.1") + + DOWNLOAD_URL=$(echo "$ARTIFACT_INFO" | jq -r '.resource.downloadUrl') + if [ -z "$DOWNLOAD_URL" ] || [ "$DOWNLOAD_URL" = "null" ]; then + echo "::warning::Artifact ${ARTIFACT} not found in build ${BUILD_ID} — skipping" + echo "::endgroup::" + continue + fi + + curl -sS -L -H "Authorization: Bearer ${AZDO_TOKEN}" \ + "$DOWNLOAD_URL" -o "${ARTIFACT}.zip" + + mkdir -p "output/data/${TARGET_DATE}/${SUFFIX}" + + # Extract only top-level .json files — skip _manifest/ and subdirs + unzip -j -o "${ARTIFACT}.zip" "${ARTIFACT}/*.json" \ + -d "output/data/${TARGET_DATE}/${SUFFIX}/" + + rm -f "${ARTIFACT}.zip" + + COUNT=$(find "output/data/${TARGET_DATE}/${SUFFIX}" -maxdepth 1 -name '*.json' | wc -l) + echo "Extracted ${COUNT} JSON files for ${SUFFIX}" + ARTIFACT_COUNT=$((ARTIFACT_COUNT + 1)) + echo "::endgroup::" + done + + if [ "$ARTIFACT_COUNT" -eq 0 ]; then + echo "::error::No artifacts found in build ${BUILD_ID} — nothing to store" + exit 1 + fi + + echo "downloaded=true" >> "$GITHUB_OUTPUT" + + # Step summary + echo "## PerfStar data for ${TARGET_DATE}" >> "$GITHUB_STEP_SUMMARY" + echo "AzDO build: [${BUILD_ID}](https://dev.azure.com/${AZDO_ORG}/${AZDO_PROJECT}/_build/results?buildId=${BUILD_ID})" >> "$GITHUB_STEP_SUMMARY" + echo "" >> 
"$GITHUB_STEP_SUMMARY" + for SUFFIX in GOLDWIN GOLDLIN; do + echo "<details><summary>${SUFFIX}</summary>" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + echo '```' >> "$GITHUB_STEP_SUMMARY" + ls "output/data/${TARGET_DATE}/${SUFFIX}/" >> "$GITHUB_STEP_SUMMARY" + echo '```' >> "$GITHUB_STEP_SUMMARY" + echo "</details>" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + done + + # ── 6. Commit to the perf/dashboard branch ──────────────────────── + - name: Push to perf/dashboard branch + if: steps.download.outputs.downloaded == 'true' + env: + TARGET_DATE: ${{ steps.date.outputs.target_date }} + GH_TOKEN: ${{ github.token }} + run: | + set -euo pipefail + + git clone --depth 1 --branch "${{ env.DATA_BRANCH }}" \ + "https://x-access-token:${GH_TOKEN}@github.com/${{ github.repository }}.git" \ + perf-branch + + cp -r "output/data/${TARGET_DATE}" "perf-branch/data/${TARGET_DATE}" + + cd perf-branch + git config user.name "github-actions[bot]" + git config user.email "github-actions[bot]@users.noreply.github.com" + git add "data/${TARGET_DATE}" + git commit -m "perf: add ${TARGET_DATE} performance data" + git push + + # ── 7. Chain to the next date if needed ──────────────────────────── + - name: Trigger next date + if: steps.date.outputs.skip != 'true' && steps.check.outputs.exists != 'true' + env: + TARGET_DATE: ${{ steps.date.outputs.target_date }} + END_DATE: ${{ steps.date.outputs.end_date }} + TODAY: ${{ steps.date.outputs.today }} + DOWNLOADED: ${{ steps.download.outputs.downloaded }} + GH_TOKEN: ${{ github.token }} + run: | + # If nothing was downloaded (no builds/artifacts), don't chain + if [ "$DOWNLOADED" != "true" ]; then + echo "No data downloaded for $TARGET_DATE — not chaining" + exit 0 + fi + + # Only chain when we haven't reached today yet + if [[ ! 
"$TARGET_DATE" < "$TODAY" ]]; then + echo "Reached today ($TODAY) — stopping chain" + exit 0 + fi + + NEXT_DATE=$(date -u -d "$TARGET_DATE + 1 day" +%Y-%m-%d) + + if [[ "$NEXT_DATE" > "$TODAY" ]]; then + echo "Next date $NEXT_DATE is in the future — stopping" + exit 0 + fi + + if [[ "$NEXT_DATE" > "$END_DATE" ]]; then + echo "Next date $NEXT_DATE is past end date $END_DATE — stopping" + exit 0 + fi + + echo "Dispatching next run: start_date=$NEXT_DATE end_date=$END_DATE" + + REF="${{ github.event_name == 'pull_request' && github.head_ref || github.ref_name }}" + + if gh workflow run read-azdo-perfstar.yml --ref "$REF" -f "start_date=$NEXT_DATE" -f "end_date=$END_DATE"; then + echo "Dispatched successfully" + else + echo "::warning::Could not dispatch next run (exit $?) — may need manual re-trigger" + fi + exit 0
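
Note on step 7 ("Trigger next date"): the chaining checks compare `YYYY-MM-DD` strings directly, which works because ISO dates sort lexicographically. The sketch below pulls that decision out so it can be tried locally. It assumes GNU `date` (as on `ubuntu-latest`), and the function names `next_date` and `chain_target` are hypothetical; they do not appear in the workflow.

```bash
#!/usr/bin/env bash
# Sketch of the step 7 chaining decision, runnable outside GitHub Actions.
# Assumes GNU date; next_date and chain_target are illustrative names only.
set -euo pipefail

# Print the day after the given YYYY-MM-DD date.
next_date() {
  date -u -d "$1 + 1 day" +%Y-%m-%d
}

# Print the date the next dispatch would use, or nothing if the chain stops.
# Arguments: target date, end date, today (all YYYY-MM-DD).
chain_target() {
  local target="$1" end="$2" today="$3" next
  # ISO dates sort lexicographically, so string comparison orders them.
  if [[ ! "$target" < "$today" ]]; then
    return 0   # reached today: stop
  fi
  next=$(next_date "$target")
  if [[ "$next" > "$today" || "$next" > "$end" ]]; then
    return 0   # next date is in the future or past the end date: stop
  fi
  echo "$next"
}

chain_target 2026-05-11 2026-05-13 2026-05-14   # prints 2026-05-12
chain_target 2026-05-13 2026-05-13 2026-05-14   # prints nothing (past end date)
```

In this sketch the "reached today" check and the two later checks are kept separate only to mirror the workflow's structure; once `target < today` holds, `next` can never exceed `today`, so the end-date comparison is the one that decides whether a backfill stops early.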