Bring dotnet/.github workflow templates into alignment with issue-labeler v2.1.0 #130

@jeffhandley

Bring dotnet/.github workflow templates into alignment with dotnet/issue-labeler v2.1.0

Scope

This issue tracks the full set of changes required in dotnet/.github/workflow-templates to align the publicly published Issue Labeler workflow templates with the v2.1.0 release of dotnet/issue-labeler (release SHA 98f1d9b85686147ffdc155547cdd3469546c4e26, predictor image digest sha256:79b5359ae5c85b90d9a98dcf3864163ae0a248f6d60687b7e81c8a189b8219a6).

The companion update to this wiki landed in dotnet/issue-labeler.wiki (commits b850f8d "wiki: align with v2.1.0" and cb4862e "wiki: update v2.1.0 SHAs and clarify release-body / tag / branch HEAD relationship") and provides an end-to-end reference walkthrough that adopters can compare against the published templates.

No template PR is opened by this issue — this is a planning/tracking issue describing the change set so maintainers (or a follow-up effort) can execute the refresh as a coordinated unit.

Why now

  • v2.0 → v2.1.0 introduces GitHub Discussions support end-to-end (download, train, test, restore, predict, promote).
  • v2.1.0 fixes a latent string-coercion bug in ALLOW_FAILURE handling that silently broke manual bulk-dispatch runs on v2.0. The same bug is currently present in the published dotnet/.github predict templates.
  • v2.1.0 introduces per-type prediction thresholds, per-type max_labels, a dry_run input, a 10-row training-data minimum with friendly skip behavior, a friendlier fork guard, and a discussions: write/issues: write permission pairing for the (new) discussion predict workflow.

Templates affected

Five existing templates plus one new template (and their .properties.json siblings):

  • workflow-templates/labeler-train.yml
  • workflow-templates/labeler-predict-issues.yml
  • workflow-templates/labeler-predict-pulls.yml
  • workflow-templates/labeler-promote.yml
  • workflow-templates/labeler-cache-retention.yml
  • NEW workflow-templates/labeler-predict-discussions.yml + .properties.json

(1) Mechanical SHA repin

Replace every uses: reference from the current v2.0.0 SHA to the v2.1.0 SHA:

- uses: dotnet/issue-labeler/<action>@46125e85e6a568dc712f358c39f35317366f5eed # v2.0.0
+ uses: dotnet/issue-labeler/<action>@98f1d9b85686147ffdc155547cdd3469546c4e26 # v2.1.0

Important

The v2.1.0 release SHA (98f1d9b85686147ffdc155547cdd3469546c4e26) is what consumers must pin to. This is what the release body advertises and what the v2.1.0-release tag points to. The v2.1.0 branch HEAD may advance past this SHA with post-release maintenance commits (currently at dedab068d5b2fcc6f85fee4ea9e9be79c3f12449) — do not pin to the branch HEAD. See the Release Process wiki page for the full relationship between these SHAs.

(2) CRITICAL bug fix — ALLOW_FAILURE polarity & fromJSON() wrap

The v2.0 predict templates carry this pattern:

env:
  ALLOW_FAILURE: ${{ github.event_name == 'workflow_dispatch' }}
# ...
      - name: "Restore … model from cache"
        with:
          fail-on-cache-miss: ${{ env.ALLOW_FAILURE }}
      - name: "Predict …"
        # ...
        continue-on-error: ${{ !env.ALLOW_FAILURE }}

This is broken. In GitHub Actions expression syntax, every non-empty string is truthy — including the strings "true" and "false" themselves. !env.ALLOW_FAILURE therefore evaluates to false unconditionally, regardless of which branch produced the value. Likewise fail-on-cache-miss: ${{ env.ALLOW_FAILURE }} is always truthy.

The symptom: manual workflow_dispatch bulk-labeling runs (where the intent is to fail loudly on errors) instead silently continue, and the cache-miss guard is on when it should be off. This is the opposite of the design intent.

v2.1.0 fixes this with two coupled changes that must be applied together — either change alone is still wrong:

 env:
-  ALLOW_FAILURE: ${{ github.event_name == 'workflow_dispatch' }}
+  ALLOW_FAILURE: ${{ github.event_name != 'workflow_dispatch' }}
# ...
       - name: "Restore … model from cache"
         with:
-          fail-on-cache-miss: ${{ env.ALLOW_FAILURE }}
+          fail-on-cache-miss: ${{ !fromJSON(env.ALLOW_FAILURE) }}
       - name: "Predict …"
         # ...
-        continue-on-error: ${{ !env.ALLOW_FAILURE }}
+        continue-on-error: ${{ fromJSON(env.ALLOW_FAILURE) }}

fromJSON("true") → true; fromJSON("false") → false. With the wrapper, the boolean expressions evaluate as intended; without it, they collapse to a string-truthiness check.
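To observe the coercion directly, a throwaway diagnostic workflow can print each form of the expression. This is a minimal sketch and not part of the published templates: the workflow name, the job name, and the four step-level variable names are hypothetical.

```yaml
# Hypothetical scratch workflow for observing the coercion; not a template.
name: "Debug: ALLOW_FAILURE coercion"

on:
  workflow_dispatch:

permissions: {}

env:
  # Workflow-level env values reach later expressions as strings.
  ALLOW_FAILURE: ${{ github.event_name != 'workflow_dispatch' }}

jobs:
  show-coercion:
    runs-on: ubuntu-latest
    steps:
      - name: "Print each form of the expression"
        env:
          RAW: ${{ env.ALLOW_FAILURE }}
          NEGATED_STRING: ${{ !env.ALLOW_FAILURE }}            # negates a non-empty string
          PARSED: ${{ fromJSON(env.ALLOW_FAILURE) }}           # parses the string to a boolean first
          NEGATED_PARSED: ${{ !fromJSON(env.ALLOW_FAILURE) }}  # negates the parsed boolean
        run: |
          echo "raw            = $RAW"
          echo "negated string = $NEGATED_STRING"
          echo "parsed         = $PARSED"
          echo "negated parsed = $NEGATED_PARSED"
```

On a manual dispatch ALLOW_FAILURE is the string false, so if the analysis above holds, the "negated string" and "negated parsed" lines should print different values.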

Warning

Existing v2.0 consumers (notably dotnet/runtime) mirror this broken pattern in their checked-in workflow YAML. Refreshing the published templates will fix the bug for new adopters; existing adopters need a separate heads-up so they can apply the same two-change fix to their copies. See the wiki Onboarding callout for the explanatory text adopters can reference.

Apply this fix in all predict templates: labeler-predict-issues.yml, labeler-predict-pulls.yml, and the new labeler-predict-discussions.yml.

(3) Capability changes to surface in the templates

Per-type prediction thresholds (read from repo vars)

Both labeler-train.yml and the predict templates should read per-type thresholds with a sensible fallback chain:

env:
  THRESHOLD: ${{ vars.ISSUE_LABELER_PREDICTION_THRESHOLD_<TYPE> || vars.ISSUE_LABELER_PREDICTION_THRESHOLD || '0.15' }}

Variables (all optional, all repository-level):

| Variable | Read by | Fallback |
| --- | --- | --- |
| ISSUE_LABELER_PREDICTION_THRESHOLD | train + predict | 0.15 |
| ISSUE_LABELER_PREDICTION_THRESHOLD_ISSUES | train + predict (issues) | ISSUE_LABELER_PREDICTION_THRESHOLD → 0.15 |
| ISSUE_LABELER_PREDICTION_THRESHOLD_PULLS | train + predict (pulls) | ISSUE_LABELER_PREDICTION_THRESHOLD → 0.15 |
| ISSUE_LABELER_PREDICTION_THRESHOLD_DISCUSSIONS | train + predict (discussions) | ISSUE_LABELER_PREDICTION_THRESHOLD → 0.15 |
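
Because labeler-train.yml covers all three types in one workflow, the chain would need to be spelled out once per type there. The fragment below is one possible layout under that assumption; the env names THRESHOLD_ISSUES, THRESHOLD_PULLS, and THRESHOLD_DISCUSSIONS are hypothetical and may not match the v2.1.0 reference workflow.

```yaml
# Fragment for the top of a workflow file; the env names are illustrative only.
env:
  # An unset repository variable is falsy, so || falls through to the next candidate.
  THRESHOLD_ISSUES: ${{ vars.ISSUE_LABELER_PREDICTION_THRESHOLD_ISSUES || vars.ISSUE_LABELER_PREDICTION_THRESHOLD || '0.15' }}
  THRESHOLD_PULLS: ${{ vars.ISSUE_LABELER_PREDICTION_THRESHOLD_PULLS || vars.ISSUE_LABELER_PREDICTION_THRESHOLD || '0.15' }}
  THRESHOLD_DISCUSSIONS: ${{ vars.ISSUE_LABELER_PREDICTION_THRESHOLD_DISCUSSIONS || vars.ISSUE_LABELER_PREDICTION_THRESHOLD || '0.15' }}
```

A repository that sets only ISSUE_LABELER_PREDICTION_THRESHOLD gets that value for all three types; one that sets nothing gets 0.15 everywhere.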

Per-type max_labels (read from repo vars)

env:
  MAX_LABELS: ${{ vars.ISSUE_LABELER_MAX_LABELS_<TYPE> || vars.ISSUE_LABELER_MAX_LABELS || '1' }}

The same fallback pattern applies, with per-type vars ISSUE_LABELER_MAX_LABELS_ISSUES, ISSUE_LABELER_MAX_LABELS_PULLS, and ISSUE_LABELER_MAX_LABELS_DISCUSSIONS. The valid range is 1–10.

Training type dispatch options

labeler-train.yml should change:

       type:
-        description: "Issues or Pull Requests"
+        description: "Issues, Discussions, or Pull Requests"
         type: choice
         required: true
-        default: "Both"
+        default: "All"
         options:
-          - "Both"
+          - "All"
           - "Issues"
+          - "Discussions"
           - "Pull Requests"

…and every contains(fromJSON('["Both", …]'), …) predicate switches to '["All", …]', with new download-discussions, check-discussions-data, train-discussions, test-discussions jobs mirroring the existing issues/pulls jobs.
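
As a sketch of how the updated predicate might gate one of the new jobs, the fragment below shows a download-discussions job that runs for either All or Discussions. The array contents for this job, the use of inputs.type, and the placeholder step are assumptions; the real job body should mirror the existing download-issues job.

```yaml
# Fragment of labeler-train.yml; the step body is a placeholder.
jobs:
  download-discussions:
    # Runs when the dispatcher picked "All" or "Discussions".
    if: ${{ contains(fromJSON('["All", "Discussions"]'), inputs.type) }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Mirror the download-issues steps here, using the discussions type."
```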

10-row training-data minimum

Each check-<type>-data job restores the cached TSV, counts lines, and emits has_training_data=true|false. Train and test jobs depend on the corresponding check and gate on needs.check-<type>-data.outputs.has_training_data == 'true'. When the data is too small, the run notes the skip in the job summary instead of failing. (See the v2.1.0 reference workflow for the canonical implementation.)
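
The check-and-gate pattern described above could look roughly like the following. This is a sketch under stated assumptions, not the canonical implementation: the data file path, the one-header-line convention, and the placeholder training step are all hypothetical, and the cache-restore step is omitted.

```yaml
# Fragment of labeler-train.yml. The data path and the header-row assumption
# are illustrative; see the v2.1.0 reference workflow for the real steps.
jobs:
  check-issues-data:
    runs-on: ubuntu-latest
    outputs:
      has_training_data: ${{ steps.count.outputs.has_training_data }}
    steps:
      # A cache-restore step for the TSV would go here (omitted in this sketch).
      - name: "Count training rows"
        id: count
        env:
          DATA_FILE: labeler-cache/issues-data.tsv  # hypothetical path
          MIN_ROWS: "10"
        run: |
          rows=0
          if [ -f "$DATA_FILE" ]; then
            # Assumes one header line, then one row per item.
            rows=$(tail -n +2 "$DATA_FILE" | wc -l)
          fi
          if [ "$rows" -ge "$MIN_ROWS" ]; then
            echo "has_training_data=true" >> "$GITHUB_OUTPUT"
          else
            echo "has_training_data=false" >> "$GITHUB_OUTPUT"
            echo "Skipped training: $rows data rows found, $MIN_ROWS required." >> "$GITHUB_STEP_SUMMARY"
          fi

  train-issues:
    needs: check-issues-data
    # Job outputs are strings, so compare against 'true' rather than a boolean.
    if: ${{ needs.check-issues-data.outputs.has_training_data == 'true' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Training steps go here."
```

When the file is missing or too small, the check job still succeeds, the train job is skipped, and the reason appears in the job summary.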

Fork guard

-    if: ${{ github.event_name == 'workflow_dispatch' || github.repository_owner == 'dotnet' }}
+    if: ${{ github.event_name == 'workflow_dispatch' || !github.event.repository.fork }}

Apply in all predict workflows and in labeler-cache-retention.yml. The new guard works correctly for repositories outside the dotnet org while still skipping automatic runs on forks.

labeler-predict-pulls.yml permissions

The predict-pulls workflow needs both pull-requests: write and issues: write. GitHub's labels API treats issue and pull request labels uniformly through the same endpoint, and that endpoint requires issues: write even when the target is a pull request.

     permissions:
+      issues: write
       pull-requests: write

Promote workflow — add discussions

       pulls:
         description: "Pulls: Promote Model"
         type: boolean
         required: true
+      discussions:
+        description: "Discussions: Promote Model"
+        type: boolean
+        required: true

Add a promote-discussions job mirroring promote-issues/promote-pulls and passing type: "discussions". The promote action accepts discussions as of v2.1.0.

Also move the actions: write permission from a top-level permissions: block to per-job permissions blocks (the v2.1.0 reference workflow uses per-job permissions everywhere, with a top-level permissions: {}).
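
Taken together, the new job and the per-job permissions layout might look like the fragment below. It is a sketch only: the if: condition and the comment about remaining inputs are assumptions, and type is the only promote input confirmed by this issue.

```yaml
# Fragment of labeler-promote.yml; inputs other than "type" are omitted.
permissions: {}  # nothing granted at the top level

jobs:
  promote-discussions:
    # Assumes the job is gated on the new boolean dispatch input.
    if: ${{ inputs.discussions }}
    runs-on: ubuntu-latest
    permissions:
      actions: write  # granted per job instead of workflow-wide
    steps:
      - uses: dotnet/issue-labeler/promote@98f1d9b85686147ffdc155547cdd3469546c4e26 # v2.1.0
        with:
          type: "discussions"
          # Copy any remaining inputs from the existing promote-issues job.
```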

Cache retention — add discussions to matrix

       matrix:
-        type: ["issues", "pulls"]
+        type: ["issues", "pulls", "discussions"]

Consumers without Discussions can remove "discussions" from their copy.

(4) NEW template — labeler-predict-discussions.yml

Add a new template + properties file modeled on the v2.1.0 reference workflow.

Workflow shape (key elements):

name: "Labeler: Predict (Discussions)"

on:
  discussion:
    types: [created]
  workflow_dispatch:
    inputs:
      discussions: { required: true, description: "…" }
      dry_run:     { required: false, type: boolean, default: false }
      cache_key:   { required: true, default: "ACTIVE" }

permissions: {}

env:
  ALLOW_FAILURE: ${{ github.event_name != 'workflow_dispatch' }}
  LABEL_PREFIX: "area-"
  THRESHOLD: ${{ vars.ISSUE_LABELER_PREDICTION_THRESHOLD_DISCUSSIONS || vars.ISSUE_LABELER_PREDICTION_THRESHOLD || '0.15' }}
  DEFAULT_LABEL: "needs-area-label"
  MAX_LABELS: ${{ vars.ISSUE_LABELER_MAX_LABELS_DISCUSSIONS || vars.ISSUE_LABELER_MAX_LABELS || '1' }}

jobs:
  predict-discussion-label:
    if: ${{ github.event_name == 'workflow_dispatch' || !github.event.repository.fork }}
    runs-on: ubuntu-latest
    permissions:
      discussions: write
      issues: write
    steps:
      - id: restore-model
        uses: dotnet/issue-labeler/restore@98f1d9b85686147ffdc155547cdd3469546c4e26 # v2.1.0
        with:
          type: discussions
          fail-on-cache-miss: ${{ !fromJSON(env.ALLOW_FAILURE) }}
          quiet: true
          cache_key: ${{ github.event.inputs.cache_key || 'ACTIVE' }}
      - if: ${{ steps.restore-model.outputs.cache-hit == 'true' }}
        uses: dotnet/issue-labeler/predict@98f1d9b85686147ffdc155547cdd3469546c4e26 # v2.1.0
        with:
          discussions: ${{ github.event.inputs.discussions || github.event.discussion.number }}
          label_prefix: ${{ env.LABEL_PREFIX }}
          threshold: ${{ env.THRESHOLD }}
          default_label: ${{ env.DEFAULT_LABEL }}
          dry_run: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.dry_run || 'false' }}
          max_labels: ${{ env.MAX_LABELS }}
        env:
          GITHUB_TOKEN: ${{ github.token }}
          INPUT_ISSUES_MODEL: labeler-cache/discussions-model.zip
        continue-on-error: ${{ fromJSON(env.ALLOW_FAILURE) }}

Key requirements:

  • Permissions: discussions: write and issues: write.
  • INPUT_ISSUES_MODEL: labeler-cache/discussions-model.zip on the predict step — discussions reuse the issues pipeline but the model lives at the discussions path; this env override tells the predictor where to find it.
  • .properties.json description should make clear the template is for repos with GitHub Discussions enabled that want automated area labeling on discussion creation.

(5) Governance — resolve circular source-of-truth wording

Today the dotnet/.github workflow templates and this wiki each contain a header comment that frames itself as imported from the other:

  • dotnet/.github/workflow-templates/labeler-*.yml header → "imported from dotnet/issue-labeler/wiki/Onboarding"
  • dotnet/issue-labeler.wiki/Onboarding.md snippet headers (now removed in the wiki update) → previously said "imported and updated from dotnet/issue-labeler/wiki/Onboarding" (a circular reference)

Recommended resolution:

  • dotnet/.github/workflow-templates/labeler-*.yml becomes the canonical published template.
  • The wiki Onboarding page is the walkthrough/example that mirrors the canonical templates.
  • Drop the "imported from" header comments from the published templates entirely (or replace them with a one-line pointer such as # Reference: https://github.com/dotnet/issue-labeler — pinned to v2.1.0).

The wiki has been updated unilaterally on its side; the dotnet/.github side change should land alongside this template refresh.

(6) Capability matrix README

Add a short workflow-templates/README.md (or similar) explaining which v2.1.0 capabilities the templates intentionally do/do not surface, so consumers and the wiki know what is deliberate versus accidental. Suggested coverage:

| v2.1.0 capability | Surfaced in templates? | Notes |
| --- | --- | --- |
| Per-type prediction thresholds via repo vars | ? | Decide whether to expose via env (template) or document as repo-level config |
| Per-type max_labels via repo vars | ? | Same |
| dry_run input on predict workflows | ? | Useful for adopters validating a new model on demand |
| discussions predict template | new | Opt-in; only useful for repos with Discussions enabled |
| 10-row training-data minimum | yes | The check-data jobs are part of the canonical train template |
| !github.event.repository.fork guard | yes | Works for any org, not just dotnet |

Filling this table is part of the work — the rows above just enumerate the decisions that need to be made.

(7) Open question — should the templates expose dry_run and max_labels as with: inputs?

The v2.1.0 predict workflows accept a dry_run boolean dispatch input (default false) and read MAX_LABELS from env. The dotnet/.github templates can either:

  • Option A: Stay narrow — omit the dry_run dispatch input and rely on repo vars for MAX_LABELS. Simpler templates; adopters who want dry_run add it themselves.
  • Option B: Mirror the v2.1.0 reference exactly — expose dry_run on dispatch, surface MAX_LABELS via repo vars in the env block. Richer out-of-the-box experience; matches what the wiki documents.

Recommendation: Option B, to keep the templates and the wiki walkthrough in lockstep. The capability matrix README should document the choice either way.

Checklist for the refresh PR

  • Repin all uses: to 98f1d9b85686147ffdc155547cdd3469546c4e26 # v2.1.0
  • Apply ALLOW_FAILURE polarity flip and fromJSON() wrap in both predict templates (and the new discussions template)
  • Update labeler-train.yml type options to All/Issues/Discussions/Pull Requests with per-type threshold env
  • Add download-discussions, check-*-data, train-discussions, test-discussions jobs to labeler-train.yml
  • Add discussions input + promote-discussions job to labeler-promote.yml
  • Switch fork guard to !github.event.repository.fork in all predict workflows + cache-retention
  • Add issues: write permission to labeler-predict-pulls.yml
  • Add "discussions" to labeler-cache-retention.yml matrix
  • Add labeler-predict-discussions.yml + .properties.json
  • Resolve "imported from" header comments per Governance section
  • Add a capability-matrix README
  • Decide and document the dry_run / max_labels exposure question

This issue was opened in dotnet/issue-labeler rather than dotnet/.github because the source-of-truth content lives here. Maintainers of dotnet/.github should reference this issue when planning the corresponding template-refresh PR in that repository.
