From 9e5dfe00e8732f09962d63c0d5e2e88f1c0d8a84 Mon Sep 17 00:00:00 2001
From: ppippi
Date: Mon, 11 Aug 2025 00:26:45 +0900
Subject: [PATCH 1/6] =?UTF-8?q?=EB=B2=88=EC=97=AD=20workflow=20=EC=9E=90?=
 =?UTF-8?q?=EB=8F=99=ED=99=94?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 _posts/2025-07-03-actions-runner-controller.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/_posts/2025-07-03-actions-runner-controller.md b/_posts/2025-07-03-actions-runner-controller.md
index 6dec20a..d7a918f 100644
--- a/_posts/2025-07-03-actions-runner-controller.md
+++ b/_posts/2025-07-03-actions-runner-controller.md
@@ -285,3 +285,4 @@ docker run -it \
 ARC를 사용하면 GitHub에서 제공하는 Runner를 사용할 때의 비싼 비용 문제와, 직접 VM을 관리하며 Runner를 운영할 때의 비효율성을 모두 해결할 수 있습니다. 특히 GPU가 필요하거나, 복잡한 의존성을 가진 MLOps CI/CD 환경을 구축할 때 ARC는 매우 강력한 도구가 됩니다.
 
 초기 설정 과정이 다소 복잡하게 느껴질 수 있지만, 한번 구축해두면 CI/CD 비용을 크게 절감하고 운영 부담을 덜어주므로 MLOps를 고민하고 있다면 꼭 한번 도입을 검토해보시길 바랍니다.
+

From 98bbcd07e31c7dde5d52eba414d5c60ae856ba0f Mon Sep 17 00:00:00 2001
From: ppippi
Date: Mon, 11 Aug 2025 00:39:13 +0900
Subject: [PATCH 2/6] =?UTF-8?q?=EB=B2=88=EC=97=AD=20workflow=20=EC=9E=90?=
 =?UTF-8?q?=EB=8F=99=ED=99=94?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .github/workflows/translate-to-english.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/translate-to-english.yml b/.github/workflows/translate-to-english.yml
index f6c126d..3910693 100644
--- a/.github/workflows/translate-to-english.yml
+++ b/.github/workflows/translate-to-english.yml
@@ -70,6 +70,6 @@ jobs:
         uses: stefanzweifel/git-auto-commit-action@v5
         with:
           commit_message: "chore: add English translations for PR #${{ github.event.pull_request.number }}"
-          file_pattern: "_posts_en/**/*.md"
+          file_pattern: _posts_en/*.md
 
 

From 655c87f039e984454653e7e7ccd6b4cf90a8e9e1 Mon Sep 17 00:00:00 2001
From: ppippi-dev <61408680+ppippi-dev@users.noreply.github.com>
Date: Sun, 10 Aug 2025 15:41:24 +0000
Subject: [PATCH 3/6] chore: add English translations for PR #3

---
 ...03-setting-up-actions-runner-controller.md | 273 ++++++++++++++++++
 1 file changed, 273 insertions(+)
 create mode 100644 _posts_en/2025-07-03-setting-up-actions-runner-controller.md

diff --git a/_posts_en/2025-07-03-setting-up-actions-runner-controller.md b/_posts_en/2025-07-03-setting-up-actions-runner-controller.md
new file mode 100644
index 0000000..181e409
--- /dev/null
+++ b/_posts_en/2025-07-03-setting-up-actions-runner-controller.md
@@ -0,0 +1,273 @@
+---
+feature-img: assets/img/2025-07-03/0.png
+layout: post
+subtitle: Building an MLOps CI Environment
+tags:
+- MLOps
+- Infra
+title: Setting Up Actions Runner Controller
+---
+
+
+### Intro
+
+As I’ve been enjoying building with AI lately, the importance of a solid test environment has become even clearer.
+
+The most common approach is to build CI with GitHub Actions, but in MLOps you often need high-spec instances for CI.
+
+GitHub Actions does offer [GPU instances (Linux, 4 cores)](https://docs.github.com/ko/billing/managing-billing-for-your-products/about-billing-for-github-actions), but at $0.07 per minute as of now, they’re quite expensive to use.
+
+They’re also limited to NVIDIA T4 GPUs, which can be restrictive as model sizes keep growing.
+
+A good alternative in this situation is a self-hosted runner.
+
+As the name suggests, you configure the runner yourself and execute GitHub workflows on it.
+
+You can set this up via GitHub’s guide: [Add self-hosted runners](https://docs.github.com/ko/actions/how-tos/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners).
+
+However, this approach requires keeping the CI machine always on (online), which can be inefficient if CI/CD jobs are infrequent.
+
+This is where the Actions Runner Controller (ARC) shines as a great alternative.
+
+[Actions Runner Controller](https://github.com/actions/actions-runner-controller) is an open-source controller that lets GitHub Actions runners operate in a Kubernetes environment.
+
+With it, your Kubernetes resources are used for CI only when a GitHub Actions workflow runs.
+
+
+### Installing Actions Runner Controller
+
+ARC installation has two major steps.
+1. Create a GitHub Personal Access Token for communication/authentication with GitHub
+2. Install ARC via Helm and authenticate using the token
+
+#### 1. Create a GitHub Personal Access Token
+
+ARC needs authentication to interact with the GitHub API to register and manage runners. Create a Personal Access Token (PAT).
+
+- Path: Settings > Developer settings > Personal access tokens > Tokens (classic) > Generate new token
+
+When creating the Personal Access Token, select the [appropriate permissions](https://github.com/actions/actions-runner-controller/blob/master/docs/authenticating-to-the-github-api.md#deploying-using-pat-authentication). (For convenience here, grant full permissions.)
+
+> For security, use least privilege and set an expiration.
+
+It’s generally recommended to authenticate via a GitHub App rather than PAT.
+
+Keep the PAT safe—you’ll need it when installing ARC.
+
+#### 2. Install ARC with Helm
+
+Before installing ARC, you need cert-manager. If it’s not already set up in the cluster, install it:
+
+```bash
+kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml
+```
+
+Now install ARC into your Kubernetes cluster using Helm.
+
+Using the Personal Access Token you created earlier, install ARC. Replace YOUR_GITHUB_TOKEN below with your PAT.
+
+```bash
+helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
+
+helm repo update
+
+helm pull actions-runner-controller/actions-runner-controller
+
+tar -zxvf actions-runner-controller-*.tgz
+
+export GITHUB_TOKEN=YOUR_GITHUB_TOKEN
+
+helm upgrade --install actions-runner-controller ./actions-runner-controller \
+  --namespace actions-runner-system \
+  --create-namespace \
+  --set authSecret.create=true \
+  --set authSecret.github_token="${GITHUB_TOKEN}"
+```
+
+After installation, verify that the ARC controller is running:
+
+```bash
+kubectl get pods -n actions-runner-system
+```
+
+You should see the controller manager pod running in the actions-runner-system namespace.
+
+ARC is now ready to communicate with GitHub. Next, define the runners that will actually execute workflows.
+
+### 3. Configure the Runner
+
+The ARC controller is installed, but there are no runners yet. Now we’ll create runner pods according to GitHub Actions jobs.
+
+We’ll use two resources:
+1. RunnerDeployment: Acts as a template for runner pods. It defines which container image to use, which GitHub repo to connect to, labels, etc.
+2. HorizontalRunnerAutoscaler (HRA): Watches the RunnerDeployment and automatically adjusts the number of replicas based on the number of queued jobs on GitHub.
+
+#### Define a RunnerDeployment
+
+First, create a file named runner-deployment.yml as below. Change spec.template.spec.repository to your GitHub repository.
+
+> Besides a repository, you can also target an organization if you have the permissions.
+
+```yaml
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: RunnerDeployment
+metadata:
+  name: example-runner-deployment
+  namespace: actions-runner-system
+spec:
+  replicas: 1
+  template:
+    spec:
+      repository: /
+      labels:
+        - self-hosted
+        - arc-runner
+```
+
+With this in place, you can find your self-hosted runner under your repo’s Actions settings.
+
+
+
+After the deployment completes, you’ll see a new runner with the self-hosted and arc-runner labels under Settings > Actions > Runners in your GitHub repository.
+
+
+#### Define a HorizontalRunnerAutoscaler
+
+Next, define an HRA to autoscale the RunnerDeployment you created above. Create hra.yml.
+
+```yaml
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: HorizontalRunnerAutoscaler
+metadata:
+  name: example-hra
+  namespace: actions-runner-system
+spec:
+  scaleTargetRef:
+    name: example-runner-deployment
+  minReplicas: 0
+  maxReplicas: 5
+```
+
+By setting minReplicas and maxReplicas, you can scale up and down according to your resources.
+
+You can also specify additional metrics to create pods whenever a workflow is triggered. There are several metrics available.
+
+> When using HorizontalRunnerAutoscaler, runners are created only when needed. When the count is zero, you won’t see runners in the GitHub UI.
+
+
+
+```yaml
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: HorizontalRunnerAutoscaler
+metadata:
+  name: example-hra
+  namespace: actions-runner-system
+spec:
+  scaleTargetRef:
+    name: example-runner-deployment
+  minReplicas: 0
+  maxReplicas: 5
+  metrics:
+    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
+      repositoryNames: ["/"]
+
+The above is my preferred metric—it scales up when workflow runs are queued. As shown, choose metrics as needed to get the behavior you want.
+
+
+### 4. Use it in a GitHub Actions workflow
+
+All set! Using the new ARC runner is simple. In your workflow file, put the labels from the RunnerDeployment in the runs-on key.
+
+Add a simple test workflow (test-arc.yml) under .github/workflows/:
+
+```yaml
+name: ARC Runner Test
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  test-job:
+    runs-on: [self-hosted, arc-runner]
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Test
+        run: |
+          echo "Hello from an ARC runner!"
+          echo "This runner is running inside a Kubernetes pod."
+          sleep 10
+```
+
+The key part is runs-on: [self-hosted, arc-runner]. When this workflow runs, GitHub assigns the job to a runner that has both labels. ARC detects the event and, based on the HRA settings, creates a new runner pod if needed to handle the job.
+
+> With self-hosted runners, unlike GitHub-hosted runners, you may need to install some packages within the workflow.
+
+### Troubleshooting notes
+
+I often use Docker for CI/CD, and one recurring issue is DinD (Docker-in-Docker).
+
+With ARC, by default the scheduling runner container and a docker daemon container (docker) run as sidecars.
+
+To handle this, there’s a DinD-enabled runner image.
+
+In the YAML below, set image and dockerdWithinRunnerContainer to run the docker daemon inside the runner; your workflow then executes on that runner.
+
+```yaml
+apiVersion: actions.summerwind.dev/v1alpha1
+kind: RunnerDeployment
+metadata:
+  name: example-runner-deployment
+  namespace: actions-runner-system
+spec:
+  replicas: 1
+  template:
+    spec:
+      repository: /
+      labels:
+        - self-hosted
+        - arc-runner
+      image: "summerwind/actions-runner-dind:latest"
+      dockerdWithinRunnerContainer: true
+```
+
+For Docker tests that need GPUs, if you use the DinD image above on a cluster with NVIDIA Container Toolkit installed, the GPU will be recognized.
+
+In the workflow you want to run, configure as below to confirm GPUs are available even under DinD. (Be sure to check the versions of NVIDIA Container Toolkit and the NVIDIA GPU Driver plugin!)
+
+```bash
+# Check GPU devices
+ls -la /dev/nvidia*
+
+# device library setup
+smi_path=$(find / -name "nvidia-smi" 2>/dev/null | head -n 1)
+lib_path=$(find / -name "libnvidia-ml.so" 2>/dev/null | head -n 1)
+lib_dir=$(dirname "$lib_path")
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(dirname "$lib_path")
+export NVIDIA_VISIBLE_DEVICES=all
+export NVIDIA_DRIVER_CAPABILITIES=compute,utility
+
+# Mount GPU devices and libraries directly without the nvidia runtime
+docker run -it \
+  --device=/dev/nvidia0:/dev/nvidia0 \
+  --device=/dev/nvidiactl:/dev/nvidiactl \
+  --device=/dev/nvidia-uvm:/dev/nvidia-uvm \
+  --device=/dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools \
+  -v "$lib_dir:$lib_dir:ro" \
+  -v "$(dirname $smi_path):$(dirname $smi_path):ro" \
+  -e LD_LIBRARY_PATH="$LD_LIBRARY_PATH" \
+  -e NVIDIA_VISIBLE_DEVICES="$NVIDIA_VISIBLE_DEVICES" \
+  -e NVIDIA_DRIVER_CAPABILITIES="$NVIDIA_DRIVER_CAPABILITIES" \
+  pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime
+```
+
+### Wrapping up
+
+We walked through how to set up Actions Runner Controller in a Kubernetes environment to build a dynamically scalable self-hosted runner setup.
+
+ARC helps avoid the high costs of GitHub-hosted runners and the inefficiencies of managing your own VMs. It’s especially powerful for MLOps CI/CD environments that need GPUs or have complex dependencies.
+
+While the initial setup may feel a bit involved, once it’s in place it can significantly cut CI/CD costs and reduce operational burden. If you’re considering MLOps, it’s definitely worth a look.
\ No newline at end of file

From 2f70e29339b73e6d3793683f2cf2b8ad44419d44 Mon Sep 17 00:00:00 2001
From: ppippi
Date: Mon, 11 Aug 2025 00:56:24 +0900
Subject: [PATCH 4/6] =?UTF-8?q?=EB=B2=88=EC=97=AD=20workflow=20=EC=9E=90?=
 =?UTF-8?q?=EB=8F=99=ED=99=94?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .github/workflows/jekyll-docker.yml        | 11 ++++-------
 .github/workflows/translate-to-english.yml |  9 ++++-----
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/.github/workflows/jekyll-docker.yml b/.github/workflows/jekyll-docker.yml
index c35cdc0..d3d54f1 100644
--- a/.github/workflows/jekyll-docker.yml
+++ b/.github/workflows/jekyll-docker.yml
@@ -1,11 +1,8 @@
 name: Build and Deploy to GitHub Pages
 
 on:
-  push:
-    branches: [ "main" ]
-  workflow_run:
-    workflows: ["Translate new posts to English"]
-    types: [completed]
+  pull_request:
+    types: [closed]
 
 # GitHub Pages에 배포하기 위한 권한 설정
 permissions:
@@ -20,7 +17,7 @@ concurrency:
 
 jobs:
   build:
-    if: ${{ github.event_name == 'push' || (github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success') }}
+    if: ${{ github.event.pull_request.merged == true && github.event.pull_request.base.ref == 'main' }}
     runs-on: ubuntu-latest
 
     steps:
@@ -46,7 +43,7 @@ jobs:
       url: ${{ steps.deployment.outputs.page_url }}
     runs-on: ubuntu-latest
     needs: build
-    if: ${{ github.event_name == 'push' || (github.event_name == 'workflow_run' && github.event.workflow_run.conclusion == 'success') }}
+    if: ${{ github.event.pull_request.merged == true && github.event.pull_request.base.ref == 'main' }}
 
     steps:
       - name: Deploy to GitHub Pages
diff --git a/.github/workflows/translate-to-english.yml b/.github/workflows/translate-to-english.yml
index 3910693..db166b2 100644
--- a/.github/workflows/translate-to-english.yml
+++ b/.github/workflows/translate-to-english.yml
@@ -1,21 +1,20 @@
 name: Translate new posts to English
 
 on:
-  pull_request_target:
-    types: [closed]
+  pull_request:
+    types: [opened, synchronize, reopened]
 
 permissions:
   contents: write
 
 jobs:
   translate:
-    if: github.event.pull_request.merged == true && github.event.pull_request.base.ref == 'main'
     runs-on: ubuntu-latest
     steps:
-      - name: Checkout main
+      - name: Checkout PR branch
         uses: actions/checkout@v4
         with:
-          ref: main
+          ref: ${{ github.head_ref }}
 
       - name: Get changed files from PR
         id: changes

From 18a12f698675d2c0fab42d0fa4537446509d4a48 Mon Sep 17 00:00:00 2001
From: ppippi
Date: Mon, 11 Aug 2025 00:57:53 +0900
Subject: [PATCH 5/6] =?UTF-8?q?=EB=B2=88=EC=97=AD=20workflow=20=EC=9E=90?=
 =?UTF-8?q?=EB=8F=99=ED=99=94?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 ...03-setting-up-actions-runner-controller.md | 273 ------------------
 1 file changed, 273 deletions(-)
 delete mode 100644 _posts_en/2025-07-03-setting-up-actions-runner-controller.md

diff --git a/_posts_en/2025-07-03-setting-up-actions-runner-controller.md b/_posts_en/2025-07-03-setting-up-actions-runner-controller.md
deleted file mode 100644
index 181e409..0000000
--- a/_posts_en/2025-07-03-setting-up-actions-runner-controller.md
+++ /dev/null
@@ -1,273 +0,0 @@
----
-feature-img: assets/img/2025-07-03/0.png
-layout: post
-subtitle: Building an MLOps CI Environment
-tags:
-- MLOps
-- Infra
-title: Setting Up Actions Runner Controller
----
-
-
-### Intro
-
-As I’ve been enjoying building with AI lately, the importance of a solid test environment has become even clearer.
-
-The most common approach is to build CI with GitHub Actions, but in MLOps you often need high-spec instances for CI.
-
-GitHub Actions does offer [GPU instances (Linux, 4 cores)](https://docs.github.com/ko/billing/managing-billing-for-your-products/about-billing-for-github-actions), but at $0.07 per minute as of now, they’re quite expensive to use.
-
-They’re also limited to NVIDIA T4 GPUs, which can be restrictive as model sizes keep growing.
-
-A good alternative in this situation is a self-hosted runner.
-
-As the name suggests, you configure the runner yourself and execute GitHub workflows on it.
-
-You can set this up via GitHub’s guide: [Add self-hosted runners](https://docs.github.com/ko/actions/how-tos/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners).
-
-However, this approach requires keeping the CI machine always on (online), which can be inefficient if CI/CD jobs are infrequent.
-
-This is where the Actions Runner Controller (ARC) shines as a great alternative.
-
-[Actions Runner Controller](https://github.com/actions/actions-runner-controller) is an open-source controller that lets GitHub Actions runners operate in a Kubernetes environment.
-
-With it, your Kubernetes resources are used for CI only when a GitHub Actions workflow runs.
-
-
-### Installing Actions Runner Controller
-
-ARC installation has two major steps.
-1. Create a GitHub Personal Access Token for communication/authentication with GitHub
-2. Install ARC via Helm and authenticate using the token
-
-#### 1. Create a GitHub Personal Access Token
-
-ARC needs authentication to interact with the GitHub API to register and manage runners. Create a Personal Access Token (PAT).
-
-- Path: Settings > Developer settings > Personal access tokens > Tokens (classic) > Generate new token
-
-When creating the Personal Access Token, select the [appropriate permissions](https://github.com/actions/actions-runner-controller/blob/master/docs/authenticating-to-the-github-api.md#deploying-using-pat-authentication). (For convenience here, grant full permissions.)
-
-> For security, use least privilege and set an expiration.
-
-It’s generally recommended to authenticate via a GitHub App rather than PAT.
-
-Keep the PAT safe—you’ll need it when installing ARC.
-
-#### 2. Install ARC with Helm
-
-Before installing ARC, you need cert-manager. If it’s not already set up in the cluster, install it:
-
-```bash
-kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml
-```
-
-Now install ARC into your Kubernetes cluster using Helm.
-
-Using the Personal Access Token you created earlier, install ARC. Replace YOUR_GITHUB_TOKEN below with your PAT.
-
-```bash
-helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
-
-helm repo update
-
-helm pull actions-runner-controller/actions-runner-controller
-
-tar -zxvf actions-runner-controller-*.tgz
-
-export GITHUB_TOKEN=YOUR_GITHUB_TOKEN
-
-helm upgrade --install actions-runner-controller ./actions-runner-controller \
-  --namespace actions-runner-system \
-  --create-namespace \
-  --set authSecret.create=true \
-  --set authSecret.github_token="${GITHUB_TOKEN}"
-```
-
-After installation, verify that the ARC controller is running:
-
-```bash
-kubectl get pods -n actions-runner-system
-```
-
-You should see the controller manager pod running in the actions-runner-system namespace.
-
-ARC is now ready to communicate with GitHub. Next, define the runners that will actually execute workflows.
-
-### 3. Configure the Runner
-
-The ARC controller is installed, but there are no runners yet. Now we’ll create runner pods according to GitHub Actions jobs.
-
-We’ll use two resources:
-1. RunnerDeployment: Acts as a template for runner pods. It defines which container image to use, which GitHub repo to connect to, labels, etc.
-2. HorizontalRunnerAutoscaler (HRA): Watches the RunnerDeployment and automatically adjusts the number of replicas based on the number of queued jobs on GitHub.
-
-#### Define a RunnerDeployment
-
-First, create a file named runner-deployment.yml as below. Change spec.template.spec.repository to your GitHub repository.
-
-> Besides a repository, you can also target an organization if you have the permissions.
-
-```yaml
-apiVersion: actions.summerwind.dev/v1alpha1
-kind: RunnerDeployment
-metadata:
-  name: example-runner-deployment
-  namespace: actions-runner-system
-spec:
-  replicas: 1
-  template:
-    spec:
-      repository: /
-      labels:
-        - self-hosted
-        - arc-runner
-```
-
-With this in place, you can find your self-hosted runner under your repo’s Actions settings.
-
-
-
-After the deployment completes, you’ll see a new runner with the self-hosted and arc-runner labels under Settings > Actions > Runners in your GitHub repository.
-
-
-#### Define a HorizontalRunnerAutoscaler
-
-Next, define an HRA to autoscale the RunnerDeployment you created above. Create hra.yml.
-
-```yaml
-apiVersion: actions.summerwind.dev/v1alpha1
-kind: HorizontalRunnerAutoscaler
-metadata:
-  name: example-hra
-  namespace: actions-runner-system
-spec:
-  scaleTargetRef:
-    name: example-runner-deployment
-  minReplicas: 0
-  maxReplicas: 5
-```
-
-By setting minReplicas and maxReplicas, you can scale up and down according to your resources.
-
-You can also specify additional metrics to create pods whenever a workflow is triggered. There are several metrics available.
-
-> When using HorizontalRunnerAutoscaler, runners are created only when needed. When the count is zero, you won’t see runners in the GitHub UI.
-
-
-
-```yaml
-apiVersion: actions.summerwind.dev/v1alpha1
-kind: HorizontalRunnerAutoscaler
-metadata:
-  name: example-hra
-  namespace: actions-runner-system
-spec:
-  scaleTargetRef:
-    name: example-runner-deployment
-  minReplicas: 0
-  maxReplicas: 5
-  metrics:
-    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
-      repositoryNames: ["/"]
-
-The above is my preferred metric—it scales up when workflow runs are queued. As shown, choose metrics as needed to get the behavior you want.
-
-
-### 4. Use it in a GitHub Actions workflow
-
-All set! Using the new ARC runner is simple. In your workflow file, put the labels from the RunnerDeployment in the runs-on key.
-
-Add a simple test workflow (test-arc.yml) under .github/workflows/:
-
-```yaml
-name: ARC Runner Test
-
-on:
-  push:
-    branches:
-      - main
-
-jobs:
-  test-job:
-    runs-on: [self-hosted, arc-runner]
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@v3
-
-      - name: Test
-        run: |
-          echo "Hello from an ARC runner!"
-          echo "This runner is running inside a Kubernetes pod."
-          sleep 10
-```
-
-The key part is runs-on: [self-hosted, arc-runner]. When this workflow runs, GitHub assigns the job to a runner that has both labels. ARC detects the event and, based on the HRA settings, creates a new runner pod if needed to handle the job.
-
-> With self-hosted runners, unlike GitHub-hosted runners, you may need to install some packages within the workflow.
-
-### Troubleshooting notes
-
-I often use Docker for CI/CD, and one recurring issue is DinD (Docker-in-Docker).
-
-With ARC, by default the scheduling runner container and a docker daemon container (docker) run as sidecars.
-
-To handle this, there’s a DinD-enabled runner image.
-
-In the YAML below, set image and dockerdWithinRunnerContainer to run the docker daemon inside the runner; your workflow then executes on that runner.
-
-```yaml
-apiVersion: actions.summerwind.dev/v1alpha1
-kind: RunnerDeployment
-metadata:
-  name: example-runner-deployment
-  namespace: actions-runner-system
-spec:
-  replicas: 1
-  template:
-    spec:
-      repository: /
-      labels:
-        - self-hosted
-        - arc-runner
-      image: "summerwind/actions-runner-dind:latest"
-      dockerdWithinRunnerContainer: true
-```
-
-For Docker tests that need GPUs, if you use the DinD image above on a cluster with NVIDIA Container Toolkit installed, the GPU will be recognized.
-
-In the workflow you want to run, configure as below to confirm GPUs are available even under DinD. (Be sure to check the versions of NVIDIA Container Toolkit and the NVIDIA GPU Driver plugin!)
-
-```bash
-# Check GPU devices
-ls -la /dev/nvidia*
-
-# device library setup
-smi_path=$(find / -name "nvidia-smi" 2>/dev/null | head -n 1)
-lib_path=$(find / -name "libnvidia-ml.so" 2>/dev/null | head -n 1)
-lib_dir=$(dirname "$lib_path")
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(dirname "$lib_path")
-export NVIDIA_VISIBLE_DEVICES=all
-export NVIDIA_DRIVER_CAPABILITIES=compute,utility
-
-# Mount GPU devices and libraries directly without the nvidia runtime
-docker run -it \
-  --device=/dev/nvidia0:/dev/nvidia0 \
-  --device=/dev/nvidiactl:/dev/nvidiactl \
-  --device=/dev/nvidia-uvm:/dev/nvidia-uvm \
-  --device=/dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools \
-  -v "$lib_dir:$lib_dir:ro" \
-  -v "$(dirname $smi_path):$(dirname $smi_path):ro" \
-  -e LD_LIBRARY_PATH="$LD_LIBRARY_PATH" \
-  -e NVIDIA_VISIBLE_DEVICES="$NVIDIA_VISIBLE_DEVICES" \
-  -e NVIDIA_DRIVER_CAPABILITIES="$NVIDIA_DRIVER_CAPABILITIES" \
-  pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime
-```
-
-### Wrapping up
-
-We walked through how to set up Actions Runner Controller in a Kubernetes environment to build a dynamically scalable self-hosted runner setup.
-
-ARC helps avoid the high costs of GitHub-hosted runners and the inefficiencies of managing your own VMs. It’s especially powerful for MLOps CI/CD environments that need GPUs or have complex dependencies.
-
-While the initial setup may feel a bit involved, once it’s in place it can significantly cut CI/CD costs and reduce operational burden. If you’re considering MLOps, it’s definitely worth a look.
\ No newline at end of file

From eb4e7a9cfc76e2ee9c01afe0e49f31fcedb60533 Mon Sep 17 00:00:00 2001
From: ppippi
Date: Mon, 11 Aug 2025 00:59:38 +0900
Subject: [PATCH 6/6] =?UTF-8?q?=EB=B2=88=EC=97=AD=20workflow=20=EC=9E=90?=
 =?UTF-8?q?=EB=8F=99=ED=99=94?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 _posts/2025-07-03-actions-runner-controller.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/_posts/2025-07-03-actions-runner-controller.md b/_posts/2025-07-03-actions-runner-controller.md
index d7a918f..6dec20a 100644
--- a/_posts/2025-07-03-actions-runner-controller.md
+++ b/_posts/2025-07-03-actions-runner-controller.md
@@ -285,4 +285,3 @@ docker run -it \
 ARC를 사용하면 GitHub에서 제공하는 Runner를 사용할 때의 비싼 비용 문제와, 직접 VM을 관리하며 Runner를 운영할 때의 비효율성을 모두 해결할 수 있습니다. 특히 GPU가 필요하거나, 복잡한 의존성을 가진 MLOps CI/CD 환경을 구축할 때 ARC는 매우 강력한 도구가 됩니다.
 
 초기 설정 과정이 다소 복잡하게 느껴질 수 있지만, 한번 구축해두면 CI/CD 비용을 크게 절감하고 운영 부담을 덜어주므로 MLOps를 고민하고 있다면 꼭 한번 도입을 검토해보시길 바랍니다.
-