Add configurable imagePullPolicy for Kubernetes Job Runner
Request Type
Feature Request
Work Environment
| Question | Answer |
|---|---|
| OS version (server) | Kubernetes-based deployment |
| Cortex version / git hash | Latest / main branch |
| Package Type | Docker, Kubernetes |
| Deployment Environment | Private/On-premise with Harbor registry |
Problem Description
When deploying Cortex analyzers in a Kubernetes environment with a private image registry (such as Harbor), the current implementation of K8sJobRunnerSrv.scala does not provide a way to configure the imagePullPolicy for the Kubernetes Jobs it creates.
This causes issues in the following scenarios:
- Private Registry Deployments: When new analyzer versions are pushed to a private registry under an existing tag (e.g., stable, or environment-specific tags like dev, staging), Kubernetes will not pull the updated image if the node already has an image with that tag cached.
- Development/Testing Environments: In rapidly iterating development environments, developers frequently push updated analyzer images with the same tag. Without the ability to set imagePullPolicy: Always, these updates are not picked up automatically.
- CI/CD Pipeline Integration: Modern CI/CD pipelines (e.g., GitLab CI with Harbor) often use consistent tagging strategies (like ${OWNER}-latest or build-${COMMIT_SHA}). The inability to force image pulls can lead to stale analyzer versions being executed.
Current Behavior
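K8sJobRunnerSrv.scala creates Kubernetes Jobs without specifying an imagePullPolicy, so Kubernetes applies its own default. For an image with an explicit tag other than latest, that default is:
- IfNotPresent: the image is pulled only if it does not already exist on the node
- This prevents automatic updates when new images are pushed to the registry under an existing tag
Kubernetes defaults to Always only when the tag is latest or omitted.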
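The defaulting rule Kubernetes applies when imagePullPolicy is omitted can be sketched in a few lines. This is an illustration only, assuming a hypothetical helper named DefaultPullPolicy that is not part of Cortex; the real decision is made by the Kubernetes API server when the Job's pod is created.

```scala
object DefaultPullPolicy {

  /** Approximates the policy Kubernetes assigns when imagePullPolicy is omitted. */
  def forImage(image: String): String =
    // A digest reference (name@sha256:...) pins the content, so IfNotPresent applies.
    if (image.contains("@")) "IfNotPresent"
    else {
      // Look for a tag only in the last path segment, so that a registry port
      // such as harbor.example.com:5000/... is not mistaken for a tag.
      val lastSegment = image.substring(image.lastIndexOf('/') + 1)
      val colon       = lastSegment.indexOf(':')
      val tag         = if (colon >= 0) Some(lastSegment.substring(colon + 1)) else None
      tag match {
        case None | Some("latest") => "Always"       // no tag, or :latest
        case Some(_)               => "IfNotPresent" // any other explicit tag
      }
    }
}
```

Under this rule, an analyzer image tagged stable or dev is never re-pulled once a node has cached it, which is the case this request targets.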
Desired Behavior
Add a configurable imagePullPolicy parameter that:
- Can be set via configuration file (application.conf or reference.conf)
- Defaults to IfNotPresent for backward compatibility
- Can be overridden to Always, IfNotPresent, or Never as needed
- Is applied to the Kubernetes Job container specification
- Can be configured via Helm chart values for Kubernetes deployments
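Because the policy is passed around as a plain string, a typo in the configuration would presumably only surface when Kubernetes rejects the Job. A minimal sketch of validating the value at startup, assuming a hypothetical ImagePullPolicy helper that is not part of the existing code base:

```scala
object ImagePullPolicy {

  // The three values the Kubernetes API accepts for imagePullPolicy.
  val Allowed: Set[String] = Set("Always", "IfNotPresent", "Never")

  /** Returns the canonical spelling, or an error message for unknown values. */
  def parse(raw: String): Either[String, String] = {
    val trimmed = raw.trim
    // Matching is case-insensitive here as a convenience (a design choice of this
    // sketch); the returned value always uses the exact Kubernetes spelling.
    Allowed.find(_.equalsIgnoreCase(trimmed)) match {
      case Some(canonical) => Right(canonical)
      case None =>
        Left(s"Invalid imagePullPolicy '$raw', expected one of: Always, IfNotPresent, Never")
    }
  }

  /** Applies the proposed default when the setting is absent. */
  def fromConfig(value: Option[String]): Either[String, String] =
    value.map(parse).getOrElse(Right("IfNotPresent"))
}
```

The result of config.getOptional[String]("job.kubernetes.imagePullPolicy") could be passed to fromConfig, failing fast on a Left instead of submitting a Job that the API server will refuse.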
Proposed Solution
1. Modify K8sJobRunnerSrv.scala:
Add an imagePullPolicy parameter to the class constructor and apply it to the Kubernetes Job container spec:
    @Singleton
    class K8sJobRunnerSrv(
        client: DefaultKubernetesClient,
        jobBaseDirectory: Path,
        persistentVolumeClaimName: Option[String],
        imagePullPolicy: String, // Add this parameter
        implicit val system: ActorSystem
    ) {

      @Inject()
      def this(config: Configuration, system: ActorSystem) =
        this(
          new DefaultKubernetesClient(),
          Paths.get(config.get[String]("job.directory")),
          config.getOptional[String]("job.kubernetes.persistentVolumeClaimName"),
          config.getOptional[String]("job.kubernetes.imagePullPolicy").getOrElse("IfNotPresent"), // Add this line
          system: ActorSystem
        )
Apply the policy in the run method:
    .addNewContainer()
    .withName("neuron")
    .withImage(dockerImage)
    .withImagePullPolicy(imagePullPolicy) // Add this line
    .withArgs("/job")
2. Update conf/reference.conf:
    job {
      timeout = 30 minutes
      runners = [kubernetes, docker, process]
      directory = ${java.io.tmpdir}
      dockerDirectory = ${job.directory}
      keepJobFolder = false
      kubernetes {
        # Name of the PersistentVolumeClaim to use for job storage (required for k8s runner)
        # persistentVolumeClaimName = "cortex-jobs-pvc"
        # Image pull policy for Kubernetes jobs
        # Options: Always, IfNotPresent, Never
        # Default: IfNotPresent
        # Set to "Always" for private registries with frequently updated images
        imagePullPolicy = "IfNotPresent"
      }
    }
3. Helm Chart Integration (Optional):
For Kubernetes deployments, this can be exposed via Helm chart values.yaml:
    cortex:
      kubernetes:
        persistentVolumeClaimName: "cortex-jobs-pvc"
        imagePullPolicy: "Always"  # or "IfNotPresent", "Never"
Benefits
- Private Registry Support: Enables proper operation with private registries (Harbor, ECR, ACR, GCR)
- Backward Compatible: Defaults to IfNotPresent, maintaining current behavior
- Flexible Deployment: Different policies can be used for dev/staging/production environments
- CI/CD Friendly: Supports modern continuous deployment workflows
- Industry Standard: Uses the same imagePullPolicy field and values that Kubernetes defines for every container spec
Use Cases
- Development Environment: Set to Always to ensure the latest analyzer versions are always pulled
- Production Environment: Set to IfNotPresent to reduce registry load and improve startup time
- Air-Gapped/Offline: Set to Never to require pre-loaded images on all nodes
Implementation Status
This feature has been implemented in a fork and tested successfully with:
- Harbor private registry
- GitLab CI/CD pipeline
- LinCloud Kubernetes environment
Related Documentation
Complementary Information
Configuration Example for Private Registry:
    job {
      kubernetes {
        persistentVolumeClaimName = "cortex-jobs-pvc"
        imagePullPolicy = "Always"
      }
    }
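Environment Variable Override (takes effect only if the configuration maps the variable, e.g. with a HOCON optional substitution such as imagePullPolicy = ${?JOB_KUBERNETES_IMAGEPULLPOLICY}):
    JOB_KUBERNETES_IMAGEPULLPOLICY=Always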
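If the variable were read in code rather than through a configuration substitution, the precedence could be resolved as follows. This is a minimal sketch under stated assumptions: PullPolicyResolver is a hypothetical helper, and the ordering (environment variable first, then the configured value, then IfNotPresent) is one possible choice, not existing Cortex behavior.

```scala
object PullPolicyResolver {

  // Name of the environment variable proposed in this request.
  val EnvName = "JOB_KUBERNETES_IMAGEPULLPOLICY"

  /** Environment variable wins, then the configured value, then the default. */
  def resolve(env: Map[String, String], configured: Option[String]): String =
    env
      .get(EnvName)
      .map(_.trim)
      .filter(_.nonEmpty) // treat an empty or blank variable as unset
      .orElse(configured)
      .getOrElse("IfNotPresent")
}
```

At startup this could be called as PullPolicyResolver.resolve(sys.env, config.getOptional[String]("job.kubernetes.imagePullPolicy")).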
This enhancement would significantly improve Cortex's usability in enterprise and private cloud environments where private image registries are the norm.