Scheduling queues are the core resource management primitive in KAI Scheduler, providing hierarchical resource allocation with quota guarantees and priority-based distribution.
Currently only queues that have parent queues can be scheduled to and top level queues cannot be used by jobs. This is a limitation that will be fixed in the future #621.
| Attribute | Description | Units |
|---|---|---|
| Quota | Guaranteed resource allocation | CPU: millicores, Memory: MB, GPU: units |
| Over-Quota Priority | Resource allocation order when exceeding quota | Integer (higher = first) |
| Over-Quota Weight | Resource distribution weight within priority level | Integer |
| Limit | Hard cap on resource consumption | Same as quota |
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
name: example-queue
spec:
displayName: "Example Queue" # Optional: logging purposes
parentQueue: "parent-queue" # Optional: hierarchical structure
priority: 100 # Optional: allocation precedence
resources:
cpu: ResourceQuota
memory: ResourceQuota
gpu: ResourceQuotaresources:
cpu:
quota: 2000 # 2 CPU cores guaranteed
overQuotaWeight: 1 # Distribution weight
limit: 4000 # Max 4 CPU cores
memory:
quota: 4096 # 4GB guaranteed
overQuotaWeight: 1
limit: 8192 # Max 8GB
gpu:
quota: 2 # 2 GPUs guaranteed
overQuotaWeight: 1
limit: 4 # Max 4 GPUs| Field | Value | Behavior |
|---|---|---|
quota |
-1 |
Unlimited quota |
quota |
0 or unset |
No guaranteed resources (default) |
limit |
-1 |
No limit |
limit |
0 or unset |
No additional resources allowed (default) |
- CPU: Millicores (1000 = 1 CPU core)
- Memory: Megabytes (MB = 10⁶ bytes)
- GPU: Units (1 = full GPU device)
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
name: research-team
spec:
displayName: "Research Team"
resources:
cpu:
quota: 1000
limit: 2000
gpu:
quota: 1
limit: 2apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
name: ml-team
spec:
displayName: "ML Team"
parentQueue: "research-team"
priority: 200
resources:
cpu:
quota: 500
overQuotaWeight: 2
gpu:
quota: 1
overQuotaWeight: 1apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
name: burst-queue
spec:
resources:
cpu:
quota: -1 # Unlimited quota
limit: -1 # No limit
gpu:
quota: 0 # No guarantee
limit: -1 # No limit