Commit c7cf9d6

docs(rfc): document gpu field replacement
Signed-off-by: Evan Lezar <elezar@nvidia.com>
1 parent da6fbd8 commit c7cf9d6

1 file changed

Lines changed: 25 additions & 8 deletions

rfc/0004-sandbox-resource-requirements/README.md

@@ -69,8 +69,10 @@ tracked separately in issue #1492.
 - Defining the general driver-specific configuration passthrough API. Issue
 #1492 tracks that related API surface.
 - Publishing allocated resource identities in sandbox status.
-- Preserving long-term compatibility for `gpu`, `gpu_device`, or a
-GPU-specific `gpu_count` request field.
+- Preserving alpha-era compatibility for `gpu`, `gpu_device`, or a
+GPU-specific `gpu_count` request field. The legacy GPU-specific request
+fields are intentionally not carried forward into the API shape this RFC
+aims to stabilize.
 
 ## Proposal
 
@@ -89,13 +91,22 @@ message SandboxSpec {
 
   // Portable resource requirements used by the gateway for driver selection
   // and by drivers for provisioning.
-  SandboxResourceRequirements resource_requirements = 11;
+  SandboxResourceRequirements resource_requirements = 9;
 
-  reserved 9, 10;
-  reserved "gpu", "gpu_device";
+  reserved 10;
+  reserved "gpu_device";
 }
 ```
 
+The public sandbox API is still alpha. This migration intentionally replaces
+the old `bool gpu = 9` field with the typed `resource_requirements = 9` message
+instead of reserving the legacy field number. Old live requests and persisted
+sandbox records that encode GPU intent through the legacy boolean are not
+migrated; callers should use a matching OpenShell CLI/API version and recreate
+GPU sandboxes after upgrade when they need the new typed shape. Avoiding
+alpha-era reserved fields keeps the proto surface closer to the API intended
+for stabilization.
+
 `SandboxTemplate.resources` keeps its existing role as platform-native workload
 configuration. It may contain Kubernetes-style CPU, memory, and extended
 resource requests and limits, but it is not the portable resource contract.
@@ -551,17 +562,23 @@ message DriverSandboxSpec {
   string log_level = 1;
   map<string, string> environment = 5;
   DriverSandboxTemplate template = 6;
-  DriverSandboxResourceRequirements resource_requirements = 11;
+  DriverSandboxResourceRequirements resource_requirements = 9;
 
-  reserved 9, 10;
-  reserved "gpu", "gpu_device";
+  reserved 10;
+  reserved "gpu_device";
 }
 ```
 
 Driver-owned resource requirement messages should have the same semantics as
 the public messages, but live in `compute_driver.proto` to keep the public and
 internal contracts separated.
 
+The compute-driver API is version-coupled to the gateway in current deployments:
+local drivers are launched by the gateway at startup, and the driver proto is
+not treated as a public compatibility surface. It follows the same alpha-era
+field replacement as the public API rather than preserving transitional GPU
+fields.
+
 ### Driver capabilities
 
 Replace GPU-specific capability fields with coarse resource capability
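
Note on the wire-level effect of reusing field number 9: the change above swaps a `bool` for a message at the same field number, so the two shapes share a field number but differ in protobuf wire type (varint versus length-delimited). The Python sketch below hand-encodes both shapes to make that difference visible and shows one way an operator could spot persisted records that still carry the legacy boolean. It is an illustration only, not code from this repository: `has_legacy_gpu_bool` is a hypothetical helper, the typed payload is an empty placeholder, and the sketch assumes records are stored as plain serialized `SandboxSpec` bytes.

```python
# Sketch only: hand-rolled protobuf wire encoding, not OpenShell code.
WIRE_VARINT = 0  # bool, int32, enum, ...
WIRE_LEN = 2     # strings, bytes, embedded messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, varint-encoded."""
    return encode_varint((field_number << 3) | wire_type)


def decode_varint(data: bytes, pos: int):
    """Decode a varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def has_legacy_gpu_bool(record: bytes) -> bool:
    """Hypothetical check: is top-level field 9 encoded as a varint?"""
    pos = 0
    while pos < len(record):
        key, pos = decode_varint(record, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == WIRE_VARINT:
            _, pos = decode_varint(record, pos)
            if field_number == 9:
                return True
        elif wire_type == WIRE_LEN:
            length, pos = decode_varint(record, pos)
            pos += length  # skip the payload without interpreting it
        elif wire_type == 1:  # fixed64
            pos += 8
        elif wire_type == 5:  # fixed32
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return False


# Old shape: `bool gpu = 9` set to true.
legacy = encode_tag(9, WIRE_VARINT) + encode_varint(1)

# New shape: `resource_requirements = 9` with a placeholder empty message.
typed_payload = b""
typed = encode_tag(9, WIRE_LEN) + encode_varint(len(typed_payload)) + typed_payload

print(legacy.hex(), typed.hex())
print(has_legacy_gpu_bool(legacy), has_legacy_gpu_bool(typed))
```

Because the key bytes differ, a decoder generated from the new schema does not interpret a legacy `gpu = true` value as a resource requirements message; typical protobuf implementations treat the mismatched field as unknown, which is consistent with the RFC's statement that legacy GPU intent is not migrated. Proto3 omits default values on the wire, so a legacy record with `gpu = false` carries no field 9 at all and this check would not flag it.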
