Description
Problem Statement
Several inference providers use short-lived credentials that require periodic refresh: Google Vertex AI and Azure OpenAI with Entra ID use OAuth2 tokens (1-hour expiry), IBM watsonx uses IBM Cloud IAM tokens. OpenShell has no mechanism for credential lifecycle management — provider credentials are stored once at creation and never updated. Attempting to use these providers today fails silently after token expiry, with no workaround that doesn't require manual intervention.
This is not a Vertex-specific gap. Any provider that issues short-lived credentials hits the same missing primitive.
Proposed Design
Introduce a lightweight CredentialRefresher trait in the gateway, with Google Vertex AI as the first implementation. AWS Bedrock is explicitly out of scope — SigV4 is per-request signing, not a stored token, and requires a different approach.
1. CredentialRefresher trait (crates/openshell-server/src/providers/refresh.rs):
```rust
#[async_trait]
pub trait CredentialRefresher: Send + Sync {
    fn provider_type(&self) -> &'static str;
    fn needs_refresh(&self, credentials: &HashMap<String, String>) -> bool;
    async fn refresh(
        &self,
        credentials: &HashMap<String, String>,
    ) -> Result<HashMap<String, String>, RefreshError>;
}
```
2. Gateway refresh loop — a single background task iterates all provider records, calls `needs_refresh()` for each, and calls `refresh()` for those that need it. It follows the existing SSH session reaper pattern and runs on a configurable interval (default: 10 minutes), comfortably within the 1-hour expiry window of the OAuth2 tokens.
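A single pass of the refresh loop could look like the sketch below. This is a simplified, synchronous illustration, not the actual gateway task (which would be async, like the SSH reaper); `ProviderRecord` and the `String` error type are assumptions made for the sketch.

```rust
use std::collections::HashMap;

// Synchronous stand-in for the async trait above, for illustration only.
trait CredentialRefresher {
    fn provider_type(&self) -> &'static str;
    fn needs_refresh(&self, credentials: &HashMap<String, String>) -> bool;
    fn refresh(&self, credentials: &HashMap<String, String>)
        -> Result<HashMap<String, String>, String>;
}

// Hypothetical shape of a stored provider record.
struct ProviderRecord {
    provider_type: String,
    credentials: HashMap<String, String>,
}

/// One tick of the background task: for every provider record, find a
/// registered refresher for its type and refresh if needed. Returns the
/// number of records whose credentials were replaced.
fn refresh_pass(
    records: &mut [ProviderRecord],
    refreshers: &[Box<dyn CredentialRefresher>],
) -> usize {
    let mut refreshed = 0;
    for record in records.iter_mut() {
        // Providers with static credentials have no refresher and are skipped.
        let Some(r) = refreshers
            .iter()
            .find(|r| r.provider_type() == record.provider_type)
        else {
            continue;
        };
        if r.needs_refresh(&record.credentials) {
            if let Ok(new_creds) = r.refresh(&record.credentials) {
                record.credentials = new_creds;
                refreshed += 1;
            }
            // On error: keep the old credentials and retry next interval.
        }
    }
    refreshed
}
```

The gateway task would simply call this pass every interval tick; a failed refresh leaves the old credentials in place so the next tick retries.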
3. Vertex implementation — `VertexRefresher` stores a service account JSON key, calls Google's OAuth2 token endpoint, and returns a fresh Bearer token. The token's expiry timestamp is stored alongside the credential so `needs_refresh()` can check it without making a network call.
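The expiry check could be as small as the sketch below. The credential key name `VERTEX_TOKEN_EXPIRES_AT` and the 15-minute refresh margin are assumptions for illustration, not names from the codebase.

```rust
use std::collections::HashMap;
use std::time::{SystemTime, UNIX_EPOCH};

// Hypothetical key under which the refresher stores the token's Unix expiry.
const EXPIRY_KEY: &str = "VERTEX_TOKEN_EXPIRES_AT";
// Refresh well before the 1-hour expiry; 15 minutes is an assumed margin.
const REFRESH_MARGIN_SECS: u64 = 15 * 60;

/// True if the stored token is missing, unreadable, or close to expiring.
fn needs_refresh(credentials: &HashMap<String, String>) -> bool {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before Unix epoch")
        .as_secs();
    match credentials.get(EXPIRY_KEY).and_then(|v| v.parse::<u64>().ok()) {
        // No stored expiry (or unparsable): refresh to be safe.
        None => true,
        // Refresh once we are within the margin of the expiry.
        Some(expires_at) => now + REFRESH_MARGIN_SECS >= expires_at,
    }
}
```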
4. Delivery via existing bundle poll — no sandbox-side changes. The sandbox polls `GetInferenceBundle` every 30 seconds. When a token is refreshed, the bundle revision hash changes, the sandbox detects it on the next poll, and atomically updates the `Arc<RwLock<Vec<ResolvedRoute>>>` route cache. No router restart, no dropped connections.
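The sandbox-side poll step described above can be sketched as follows. `ResolvedRoute` is reduced to a stand-in struct here, and `apply_bundle` is a hypothetical helper name; the point is the revision-hash comparison plus the single write-lock swap.

```rust
use std::sync::{Arc, RwLock};

// Stand-in for the real route type; only used to make the sketch compile.
#[derive(Clone, Debug)]
struct ResolvedRoute {
    name: String,
}

/// One poll cycle: if the bundle's revision hash differs from the last one
/// seen, replace the shared route cache in a single write-lock section so
/// concurrent readers never observe a partial update. Returns true if the
/// cache was swapped.
fn apply_bundle(
    cache: &Arc<RwLock<Vec<ResolvedRoute>>>,
    last_revision: &mut String,
    bundle_revision: &str,
    bundle_routes: Vec<ResolvedRoute>,
) -> bool {
    if bundle_revision == last_revision {
        return false; // nothing changed since the last poll
    }
    // Atomic from the readers' point of view: one assignment under the lock.
    *cache.write().unwrap() = bundle_routes;
    *last_revision = bundle_revision.to_string();
    true
}
```

Because in-flight requests hold a read guard only briefly to clone the route they need, swapping the vector never restarts the router or drops connections.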
5. New vertex provider profile (crates/openshell-core/src/inference.rs):
```rust
static VERTEX_PROFILE: InferenceProviderProfile = InferenceProviderProfile {
    provider_type: "vertex",
    default_base_url: "https://{region}-aiplatform.googleapis.com/v1",
    protocols: ANTHROPIC_PROTOCOLS,
    credential_key_names: &["GOOGLE_SERVICE_ACCOUNT_JSON"],
    base_url_config_keys: &["VERTEX_BASE_URL"],
    auth: AuthHeader::Bearer,
    default_headers: &[("anthropic-version", "2023-06-01")],
};
```
The agent connects to `inference.local` as normal. The router strips the agent's auth header and injects the current access token. The service account JSON key never leaves the gateway.
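The router's strip-and-inject step amounts to a small per-request header rewrite, sketched below over a plain header map. The helper name and map-based representation are illustrative assumptions, not the router's actual types.

```rust
use std::collections::HashMap;

/// Drop whatever auth the agent sent (Bearer or `x-api-key` style) and
/// inject the gateway's current access token. Header names are lowercase,
/// following the HTTP/2 convention.
fn rewrite_auth_headers(headers: &mut HashMap<String, String>, access_token: &str) {
    headers.remove("authorization");
    headers.remove("x-api-key");
    headers.insert(
        "authorization".to_string(),
        format!("Bearer {access_token}"),
    );
}
```

Because the injection happens inside the gateway, the agent never sees the access token, and the service account key that produced it never leaves the gateway at all.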
User-facing interface — no new CLI commands:
```sh
openshell provider add --type vertex \
  --name my-vertex \
  --credential GOOGLE_SERVICE_ACCOUNT_JSON="$(cat sa.json)" \
  --config VERTEX_BASE_URL=https://us-east1-aiplatform.googleapis.com/v1/...

openshell inference set --provider my-vertex --model claude-3-7-sonnet
```
Future providers (Azure Entra ID, IBM watsonx) implement the same trait without touching the refresh loop or delivery mechanism.
Alternatives Considered
- Hardcode Vertex-specific logic in the gateway: duplicated for every subsequent provider with short-lived credentials. Rejected in favour of the trait.
- Client-side refresh in provider plugin: plugins only run at discovery time on the user's machine, no server-side lifecycle.
- Sidecar token refresher: no existing infrastructure, adds a new component for one provider's auth quirk.
- Static token stored at provider creation: expires after 1 hour, fails silently mid-session.
- `ANTHROPIC_BASE_URL` override: auth header mismatch (`x-api-key` vs Bearer), broken on first request.
- AWS Bedrock via this design: SigV4 is per-request signing, incompatible with a stored-token model. Separate issue.
Agent Investigation
- `crates/openshell-core/src/inference.rs`: three hardcoded profiles (`openai`, `anthropic`, `nvidia`); `profile_for()` returns `None` for all other types.
- `crates/openshell-providers/src/providers/generic.rs`: placeholder only, `discover_existing()` always returns `None`. No refresh capability.
- `crates/openshell-sandbox/src/lib.rs` (`spawn_route_refresh`): polls `GetInferenceBundle` every 30s, uses a revision hash for change detection, atomically replaces the route cache — no sandbox-side changes needed.
- Gateway background task pattern established by the SSH session reaper in `crates/openshell-server/`. The refresh loop follows the same structure.
- No existing Vertex, Google, Gemini, Azure, or watsonx references anywhere in the codebase.
- No open issues covering this gap.