feat: credential refresh trait for short-lived provider tokens (Vertex AI first implementation) #472

@xek

Description

Problem Statement

Several inference providers use short-lived credentials that require periodic refresh: Google Vertex AI and Azure OpenAI with Entra ID use OAuth2 tokens (1-hour expiry), IBM watsonx uses IBM Cloud IAM tokens. OpenShell has no mechanism for credential lifecycle management — provider credentials are stored once at creation and never updated. Attempting to use these providers today fails silently after token expiry, with no workaround that doesn't require manual intervention.

This is not a Vertex-specific gap. Any provider that issues short-lived credentials hits the same missing primitive.

Proposed Design

Introduce a lightweight CredentialRefresher trait in the gateway, with Google Vertex AI as the first implementation. AWS Bedrock is explicitly out of scope — SigV4 is per-request signing, not a stored token, and requires a different approach.

1. CredentialRefresher trait (crates/openshell-server/src/providers/refresh.rs):

use std::collections::HashMap;

use async_trait::async_trait;

// RefreshError is a new error type, defined alongside the trait in refresh.rs.
#[async_trait]
pub trait CredentialRefresher: Send + Sync {
    fn provider_type(&self) -> &'static str;
    fn needs_refresh(&self, credentials: &HashMap<String, String>) -> bool;
    async fn refresh(
        &self,
        credentials: &HashMap<String, String>,
    ) -> Result<HashMap<String, String>, RefreshError>;
}

2. Gateway refresh loop — a single background task iterates all provider records, calls needs_refresh() for each, and calls refresh() for those that do. Follows the existing SSH session reaper pattern. Runs on a configurable interval (default: 10 minutes), well within the 1-hour expiry window of OAuth2 tokens.
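
One pass of that loop can be sketched with a simplified, synchronous stand-in for the trait. The `Refresher` trait and `refresh_pass` helper below are illustrative names for this sketch, not the actual gateway code, which would be async and driven by the interval timer:

```rust
use std::collections::HashMap;

// Simplified, synchronous stand-in for the CredentialRefresher trait,
// used only to illustrate one pass of the gateway refresh loop.
trait Refresher {
    fn provider_type(&self) -> &'static str;
    fn needs_refresh(&self, creds: &HashMap<String, String>) -> bool;
    fn refresh(&self, creds: &HashMap<String, String>)
        -> Result<HashMap<String, String>, String>;
}

// Visit every provider record; refresh the ones whose refresher reports stale.
// Returns how many records were refreshed in this pass.
fn refresh_pass(
    refreshers: &[Box<dyn Refresher>],
    records: &mut [(String, HashMap<String, String>)], // (provider_type, credentials)
) -> usize {
    let mut refreshed = 0;
    for (ptype, creds) in records.iter_mut() {
        let Some(r) = refreshers.iter().find(|r| r.provider_type() == ptype.as_str()) else {
            continue; // no refresher registered for this provider type
        };
        if r.needs_refresh(creds) {
            if let Ok(new_creds) = r.refresh(creds) {
                *creds = new_creds;
                refreshed += 1;
            }
        }
    }
    refreshed
}
```

Providers without a registered refresher (openai, anthropic, nvidia) are simply skipped, so the loop is a no-op for static-credential providers.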

3. Vertex implementation — VertexRefresher stores a service account JSON key, calls Google's OAuth2 token endpoint, and returns a fresh Bearer token. The token expiry timestamp is stored alongside the credential so needs_refresh() can check it without making a network call.
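
The offline expiry check could look like the sketch below. The `VERTEX_TOKEN_EXPIRES_AT` key name and the 10-minute safety margin are assumptions for illustration, not settled identifiers:

```rust
use std::collections::HashMap;
use std::time::{SystemTime, UNIX_EPOCH};

// Assumed key name for the stored expiry (unix seconds); not a settled identifier.
const TOKEN_EXPIRY_KEY: &str = "VERTEX_TOKEN_EXPIRES_AT";
// Refresh while at least this much validity remains, so a token is never
// handed out moments before it dies. The exact margin is an assumption.
const REFRESH_MARGIN_SECS: u64 = 600;

// Offline staleness check: no network call, just compare the stored expiry
// against the current time plus the safety margin.
fn needs_refresh(credentials: &HashMap<String, String>) -> bool {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before unix epoch")
        .as_secs();
    match credentials
        .get(TOKEN_EXPIRY_KEY)
        .and_then(|v| v.parse::<u64>().ok())
    {
        None => true, // no token yet, or unparsable expiry: treat as stale
        Some(expires_at) => now + REFRESH_MARGIN_SECS >= expires_at,
    }
}
```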

4. Delivery via existing bundle poll — no sandbox-side changes. The sandbox polls GetInferenceBundle every 30 seconds. When a token is refreshed, the bundle revision hash changes, the sandbox detects it on the next poll, and atomically updates the Arc<RwLock<Vec<ResolvedRoute>>> route cache. No router restart, no dropped connections.
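
One poll iteration of that mechanism can be sketched as follows. The `ResolvedRoute` shape and `apply_bundle` helper are hypothetical stand-ins for the real logic in spawn_route_refresh:

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the real ResolvedRoute; fields elided.
#[derive(Clone, Debug, PartialEq)]
struct ResolvedRoute {
    token: String,
}

// One poll iteration: if the bundle's revision hash changed, swap the whole
// route cache under the write lock. Readers see either the old vector or the
// new one, never a partial update.
fn apply_bundle(
    cache: &Arc<RwLock<Vec<ResolvedRoute>>>,
    last_revision: &mut String,
    bundle_revision: &str,
    bundle_routes: Vec<ResolvedRoute>,
) -> bool {
    if bundle_revision == last_revision.as_str() {
        return false; // revision unchanged: nothing to do
    }
    *cache.write().unwrap() = bundle_routes;
    *last_revision = bundle_revision.to_string();
    true
}
```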

5. New vertex provider profile (crates/openshell-core/src/inference.rs):

static VERTEX_PROFILE: InferenceProviderProfile = InferenceProviderProfile {
    provider_type: "vertex",
    default_base_url: "https://{region}-aiplatform.googleapis.com/v1",
    protocols: ANTHROPIC_PROTOCOLS,
    credential_key_names: &["GOOGLE_SERVICE_ACCOUNT_JSON"],
    base_url_config_keys: &["VERTEX_BASE_URL"],
    auth: AuthHeader::Bearer,
    default_headers: &[("anthropic-version", "2023-06-01")],
};

The agent connects to inference.local as normal. The router strips the agent's auth header and injects the current access token. The service account JSON key never leaves the gateway.
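
The router-side rewrite amounts to dropping the agent's auth headers and injecting the current token. A minimal sketch, with a plain HashMap standing in for whatever header map the router actually uses:

```rust
use std::collections::HashMap;

// Sketch of the router-side rewrite: drop whatever auth the agent sent
// (an x-api-key or a stale bearer) and inject the gateway-held access token.
// Header names are lowercased here for simplicity; real HTTP header handling
// is case-insensitive.
fn inject_auth(headers: &mut HashMap<String, String>, access_token: &str) {
    headers.remove("x-api-key");
    headers.remove("authorization");
    headers.insert(
        "authorization".to_string(),
        format!("Bearer {access_token}"),
    );
}
```

Because the rewrite happens in the router, the agent never sees the access token and the service account key never crosses into the sandbox.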

User-facing interface — no new CLI commands:

openshell provider add --type vertex \
  --name my-vertex \
  --credential GOOGLE_SERVICE_ACCOUNT_JSON="$(cat sa.json)" \
  --config VERTEX_BASE_URL=https://us-east1-aiplatform.googleapis.com/v1/...
openshell inference set --provider my-vertex --model claude-3-7-sonnet

Future providers (Azure Entra ID, IBM watsonx) implement the same trait without touching the refresh loop or delivery mechanism.

Alternatives Considered

  • Hardcode Vertex-specific logic in the gateway: duplicated for every subsequent provider with short-lived credentials. Rejected in favour of the trait.
  • Client-side refresh in provider plugin: plugins only run at discovery time on the user's machine, no server-side lifecycle.
  • Sidecar token refresher: no existing infrastructure, adds a new component for one provider's auth quirk.
  • Static token stored at provider creation: expires after 1 hour, fails silently mid-session.
  • ANTHROPIC_BASE_URL override: auth header mismatch (x-api-key vs Bearer), broken on first request.
  • AWS Bedrock via this design: SigV4 is per-request signing, incompatible with a stored-token model. Separate issue.

Agent Investigation

  • crates/openshell-core/src/inference.rs: three hardcoded profiles (openai, anthropic, nvidia); profile_for() returns None for all other types.
  • crates/openshell-providers/src/providers/generic.rs: placeholder only, discover_existing() always returns None. No refresh capability.
  • crates/openshell-sandbox/src/lib.rs (spawn_route_refresh): polls GetInferenceBundle every 30s, uses revision hash for change detection, atomically replaces route cache — no sandbox-side changes needed.
  • Gateway background task pattern established by SSH session reaper in crates/openshell-server/. Refresh loop follows the same structure.
  • No existing Vertex, Google, Gemini, Azure, or watsonx references anywhere in the codebase.
  • No open issues covering this gap.
