Make runtime provider switching work on the FastAPI surface by nyimbi · Pull Request #27 · VRSEN/OpenSwarm

nyimbi · 2026-05-09T12:25:13Z

Summary

Follow-up to #26 — that PR documented FastAPI as a no-op for runtime provider switching on the assumption that the agency was constructed once at startup. Reading agency-swarm's request handlers shows the agency is actually rebuilt per-request (agency_factory(...) is invoked inside each chat/run call at endpoint_handlers.py lines 457, 552, 825). All that's needed for FastAPI to pick up a switch is for os.environ to reflect the new .env values before the next request rebuilds the agency.
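To illustrate why a per-request rebuild is enough, here is a minimal sketch. It assumes the factory reads `DEFAULT_MODEL` from `os.environ` each time it is called; `agency_factory` and `handle_chat` below are hypothetical stand-ins, not agency-swarm's actual handlers.

```python
import os


def agency_factory() -> dict:
    """Hypothetical stand-in for the factory invoked on each request."""
    # Configuration is read at call time, not at import time.
    return {"model": os.environ.get("DEFAULT_MODEL", "unset")}


def handle_chat(message: str) -> str:
    """Hypothetical request handler that builds a fresh agency per call."""
    agency = agency_factory()
    return f"[{agency['model']}] {message}"


os.environ["DEFAULT_MODEL"] = "ollama/llama3.1"
first = handle_chat("hello")

# Simulate a switch between two requests: only os.environ changes.
os.environ["DEFAULT_MODEL"] = "azure_ai/claude-opus-4-1"
second = handle_chat("hello")

print(first)
print(second)
```

Nothing in the handler is restarted or re-registered; the second request sees the new model only because the factory runs again.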

Stacked on #26

This branch is built on top of feat/azure-ollama-providers. Until #26 merges, the diff shown here includes both PRs' changes. The actual delta for this PR is 4 files / +98 / -63.

If #26 lands first, this PR's diff will collapse to just the runtime-switching fix.

What changed

orchestrator/tools/SwitchProvider.py: After the atomic .env rewrite, the tool now calls load_dotenv(override=True) so the running process's os.environ reflects the new values. The next request that rebuilds the agency reads them naturally.

TUI restart flag → best-effort. The switch is now live in-process the moment run() returns; the flag is just a UX cue for the TUI to refresh its display. A failed flag touch is non-fatal — the tool reports success because the switch did apply. The previous "Cannot switch — no restart signal available" error path is gone; running outside the TUI loop (i.e. the FastAPI process) is now a supported context.

Documentation:

server.py header: removed the "switches stay pinned until restart" warning; describes how per-request rebuilds pick up changes naturally.
orchestrator/instructions.md: removed the "exit the TUI to apply" instruction. Switches are live immediately. TUI users only /quit if they want a fresh display.
SwitchProvider docstring: explains why FastAPI works (per-request rebuild) and why the flag is now best-effort.
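The sequence described under "What changed" can be sketched roughly as follows. This is a stdlib-only illustration, not the tool's code: the PR uses python-dotenv's `set_key` and `load_dotenv(override=True)`, whereas `rewrite_env_atomically`, `reload_env`, `touch_flag_best_effort`, and `switch` here are hypothetical helpers with a deliberately naive `.env` parser.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def rewrite_env_atomically(env_path: Path, key: str, value: str) -> None:
    """Replace or append KEY=value via a temp file plus os.replace."""
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    kept = [ln for ln in lines if not ln.startswith(f"{key}=")]
    kept.append(f"{key}={value}")
    fd, tmp_name = tempfile.mkstemp(dir=str(env_path.parent), prefix=".env.")
    with os.fdopen(fd, "w") as tmp:
        tmp.write("\n".join(kept) + "\n")
    os.replace(tmp_name, env_path)  # readers never see a half-written file


def reload_env(env_path: Path) -> None:
    """Naive stand-in for load_dotenv(override=True): file values win."""
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, val = line.partition("=")
        os.environ[name.strip()] = val.strip()


def touch_flag_best_effort(flag_path: Optional[str]) -> bool:
    """Touch the TUI flag if one is configured; failures are non-fatal."""
    if not flag_path:
        return False  # e.g. running under FastAPI, where no flag is set
    try:
        Path(flag_path).touch()
        return True
    except OSError:
        return False


def switch(env_path: Path, model: str, flag_path: Optional[str]) -> str:
    rewrite_env_atomically(env_path, "DEFAULT_MODEL", model)
    reload_env(env_path)  # the switch is live in-process from here on
    touched = touch_flag_best_effort(flag_path)
    return f"switched to {model} (flag touched: {touched})"


with tempfile.TemporaryDirectory() as tmp_dir:
    env_file = Path(tmp_dir) / ".env"
    env_file.write_text("DEFAULT_MODEL=ollama/llama3.1\nAZURE_AI_API_KEY=example\n")
    result = switch(env_file, "azure_ai/claude-opus-4-1", flag_path=None)
    print(result)
```

Because the flag touch happens after the reload and swallows `OSError`, a missing or unwritable flag cannot undo a switch that has already been applied.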

Tests (36 → 38)

test_no_flag_env_var_still_succeeds: verifies the FastAPI-style context (no OPENSWARM_SWITCH_FLAG env var) gets a successful switch + .env write + os.environ refresh.
test_oserror_on_flag_touch_does_not_abort_switch: a failing flag touch returns success because the env reload already applied.
test_switch_refreshes_os_environ_for_fastapi_path: the core guarantee, that os.environ["DEFAULT_MODEL"] reflects the switch after run() returns.
test_switch_refreshes_provider_credentials_in_environ: pre-existing .env credentials become visible in os.environ post-switch so the next agency build can authenticate.
Updated two prior tests ("no flag means refused" / "OSError aborts") to match the new behavior.

$ pytest
========================== 38 passed in 0.24s =============================

Test plan

All 38 unit tests pass on a bare Python install
Pending end-to-end verification: live FastAPI server + a real provider switch via the chat endpoint. Requires a running provider account; happy to coordinate with anyone who has one.
Manual trace through agency-swarm's endpoint_handlers.py to confirm agency_factory is per-request (verified at lines 457, 552, 825).

Backwards compatibility

No behavior change for the TUI surface — the restart flag is still touched on success and the loop in run_utils.main() works exactly as before. The TUI just stops being the only valid execution context for SwitchProvider. Existing tests for the TUI path continue to pass.

🤖 Generated with Claude Code

…viders

Extends OpenSwarm's provider matrix from three (OpenAI / Anthropic / Google) to seven, plus a runtime provider switch for the orchestrator.

New providers
- Azure OpenAI Service: your own gpt-* deployment on Azure (LiteLLM prefix `azure/`, env: AZURE_API_KEY + AZURE_API_BASE + AZURE_API_VERSION).
- Azure AI Foundry: catalog of non-OpenAI models on Azure including Anthropic Claude (Opus / Sonnet), Llama, Mistral, DeepSeek (LiteLLM prefix `azure_ai/`, env: AZURE_AI_API_KEY + AZURE_AI_API_BASE). For Anthropic models the base URL must end with `/anthropic`.
- Ollama (local): no key required, defaults to http://localhost:11434. OLLAMA_API_BASE is threaded explicitly into LitellmModel.
- OpenAI-compatible: generic route for any vendor with an OpenAI-shaped API: Ollama Cloud, Groq, Together AI, Mistral La Plateforme, OpenRouter, vLLM-based deployments. Uses dedicated OPENAI_COMPAT_* env vars so a real OPENAI_API_KEY kept for fallback is never overwritten. Only the base URL is required; key is optional for keyless local endpoints.

Provider routing
- Single source of truth: config.PROVIDER_REGISTRY maps slug to (prefix, required_env). Both the SwitchProvider tool and the onboarding wizard derive their behavior from this table.
- DEFAULT_MODEL=openai_compat/<model> is a sentinel that config._resolve() unwraps to LiteLLM's openai/<model> with the dedicated credentials passed via base_url and api_key.
- get_active_provider() classifies via longest-prefix-wins lookup (so azure_ai/ matches before azure/) and returns "unknown" for unrecognized litellm/<vendor>/<model> strings.

Runtime switching
- New SwitchProvider tool in orchestrator/tools/, registered only on the orchestrator. Users say "switch to ollama llama3.1" or "/switch-provider azure_ai claude-opus-4-1"; the tool validates credentials, writes DEFAULT_MODEL to .env atomically, and signals run_utils.main() to rebuild the agency on next TUI exit. The orchestrator's "router only" contract is preserved with a single documented carve-out for this administrative concern.
- The FastAPI server (server.py) doesn't read the restart signal; switching from an API client is a documented no-op.
- Restart flag files live in a user-scoped tempdir (mode 0o700) so a co-tenant on /tmp can't force a spurious restart.

Hardening
- SSRF defense: SwitchProvider refuses any openai_compat switch where OPENAI_COMPAT_API_BASE isn't an https:// URL with a real hostname. Closes the prompt-injection chain where an attacker pre-positions the base URL and induces a switch, redirecting all subsequent LLM traffic (with bearer tokens and conversation history).
- Input validation: model field requires alphanumeric start + the characters real model names use ([\w.:-/]). Blocks newline injection into .env, shell metacharacters, and `..`-style ids.
- Atomic .env write: the restart flag is touched BEFORE the .env rewrite so a crash in any window leaves recoverable state. The rewrite uses set_key on a temp copy then os.replace to avoid partial-read exposure.
- config._resolve() raises RuntimeError when openai_compat is configured without the base URL, instead of returning a LitellmModel with None credentials that would fail cryptically at first call.
- The except clause in _resolve catches only ImportError; TypeError now propagates so misconfigured kwargs surface immediately rather than degrading to a bare model string.

Tests
- 36 pytest cases cover provider validation, SSRF guard, input validation, atomic write recovery, missing-credential errors, prefix classification (incl. longest-prefix-wins for azure_ai/ vs azure/), openai_compat unwrap to openai/<model>, RuntimeError on missing API_BASE, ImportError graceful degradation, TypeError propagation, dotenv quoting round-trips, OSError on flag touch refuses switch, and the wizard's PROVIDERS data shape contract.
- Test scaffolding stubs agency_swarm + openai.types.shared in sys.modules so the suite runs from a bare Python install with just pytest + python-dotenv + pydantic, with no need for the full agency-swarm dependency chain.

Documentation
- README updated: 7-provider list, runtime switch description, upgrading-from-earlier-version section.
- AGENTS.md documents the orchestrator/tools/ convention and the PROVIDER_REGISTRY contract.
- orchestrator/instructions.md documents the administrative carve-out.
- .env.example documents every new env var with vendor URL examples for openai_compat (Groq, Together, Mistral, OpenRouter; Ollama Cloud points at https://docs.ollama.com since the canonical endpoint can change).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
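As a rough illustration of the two hardening checks named in the commit message above (the https-only base URL guard and the model-name validation), the sketch below shows one way they could be written. The function names, the exact regex, and the reading of "real hostname" as "non-empty hostname" are assumptions; the actual SwitchProvider code may differ.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: alphanumeric first character, then word characters,
# dot, colon, slash, or hyphen. re.ASCII keeps \w to [A-Za-z0-9_].
_MODEL_RE = re.compile(r"[A-Za-z0-9][\w.:/-]*", re.ASCII)


def is_safe_model_name(model: str) -> bool:
    """Reject newlines, shell metacharacters, and '..' path tricks."""
    if ".." in model:
        return False
    # fullmatch ensures the whole string, including any trailing newline,
    # must fit the pattern.
    return _MODEL_RE.fullmatch(model) is not None


def is_safe_base_url(url: str) -> bool:
    """Accept only https:// URLs that carry a non-empty hostname."""
    try:
        parsed = urlparse(url)
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    return parsed.scheme == "https" and bool(parsed.hostname)


print(is_safe_model_name("claude-opus-4-1"))
print(is_safe_model_name("llama3.1\nOPENAI_API_KEY=stolen"))
print(is_safe_base_url("https://api.groq.com/openai/v1"))
print(is_safe_base_url("http://localhost:11434"))
```

Checking both values before the `.env` write means a rejected switch leaves the file and `os.environ` untouched.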

The previous PR (be205e9) documented FastAPI as a no-op for runtime switching, on the assumption that the agency was constructed once at startup. Reading agency-swarm's request handlers shows the agency is actually rebuilt per-request: `agency_factory(load_threads_callback=...)` is invoked inside each chat/run call (see endpoint_handlers.py:457, :552, :825). All that's needed for FastAPI to pick up a switch is for os.environ to reflect the new .env values before the next request.

Three small changes:
- SwitchProvider.run() now calls load_dotenv(override=True) on the freshly written .env after the atomic rewrite. This refreshes the running process's os.environ so the next agency build (whether driven by a FastAPI request or a TUI restart) sees the new DEFAULT_MODEL and credentials.
- The TUI restart flag becomes best-effort. The switch is already live in-process via env reload; the flag is now just a UX cue for the TUI to refresh its display state. A failed flag touch is non-fatal; we return success since the switch did apply.
- The previous "Cannot switch — no restart signal available" path is gone. Running outside the TUI loop is now a supported context, not an error.

Updated docs:
- server.py header: removed the "switching is a no-op" warning; describes how per-request rebuilds pick up the change.
- orchestrator/instructions.md: removed the "exit the TUI to apply" instruction. Switches are live immediately; TUI users only quit if they want a fresh display.
- SwitchProvider docstring: explains why FastAPI works (per-request rebuild) and why the flag is now best-effort.

Tests (36 → 38):
- test_no_flag_env_var_still_succeeds: verifies the FastAPI-style context (no flag env var) gets a successful switch + .env write + os.environ refresh.
- test_oserror_on_flag_touch_does_not_abort_switch: a failing flag touch returns success because the env reload already applied.
- test_switch_refreshes_os_environ_for_fastapi_path: the core guarantee, that os.environ["DEFAULT_MODEL"] reflects the switch after run() returns.
- test_switch_refreshes_provider_credentials_in_environ: pre-existing .env credentials become visible in os.environ post-switch, so the next agency build can authenticate.
- The two tests asserting "no flag means refused" / "OSError aborts" were updated to match the new behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Four pytest cases that hit real provider endpoints when credentials are in the environment, skipped cleanly otherwise. Marked `live` so the default `pytest` invocation includes them but a CI / stub-only run can exclude with `pytest -m "not live"`.

Tests
- test_live_ollama_chat: real chat against a local Ollama server. Discovers an available model via /api/tags, skips if Ollama is unreachable or no models are pulled.
- test_live_azure_ai_foundry_claude: real chat against Azure-hosted Claude. Validates the /anthropic URL suffix path documented in PROVIDER_REGISTRY by sending a real prompt to a real Azure endpoint.
- test_live_azure_openai: real chat against an Azure OpenAI Service deployment. Skipped unless AZURE_OPENAI_DEPLOYMENT is set (since there's no canonical default for "your deployment").
- test_live_switch_provider_transition: starts on local Ollama, invokes SwitchProvider to change to Azure AI Foundry, verifies the TUI flag was touched, os.environ was refreshed in-process, and the very next live call reaches Claude on Azure with no restart. This is the end-to-end proof of the FastAPI runtime-switching guarantee from PR VRSEN#27.

Credentials
- Read only from the shell environment; never written to source.
- Bridges OpenAI-SDK-style names (AZURE_OPENAI_API_KEY, ANTHROPIC_FOUNDRY_RESOURCE, OLLAMA_HOST) to OpenSwarm's LiteLLM-style names (AZURE_API_KEY, AZURE_AI_API_BASE, OLLAMA_API_BASE) inside the fixture so users with either convention can run.
- Each test uses pytest.skip with a clear reason when its credentials are absent, so missing keys never become test failures.

Run
    pytest                  # all tests, live ones skip if no creds
    pytest -m live -v       # only live tests
    pytest -m "not live"    # only stub-friendly unit tests (CI)

Result on the author's machine (Azure AI Foundry + local Ollama):
    41 passed, 1 skipped (Azure OpenAI Service, no deployment in env)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
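The "skip with a clear reason" behavior described in the commit message above could be built on a small helper like the one below. This is only a sketch: `missing_vars` and `skip_reason` are hypothetical names, and the real fixture in tests/test_live_providers.py may be organized differently. In a pytest test, a non-None reason would be passed to `pytest.skip(reason)`.

```python
from typing import List, Mapping, Optional, Sequence


def missing_vars(required: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Return the required names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]


def skip_reason(
    test_name: str, required: Sequence[str], env: Mapping[str, str]
) -> Optional[str]:
    """Build a human-readable skip reason, or None if all credentials exist."""
    missing = missing_vars(required, env)
    if not missing:
        return None
    return f"{test_name}: missing {', '.join(missing)}"


# Env var names taken from the Azure AI Foundry entry in the provider list.
FOUNDRY_VARS = ("AZURE_AI_API_KEY", "AZURE_AI_API_BASE")

# A plain dict stands in for os.environ so the example is deterministic.
reason = skip_reason(
    "test_live_azure_ai_foundry_claude",
    FOUNDRY_VARS,
    {"AZURE_AI_API_KEY": "example"},
)
print(reason)
```

Passing the environment in as a mapping keeps the helper testable without touching the real process environment.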

nyimbi · 2026-05-09T13:23:15Z

Live verification — runtime switching works on a real provider transition

Ran tests/test_live_providers.py::test_live_switch_provider_transition against a local Ollama server + Azure AI Foundry. The test:

1. Configures the process to talk to local Ollama
2. Calls SwitchProvider(provider="azure_ai", model="claude-sonnet-4-6")
3. Asserts:
   - .env was rewritten atomically
   - The TUI restart flag was touched
   - os.environ["DEFAULT_MODEL"] reflects the new value in-process (the FastAPI guarantee)
4. Re-imports config to simulate a per-request agency rebuild
5. Sends a real prompt and gets a real response from Claude on Azure

All in one Python process, no restart, no supervisor, no file watcher. That's the end-to-end proof for this PR's claim about FastAPI compatibility.
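The "re-imports config" step matters when a module snapshots environment values at import time. The sketch below shows that effect in isolation, assuming a hypothetical `demo_config` module written to a temp directory; it is not OpenSwarm's config module.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical module that reads the env var once, at import time.
    Path(tmp_dir, "demo_config.py").write_text(
        "import os\nDEFAULT_MODEL = os.environ.get('DEFAULT_MODEL', '')\n"
    )
    sys.path.insert(0, tmp_dir)
    try:
        os.environ["DEFAULT_MODEL"] = "ollama/llama3.1"
        importlib.invalidate_caches()
        demo_config = importlib.import_module("demo_config")
        before = demo_config.DEFAULT_MODEL

        # The env changes, but the already-imported module keeps its snapshot.
        os.environ["DEFAULT_MODEL"] = "azure_ai/claude-sonnet-4-6"
        stale = demo_config.DEFAULT_MODEL

        # Re-executing the module body picks up the new value.
        demo_config = importlib.reload(demo_config)
        after = demo_config.DEFAULT_MODEL
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop("demo_config", None)

print(before, stale, after)
```

Code that reads `os.environ` inside a function called per request does not need the reload; it is only module-level constants that go stale.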

Full live suite results

tests/test_live_providers.py::test_live_ollama_chat                     PASSED
tests/test_live_providers.py::test_live_azure_ai_foundry_claude         PASSED
tests/test_live_providers.py::test_live_azure_openai                    SKIPPED (no AZURE_OPENAI_DEPLOYMENT)
tests/test_live_providers.py::test_live_switch_provider_transition      PASSED

3 passed, 1 skipped in 46.85s

The Azure OpenAI Service direct test (azure/<deployment>) is collected and ready; it is skipped until AZURE_OPENAI_DEPLOYMENT is set and runs once it is. It follows the same routing code path as azure_ai/; only the endpoint URL differs.

Combined unit + live suite

$ pytest
41 passed, 1 skipped in 24.66s

pytest -m "not live" excludes live tests for CI.

nyimbi and others added 3 commits May 9, 2026 14:59

nyimbi mentioned this pull request May 9, 2026

Add Azure OpenAI, Azure AI Foundry, Ollama, and OpenAI-compatible providers #26

Draft

4 tasks

nyimbi wants to merge 3 commits into VRSEN:main from nyimbi:feat/runtime-switch-fastapi
