
Make runtime provider switching work on the FastAPI surface#27

Draft
nyimbi wants to merge 3 commits into VRSEN:main from nyimbi:feat/runtime-switch-fastapi


@nyimbi nyimbi commented May 9, 2026

Summary

Follow-up to #26 — that PR documented FastAPI as a no-op for runtime provider switching on the assumption that the agency was constructed once at startup. Reading agency-swarm's request handlers shows the agency is actually rebuilt per-request (agency_factory(...) is invoked inside each chat/run call at endpoint_handlers.py lines 457, 552, 825). All that's needed for FastAPI to pick up a switch is for os.environ to reflect the new .env values before the next request rebuilds the agency.
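
The per-request rebuild can be illustrated with a minimal sketch. The `agency_factory` and `handle_chat` functions below are hypothetical stand-ins, not agency-swarm's actual handlers; the only point is that a factory called inside each request reads `os.environ` at call time, so a change made between two requests is visible to the second one.

```python
import os


def agency_factory() -> dict:
    """Hypothetical stand-in for the real factory: reads config at call time."""
    return {"model": os.environ.get("DEFAULT_MODEL", "unset")}


def handle_chat(message: str) -> str:
    """Hypothetical request handler that rebuilds the agency on every call."""
    agency = agency_factory()
    return f"[{agency['model']}] {message}"


# Two "requests" with an environment change in between.
os.environ["DEFAULT_MODEL"] = "ollama/llama3.1"
first = handle_chat("hello")

os.environ["DEFAULT_MODEL"] = "azure_ai/claude-opus-4-1"
second = handle_chat("hello")

print(first)
print(second)
```

Had the agency been built once at import time, the second call would still report the first model, which is the assumption #26 was written under.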

Stacked on #26

This branch is built on top of feat/azure-ollama-providers. Until #26 merges, the diff shown here includes both PRs' changes. The actual delta for this PR is 4 files / +98 / -63.

If #26 lands first, this PR's diff will collapse to just the runtime-switching fix.

What changed

orchestrator/tools/SwitchProvider.py: After the atomic .env rewrite, the tool now calls load_dotenv(override=True) so the running process's os.environ reflects the new values. The next request that rebuilds the agency reads them naturally.
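
As a rough illustration of "atomic rewrite, then reload", here is a stdlib-only sketch. It is not the tool's code: the PR uses python-dotenv (`set_key` and `load_dotenv(override=True)`), while `rewrite_env_atomically` and `reload_env` below are simplified, hypothetical stand-ins that ignore quoting and `export` handling.

```python
import os
import tempfile
from pathlib import Path


def rewrite_env_atomically(env_path: Path, key: str, value: str) -> None:
    """Replace KEY=value in env_path via a temp file plus os.replace."""
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    kept = [ln for ln in lines if not ln.startswith(f"{key}=")]
    kept.append(f"{key}={value}")
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=env_path.parent, prefix=".env.")
    with os.fdopen(fd, "w") as tmp:
        tmp.write("\n".join(kept) + "\n")
    os.replace(tmp_name, env_path)  # readers see the old file or the new one


def reload_env(env_path: Path) -> None:
    """Simplified stand-in for load_dotenv(override=True)."""
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, val = line.partition("=")
        os.environ[name.strip()] = val.strip()  # overwrite existing values


with tempfile.TemporaryDirectory() as workdir:
    env_file = Path(workdir) / ".env"
    env_file.write_text("DEFAULT_MODEL=ollama/llama3.1\nAZURE_AI_API_KEY=placeholder\n")
    os.environ["DEFAULT_MODEL"] = "ollama/llama3.1"

    rewrite_env_atomically(env_file, "DEFAULT_MODEL", "azure_ai/claude-opus-4-1")
    reload_env(env_file)

    switched_model = os.environ["DEFAULT_MODEL"]
    final_text = env_file.read_text()
```

The override matters: without it, a value already present in `os.environ` would win over the freshly written file and the switch would not take effect in-process.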

TUI restart flag → best-effort. The switch is now live in-process the moment run() returns; the flag is just a UX cue for the TUI to refresh its display. A failed flag touch is non-fatal — the tool reports success because the switch did apply. The previous "Cannot switch — no restart signal available" error path is gone; running outside the TUI loop (i.e. the FastAPI process) is now a supported context.
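
The best-effort behaviour might look roughly like the sketch below. `touch_flag_best_effort` and `finish_switch` are hypothetical names and the returned messages are invented for illustration; only the `OPENSWARM_SWITCH_FLAG` variable name comes from this PR's tests.

```python
import os
import tempfile
from pathlib import Path


def touch_flag_best_effort(flag_env_var: str = "OPENSWARM_SWITCH_FLAG") -> bool:
    """Touch the TUI restart flag if one is configured; never raise."""
    flag_path = os.environ.get(flag_env_var)
    if not flag_path:
        return False  # no flag configured, e.g. when running under FastAPI
    try:
        Path(flag_path).touch()
        return True
    except OSError:
        return False  # the flag is only a display cue, so failure is tolerated


def finish_switch(model: str) -> str:
    """Report success whether or not the flag could be touched."""
    flag_touched = touch_flag_best_effort()
    note = "TUI display will refresh" if flag_touched else "no TUI flag touched"
    return f"Switched to {model} ({note})"


# Case 1: no flag variable at all.
os.environ.pop("OPENSWARM_SWITCH_FLAG", None)
no_flag_result = finish_switch("ollama/llama3.1")

# Case 2: flag path inside a directory that is assumed not to exist.
missing = os.path.join(tempfile.gettempdir(), "openswarm-demo-missing-dir", "flag")
os.environ["OPENSWARM_SWITCH_FLAG"] = missing
bad_flag_result = finish_switch("ollama/llama3.1")
os.environ.pop("OPENSWARM_SWITCH_FLAG", None)
```

Both cases return a success message, which mirrors the two updated tests described under Tests.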

Documentation:

  • server.py header: removed the "switches stay pinned until restart" warning; describes how per-request rebuilds pick up changes naturally.
  • orchestrator/instructions.md: removed the "exit the TUI to apply" instruction. Switches are live immediately. TUI users only /quit if they want a fresh display.
  • SwitchProvider docstring: explains why FastAPI works (per-request rebuild) and why the flag is now best-effort.

Tests (36 → 38)

  • test_no_flag_env_var_still_succeeds: verifies the FastAPI-style context (no OPENSWARM_SWITCH_FLAG env var) gets a successful switch + .env write + os.environ refresh.
  • test_oserror_on_flag_touch_does_not_abort_switch: a failing flag touch returns success because the env reload already applied.
  • test_switch_refreshes_os_environ_for_fastapi_path: the core guarantee: os.environ["DEFAULT_MODEL"] reflects the switch after run() returns.
  • test_switch_refreshes_provider_credentials_in_environ: pre-existing .env credentials become visible in os.environ post-switch so the next agency build can authenticate.
  • Updated two prior tests ("no flag means refused" / "OSError aborts") to match the new behavior.
$ pytest
========================== 38 passed in 0.24s =============================

Test plan

  • All 38 unit tests pass on a bare Python install
  • Pending end-to-end verification: live FastAPI server + a real provider switch via the chat endpoint. Requires a running provider account; happy to coordinate with anyone who has one.
  • Manual trace through agency-swarm's endpoint_handlers.py to confirm agency_factory is per-request (verified at lines 457, 552, 825).

Backwards compatibility

No behavior change for the TUI surface — the restart flag is still touched on success and the loop in run_utils.main() works exactly as before. The TUI just stops being the only valid execution context for SwitchProvider. Existing tests for the TUI path continue to pass.

🤖 Generated with Claude Code

nyimbi and others added 3 commits May 9, 2026 14:59
…viders

Extends OpenSwarm's provider matrix from three (OpenAI / Anthropic / Google)
to seven, plus a runtime provider switch for the orchestrator.

New providers
- Azure OpenAI Service: your own gpt-* deployment on Azure (LiteLLM
  prefix `azure/`, env: AZURE_API_KEY + AZURE_API_BASE + AZURE_API_VERSION).
- Azure AI Foundry: catalog of non-OpenAI models on Azure including
  Anthropic Claude (Opus / Sonnet), Llama, Mistral, DeepSeek (LiteLLM
  prefix `azure_ai/`, env: AZURE_AI_API_KEY + AZURE_AI_API_BASE). For
  Anthropic models the base URL must end with `/anthropic`.
- Ollama (local): no key required, defaults to http://localhost:11434.
  OLLAMA_API_BASE is threaded explicitly into LitellmModel.
- OpenAI-compatible: generic route for any vendor with an OpenAI-shaped
  API — Ollama Cloud, Groq, Together AI, Mistral La Plateforme,
  OpenRouter, vLLM-based deployments. Uses dedicated OPENAI_COMPAT_*
  env vars so a real OPENAI_API_KEY kept for fallback is never
  overwritten. Only the base URL is required; key is optional for
  keyless local endpoints.

Provider routing
- Single source of truth: config.PROVIDER_REGISTRY maps slug to
  (prefix, required_env). Both the SwitchProvider tool and the
  onboarding wizard derive their behavior from this table.
- DEFAULT_MODEL=openai_compat/<model> is a sentinel that
  config._resolve() unwraps to LiteLLM's openai/<model> with the
  dedicated credentials passed via base_url and api_key.
- get_active_provider() classifies via longest-prefix-wins lookup
  (so azure_ai/ matches before azure/) and returns "unknown" for
  unrecognized litellm/<vendor>/<model> strings.
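
A longest-prefix-wins lookup can be sketched as follows. The table is illustrative, not the real PROVIDER_REGISTRY, and it assumes prefixes are compared without their trailing slash, which is the case where `azure` would otherwise shadow `azure_ai`.

```python
# Illustrative prefix table (assumption: stored without the trailing slash).
PREFIX_TO_SLUG = {
    "azure": "azure",
    "azure_ai": "azure_ai",
    "openai_compat": "openai_compat",
}


def classify(model: str, prefixes: dict = PREFIX_TO_SLUG) -> str:
    """Return the slug of the longest matching prefix, or "unknown"."""
    matches = [prefix for prefix in prefixes if model.startswith(prefix)]
    if not matches:
        return "unknown"
    return prefixes[max(matches, key=len)]


print(classify("azure_ai/claude-opus-4-1"))
print(classify("azure/my-gpt-deployment"))
print(classify("litellm/some-vendor/some-model"))
```

Picking the first match instead of the longest would make the result depend on dictionary order, which is the failure mode the longest-prefix rule avoids.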

Runtime switching
- New SwitchProvider tool in orchestrator/tools/, registered only on
  the orchestrator. Users say "switch to ollama llama3.1" or
  "/switch-provider azure_ai claude-opus-4-1"; the tool validates
  credentials, writes DEFAULT_MODEL to .env atomically, and signals
  run_utils.main() to rebuild the agency on next TUI exit. The
  orchestrator's "router only" contract is preserved with a single
  documented carve-out for this administrative concern.
- The FastAPI server (server.py) doesn't read the restart signal —
  switching from an API client is a documented no-op.
- Restart flag files live in a user-scoped tempdir (mode 0o700) so
  a co-tenant on /tmp can't force a spurious restart.
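
One way to get a user-scoped flag directory is `tempfile.mkdtemp`, which on POSIX creates the directory with mode 0o700. This is a sketch of the idea only; the directory prefix and flag file name are made up, and the tool may create its directory differently.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_private_flag_dir() -> Path:
    """Create a directory only the creating user can read, write, or search."""
    return Path(tempfile.mkdtemp(prefix="openswarm-flag-"))


flag_dir = make_private_flag_dir()
flag_file = flag_dir / "restart.flag"  # hypothetical file name
flag_file.touch()

dir_mode = stat.S_IMODE(os.stat(flag_dir).st_mode)
flag_exists = flag_file.exists()

shutil.rmtree(flag_dir)  # clean up the demo directory
```

Because other users cannot create or touch files inside a 0o700 directory, they cannot plant a flag that would trigger a restart.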

Hardening
- SSRF defense: SwitchProvider refuses any openai_compat switch where
  OPENAI_COMPAT_API_BASE isn't an https:// URL with a real hostname.
  Closes the prompt-injection chain where an attacker pre-positions
  the base URL and induces a switch, redirecting all subsequent LLM
  traffic (with bearer tokens and conversation history).
- Input validation: model field requires alphanumeric start + the
  characters real model names use ([\w.:/-]). Blocks newline
  injection into .env, shell metacharacters, and `..`-style ids.
- Atomic .env write: the restart flag is touched BEFORE the .env
  rewrite so a crash in any window leaves recoverable state. The
  rewrite uses set_key on a temp copy then os.replace to avoid
  partial-read exposure.
- config._resolve() raises RuntimeError when openai_compat is
  configured without the base URL, instead of returning a
  LitellmModel with None credentials that would fail cryptically at
  first call.
- The except clause in _resolve catches only ImportError;
  TypeError now propagates so misconfigured kwargs surface
  immediately rather than degrading to a bare model string.
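
The two input checks above could be approximated like this. It is a sketch under assumptions: "real hostname" is reduced to "non-empty hostname", the regex is a guess at the described character set, and both function names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Alphanumeric first character, then word characters plus . : / and -
MODEL_RE = re.compile(r"[A-Za-z0-9][\w.:/-]*", re.ASCII)


def is_safe_base_url(url: str) -> bool:
    """Accept only https:// URLs that carry a hostname."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.hostname)


def is_valid_model(name: str) -> bool:
    """Reject newlines, shell metacharacters, and '..' sequences."""
    return bool(MODEL_RE.fullmatch(name)) and ".." not in name


print(is_safe_base_url("https://api.example.com/v1"))
print(is_safe_base_url("http://api.example.com/v1"))
print(is_valid_model("llama3.1:8b"))
print(is_valid_model("x\nEVIL=1"))
```

`fullmatch` is used instead of `match` with `$` because `$` also matches just before a trailing newline, which would let a newline-terminated value through.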

Tests
- 36 pytest cases cover provider validation, SSRF guard,
  input validation, atomic write recovery, missing-credential
  errors, prefix classification (incl. longest-prefix-wins for
  azure_ai/ vs azure/), openai_compat unwrap to openai/<model>,
  RuntimeError on missing API_BASE, ImportError graceful degradation,
  TypeError propagation, dotenv quoting round-trips, OSError on flag
  touch refuses switch, and the wizard's PROVIDERS data shape contract.
- Test scaffolding stubs agency_swarm + openai.types.shared in
  sys.modules so the suite runs from a bare Python install with just
  pytest + python-dotenv + pydantic — no need for the full
  agency-swarm dependency chain.

Documentation
- README updated: 7-provider list, runtime switch description,
  upgrading-from-earlier-version section.
- AGENTS.md documents the orchestrator/tools/ convention and the
  PROVIDER_REGISTRY contract.
- orchestrator/instructions.md documents the administrative carve-out.
- .env.example documents every new env var with vendor URL examples
  for openai_compat (Groq, Together, Mistral, OpenRouter; Ollama
  Cloud points at https://docs.ollama.com since the canonical
  endpoint can change).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous PR (be205e9) documented FastAPI as a no-op for runtime
switching, on the assumption that the agency was constructed once at
startup. Reading agency-swarm's request handlers shows the agency is
actually rebuilt per-request — `agency_factory(load_threads_callback=...)`
is invoked inside each chat/run call (see endpoint_handlers.py:457, :552,
:825). All that's needed for FastAPI to pick up a switch is for
os.environ to reflect the new .env values before the next request.

Three small changes:

- SwitchProvider.run() now calls load_dotenv(override=True) on the
  freshly written .env after the atomic rewrite. This refreshes the
  running process's os.environ so the next agency build (whether
  driven by a FastAPI request or a TUI restart) sees the new
  DEFAULT_MODEL and credentials.
- The TUI restart flag becomes best-effort. The switch is already live
  in-process via env reload; the flag is now just a UX cue for the TUI
  to refresh its display state. A failed flag touch is non-fatal — we
  return success since the switch did apply.
- The previous "Cannot switch — no restart signal available" path is
  gone. Running outside the TUI loop is now a supported context, not
  an error.

Updated docs:
- server.py header: removed the "switching is a no-op" warning;
  describes how per-request rebuilds pick up the change.
- orchestrator/instructions.md: removed the "exit the TUI to apply"
  instruction. Switches are live immediately; TUI users only quit if
  they want a fresh display.
- SwitchProvider docstring: explains why FastAPI works (per-request
  rebuild) and why the flag is now best-effort.

Tests (36 → 38):
- test_no_flag_env_var_still_succeeds: verifies the FastAPI-style
  context (no flag env var) gets a successful switch + .env write +
  os.environ refresh.
- test_oserror_on_flag_touch_does_not_abort_switch: a failing flag
  touch returns success because the env reload already applied.
- test_switch_refreshes_os_environ_for_fastapi_path: the core
  guarantee — os.environ["DEFAULT_MODEL"] reflects the switch after
  run() returns.
- test_switch_refreshes_provider_credentials_in_environ: pre-existing
  .env credentials become visible in os.environ post-switch, so the
  next agency build can authenticate.
- The two tests asserting "no flag means refused" / "OSError aborts"
  were updated to match the new behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Four pytest cases that hit real provider endpoints when credentials are
in the environment, skipped cleanly otherwise. Marked `live` so the
default `pytest` invocation includes them but a CI / stub-only run can
exclude them with `pytest -m "not live"`.

Tests
- test_live_ollama_chat: real chat against a local Ollama server.
  Discovers an available model via /api/tags, skips if Ollama is
  unreachable or no models are pulled.
- test_live_azure_ai_foundry_claude: real chat against Azure-hosted
  Claude. Validates the /anthropic URL suffix path documented in
  PROVIDER_REGISTRY by sending a real prompt to a real Azure endpoint.
- test_live_azure_openai: real chat against an Azure OpenAI Service
  deployment. Skipped unless AZURE_OPENAI_DEPLOYMENT is set (since
  there's no canonical default for "your deployment").
- test_live_switch_provider_transition: starts on local Ollama,
  invokes SwitchProvider to change to Azure AI Foundry, verifies the
  TUI flag was touched, os.environ was refreshed in-process, and
  the very next live call reaches Claude on Azure with no restart.
  This is the end-to-end proof of the FastAPI runtime-switching
  guarantee from PR VRSEN#27.
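
Model discovery via /api/tags, as used by test_live_ollama_chat, could be sketched like this. The helper names are hypothetical; the sketch assumes the endpoint returns a JSON object with a `models` list whose entries carry a `name` field, and returns None when nothing usable is found so a caller can skip.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def pick_model(tags_payload: dict) -> Optional[str]:
    """Return the first model name from an /api/tags payload, or None."""
    models = tags_payload.get("models") or []
    return models[0]["name"] if models else None


def discover_ollama_model(
    base_url: str = "http://localhost:11434", timeout: float = 2.0
) -> Optional[str]:
    """Query a local Ollama server; None means unreachable or no models pulled."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return pick_model(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        return None


# Parsing can be exercised without a running server.
sample = {"models": [{"name": "llama3.1:8b"}, {"name": "mistral:latest"}]}
print(pick_model(sample))
print(pick_model({"models": []}))
```

A test would call `discover_ollama_model()` and invoke `pytest.skip` when it returns None.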

Credentials
- Read only from the shell environment; never written to source.
- Bridges OpenAI-SDK-style names (AZURE_OPENAI_API_KEY,
  ANTHROPIC_FOUNDRY_RESOURCE, OLLAMA_HOST) to OpenSwarm's LiteLLM-
  style names (AZURE_API_KEY, AZURE_AI_API_BASE, OLLAMA_API_BASE)
  inside the fixture so users with either convention can run.
- Each test uses pytest.skip with a clear reason when its credentials
  are absent, so missing keys never become test failures.
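
The name bridging might be sketched as below. The pairing of names follows the order in which they are listed above and is an assumption, as is copying values verbatim; the real fixture may transform them (for example, turning a resource name into a URL). `bridge_env` is a hypothetical helper.

```python
# Assumed pairing: SDK-style name -> LiteLLM-style name.
ALIASES = {
    "AZURE_OPENAI_API_KEY": "AZURE_API_KEY",
    "ANTHROPIC_FOUNDRY_RESOURCE": "AZURE_AI_API_BASE",
    "OLLAMA_HOST": "OLLAMA_API_BASE",
}


def bridge_env(environ: dict, aliases: dict = ALIASES) -> list:
    """Copy SDK-style values to LiteLLM-style names without overwriting."""
    bridged = []
    for source, target in aliases.items():
        if environ.get(source) and not environ.get(target):
            environ[target] = environ[source]
            bridged.append(target)
    return bridged


# Demo on a plain dict; a fixture would pass os.environ or use monkeypatch.
demo_env = {
    "OLLAMA_HOST": "http://localhost:11434",
    "AZURE_OPENAI_API_KEY": "sdk-style-key",
    "AZURE_API_KEY": "already-set",
}
bridged_names = bridge_env(demo_env)
print(bridged_names)
```

Existing LiteLLM-style values are left untouched, so a user who already follows OpenSwarm's convention is unaffected.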

Run
  pytest                    # all tests, live ones skip if no creds
  pytest -m live -v         # only live tests
  pytest -m "not live"      # only stub-friendly unit tests (CI)

Result on the author's machine (Azure AI Foundry + local Ollama):
  41 passed, 1 skipped (Azure OpenAI Service — no deployment in env)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

nyimbi commented May 9, 2026

Live verification — runtime switching works on a real provider transition

Ran tests/test_live_providers.py::test_live_switch_provider_transition against a local Ollama server + Azure AI Foundry. The test:

  1. Configures the process to talk to local Ollama
  2. Calls SwitchProvider(provider="azure_ai", model="claude-sonnet-4-6")
  3. Asserts:
    • .env was rewritten atomically
    • The TUI restart flag was touched
    • os.environ["DEFAULT_MODEL"] reflects the new value in-process (the FastAPI guarantee)
  4. Re-imports config to simulate a per-request agency rebuild
  5. Sends a real prompt — and gets a real response from Claude on Azure

All in one Python process, no restart, no supervisor, no file watcher. That's the end-to-end proof for this PR's claim about FastAPI compatibility.
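
Step 4 (re-importing config to simulate a rebuild) can be illustrated with `importlib.reload`. The `demo_config` module below is a throwaway written to a temp directory, not OpenSwarm's config module; it reads DEFAULT_MODEL at import time, so a reload is needed for it to see the new value.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as module_dir:
    # A tiny module that captures the environment when it is imported.
    Path(module_dir, "demo_config.py").write_text(
        "import os\nACTIVE_MODEL = os.environ.get('DEFAULT_MODEL', 'unset')\n"
    )
    sys.path.insert(0, module_dir)
    try:
        os.environ["DEFAULT_MODEL"] = "ollama/llama3.1"
        import demo_config

        before = demo_config.ACTIVE_MODEL

        # Simulate the switch: the environment changes, the module does not.
        os.environ["DEFAULT_MODEL"] = "azure_ai/claude-sonnet-4-6"
        stale = demo_config.ACTIVE_MODEL

        # Re-executing the module picks up the refreshed environment.
        importlib.reload(demo_config)
        after = demo_config.ACTIVE_MODEL
    finally:
        sys.path.remove(module_dir)
        sys.modules.pop("demo_config", None)

print(before, stale, after)
```

The `stale` value shows why refreshing `os.environ` alone is not enough for module-level constants: something has to re-read them, which on the FastAPI surface is the per-request agency rebuild.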

Full live suite results

tests/test_live_providers.py::test_live_ollama_chat                     PASSED
tests/test_live_providers.py::test_live_azure_ai_foundry_claude         PASSED
tests/test_live_providers.py::test_live_azure_openai                    SKIPPED (no AZURE_OPENAI_DEPLOYMENT)
tests/test_live_providers.py::test_live_switch_provider_transition      PASSED

3 passed, 1 skipped in 46.85s

The Azure OpenAI Service direct test (azure/<deployment>) is collected and ready: it is skipped until AZURE_OPENAI_DEPLOYMENT is set and runs once it is. It uses the same routing code path as azure_ai/; only the endpoint URL differs.

Combined unit + live suite

$ pytest
41 passed, 1 skipped in 24.66s

pytest -m "not live" excludes live tests for CI.
