feat: add agent catalog/auth API and safer orchestrator switching #276
nikhilachale wants to merge 14 commits into
- Implemented AgentsController to handle the /agents endpoint, returning a list of supported and installed agents.
- Created agent inventory service to manage agent data and detect installed agents.
- Updated ProjectSettingsForm to fetch and display agent information, including installed and supported agents.
- Enhanced error handling for agent detection and orchestrator restarts.
- Added tests for agent catalog and service to ensure correct functionality and error handling.
…flect changes
- Added `AuthStatus` method to various agent plugins to check authorization status using CLI probes.
- Introduced `authprobe` package to handle common CLI command checks for agent authorization.
- Updated backend tests to include scenarios for authorized and unauthorized agents.
- Modified frontend API schema to include `authorized` counts and `authStatus` for agents.
- Enhanced `ProjectSettingsForm` to display authorized agents and their statuses, including prompts for login when necessary.
- Adjusted agent selection logic to prioritize authorized agents and provide feedback for unauthorized or uninstalled agents.
…references from API and frontend
return nil
}
sort.Sort(sort.Reverse(sort.StringSlice(matches)))
return matches
`nvmNodeBinCandidates` is missing its closing brace: the function body never terminates before `func resolveNativeWindowsCodex(...)` starts on the next line, which nests a named function declaration inside another function. This is not valid Go, and the package will not compile as currently pushed:
```go
func nvmNodeBinCandidates(home, binary string) []string {
	matches, err := filepath.Glob(filepath.Join(home, ".nvm", "versions", "node", "*", "bin", binary))
	if err != nil || len(matches) == 0 {
		return nil
	}
	sort.Sort(sort.Reverse(sort.StringSlice(matches)))
	return matches
func resolveNativeWindowsCodex(path string) string { // <-- missing } above
```
It needs a `}` after `return matches` to close `nvmNodeBinCandidates` before `resolveNativeWindowsCodex` starts. Given that `TestResolveCodexBinaryFindsNVMInstallWhenPathIsSparse` was added in this same PR, this file couldn't have been built or tested in its current state. It is worth double-checking that the push matches what was actually tested locally.
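For reference, a minimal compilable sketch of the helper with the brace restored. The function body is copied from the snippet above; the `main` wrapper and the `/home/example` path are only illustrative assumptions, and `resolveNativeWindowsCodex` is left out because its body is not shown here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// nvmNodeBinCandidates lists <home>/.nvm/versions/node/*/bin/<binary>
// in descending lexical order.
func nvmNodeBinCandidates(home, binary string) []string {
	matches, err := filepath.Glob(filepath.Join(home, ".nvm", "versions", "node", "*", "bin", binary))
	if err != nil || len(matches) == 0 {
		return nil
	}
	sort.Sort(sort.Reverse(sort.StringSlice(matches)))
	return matches
} // closes nvmNodeBinCandidates, so the next declaration is top-level again

func main() {
	// Hypothetical home directory, for illustration only.
	fmt.Println(nvmNodeBinCandidates("/home/example", "codex"))
}
```

One thing that may be worth a separate look: the reverse sort is lexical, so a directory named `v9.x` would sort ahead of `v20.x`. Whether that matters depends on how the caller picks from the candidates.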
if err != nil {
m.logger.Warn("session manager: old orchestrator probe failed after runtime destroy",
"session", id, "err", err)
} else if alive {
If `IsAlive` still reports true after `recordRetiredTermination` has already succeeded, the loop retries `Destroy`/`recordRetiredTermination` and, on final failure, returns `ErrSessionStillAlive`, but the DB already says terminated. Since `List(Active: true)` won't surface a terminated row, nothing will ever retry killing this runtime: a zombie orchestrator can leak silently while the system believes it is fully retired. Don't re-record termination once it has succeeded; surface a distinct "terminated but still alive" signal so recovery tooling can retry the kill.
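A rough sketch of the shape being suggested, not the PR's actual code. Everything here is hypothetical: the `sessionRuntime` interface, the `retire` helper, the two sentinel errors, and the Destroy, record, probe ordering, which is assumed from the description above. The point is only that termination is recorded at most once and that the "recorded but still alive" outcome gets its own error.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels. ErrTerminatedButAlive means the store already says
// "terminated" while the runtime still answers the liveness probe.
var (
	ErrTerminatedButAlive = errors.New("session recorded as terminated but runtime is still alive")
	ErrRetireFailed       = errors.New("session could not be retired")
)

// sessionRuntime is a hypothetical stand-in for the runtime the manager drives.
type sessionRuntime interface {
	Destroy(id string) error
	IsAlive(id string) (bool, error)
}

// retire retries the kill up to attempts times but records termination at most once.
func retire(rt sessionRuntime, record func(id string) error, id string, attempts int) error {
	recorded := false
	for i := 0; i < attempts; i++ {
		if err := rt.Destroy(id); err != nil {
			continue // kill failed; try again on the next attempt
		}
		if !recorded {
			if err := record(id); err != nil {
				continue
			}
			recorded = true // never re-record once this has succeeded
		}
		if alive, err := rt.IsAlive(id); err == nil && !alive {
			return nil // destroyed, recorded, and confirmed dead
		}
	}
	if recorded {
		return ErrTerminatedButAlive // recovery tooling can key off this
	}
	return ErrRetireFailed
}

// stubbornRuntime is a test double that stays alive until it has been
// destroyed killsNeeded times.
type stubbornRuntime struct{ killsNeeded, destroys int }

func (r *stubbornRuntime) Destroy(string) error { r.destroys++; return nil }
func (r *stubbornRuntime) IsAlive(string) (bool, error) {
	return r.destroys < r.killsNeeded, nil
}

func main() {
	records := 0
	record := func(string) error { records++; return nil }
	err := retire(&stubbornRuntime{killsNeeded: 5}, record, "orch-old", 3)
	fmt.Println(errors.Is(err, ErrTerminatedButAlive), records) // prints: true 1
}
```

With a distinct error like this, a recovery pass could look for terminated rows whose runtime is still alive and retry only the kill, without touching the stored state again.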
Please resolve merge conflicts and test the flow end to end locally as well.
- Updated NewTaskDialog tests to increase timeout for async operations.
- Modified ProjectSettingsForm tests to improve agent handling and validation messages.
- Refactored ProjectSettingsForm component to streamline agent selection and validation logic.
- Introduced new agent service to manage agent inventory and authentication status.
- Improved Sidebar tests to ensure proper agent options are loaded and handled.
- Enhanced SessionsBoard component by removing unused imports and optimizing state management.
- Fixed Select component styling for better consistency in UI.
- Added error handling for AO daemon readiness in ShellLayout.
other than that
…in session manager
nit: can we keep the file name `service_test`, consistent with the other PRs?
// then retire any older active orchestrators for that project so a failed
// replacement never causes downtime. This business rule belongs here, not in
// the HTTP controller.
func (s *Service) SpawnOrchestrator(ctx context.Context, projectID domain.ProjectID, clean bool) (domain.Session, error) {
When the new orchestrator is spawning, what does the user experience look like?
Summary
This PR adds a daemon-backed agent catalog, exposes installed/authorized agent state to the frontend, and uses that data in project settings so users can choose worker/orchestrator agents more safely.
It also adds orchestrator replacement handling: when the saved orchestrator agent changes, AO starts the replacement first and only retires the previous orchestrator after the new one is up, so a failed replacement does not cause downtime.
A key caveat is that agent-auth/login flows can interfere with replacement startup. If switching agents triggers the agent’s own bootstrap path, the replacement may come up outside AO’s normal orchestrator initialization path and miss the AO orchestrator system prompt.
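The start-first, retire-later ordering described in this summary can be sketched roughly as follows. This is an illustration under assumptions, not the PR's `SpawnOrchestrator`: the `session` type, the `replaceOrchestrator` helper, its callbacks, and `errNotLoggedIn` are all hypothetical, and the system-prompt caveat is not modeled. `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// session is a hypothetical, trimmed-down stand-in for the real session type.
type session struct{ ID, Agent string }

// errNotLoggedIn is a hypothetical spawn failure used in the demo below.
var errNotLoggedIn = errors.New("agent not logged in")

// replaceOrchestrator starts the replacement first and touches the old
// orchestrators only after the replacement is up. It returns the sessions
// that should still be treated as active.
func replaceOrchestrator(
	old []session,
	spawn func() (session, error),
	retire func(session) error,
) ([]session, error) {
	next, err := spawn()
	if err != nil {
		// Replacement failed: every old orchestrator is left running.
		return old, fmt.Errorf("spawn replacement: %w", err)
	}
	active := []session{next}
	var retireErrs []error
	for _, s := range old {
		if err := retire(s); err != nil {
			// Could not retire: keep it visible instead of forgetting it.
			active = append(active, s)
			retireErrs = append(retireErrs, fmt.Errorf("retire %s: %w", s.ID, err))
		}
	}
	return active, errors.Join(retireErrs...) // nil when every retire succeeded
}

func main() {
	old := []session{{ID: "orch-1", Agent: "agent-a"}}
	failing := func() (session, error) { return session{}, errNotLoggedIn }
	active, err := replaceOrchestrator(old, failing, func(session) error { return nil })
	fmt.Println(len(active), active[0].ID, err != nil) // prints: 1 orch-1 true
}
```

Keeping an unretired session in the returned slice is a design choice of this sketch: it avoids the situation where an old orchestrator keeps running but no longer appears anywhere.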
What Changed
Backend
Added agent inventory service for:
Added optional AgentAuthChecker capability on adapters.
Added shared CLI auth probing helper for adapters with cheap local auth checks.
Added GET /api/v1/agents.
Extended registry inventory entries to carry adapter manifest metadata for user-facing labels.
Added orchestrator replacement flow in the session service:
Added backend tests for agent catalog, controller responses, session replacement behavior, and related project/service wiring.
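To illustrate the optional-capability idea from the backend list, here is a small sketch. Only the names `AgentAuthChecker` and `AuthStatus` come from this PR; the method signature, the status values, the `Adapter` interface, the `probeCLI` and `snapshot` helpers, the 3-second timeout, and the `example-agent-cli auth status` command are assumptions made for the example. It also assumes that exit code 0 from the probe command means "logged in", which may not hold for every agent CLI.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// AuthStatus values are hypothetical; the real DTO may use different names.
type AuthStatus string

const (
	AuthAuthorized   AuthStatus = "authorized"
	AuthUnauthorized AuthStatus = "unauthorized"
	AuthUnknown      AuthStatus = "unknown"
)

// Adapter is a hypothetical minimal adapter interface.
type Adapter interface{ Name() string }

// AgentAuthChecker is the optional capability; adapters may or may not implement it.
type AgentAuthChecker interface {
	AuthStatus(ctx context.Context) AuthStatus
}

// probeCLI runs a cheap local command with a short timeout.
func probeCLI(ctx context.Context, bin string, args ...string) AuthStatus {
	ctx, cancel := context.WithTimeout(ctx, 3*time.Second)
	defer cancel()
	err := exec.CommandContext(ctx, bin, args...).Run()
	var exitErr *exec.ExitError
	switch {
	case err == nil:
		return AuthAuthorized
	case ctx.Err() == nil && errors.As(err, &exitErr):
		return AuthUnauthorized // the CLI ran and exited non-zero
	default:
		return AuthUnknown // missing binary, timeout, or another failure
	}
}

// cliAdapter implements the optional capability via a CLI probe.
type cliAdapter struct {
	name, bin string
	args      []string
}

func (a cliAdapter) Name() string { return a.name }
func (a cliAdapter) AuthStatus(ctx context.Context) AuthStatus {
	return probeCLI(ctx, a.bin, a.args...)
}

// plainAdapter has no auth capability at all.
type plainAdapter struct{ name string }

func (a plainAdapter) Name() string { return a.name }

// snapshot reports a status per adapter, falling back to "unknown" for
// adapters that do not implement AgentAuthChecker.
func snapshot(adapters ...Adapter) map[string]AuthStatus {
	ctx := context.Background()
	out := make(map[string]AuthStatus, len(adapters))
	for _, a := range adapters {
		status := AuthUnknown
		if checker, ok := a.(AgentAuthChecker); ok {
			status = checker.AuthStatus(ctx)
		}
		out[a.Name()] = status
	}
	return out
}

func main() {
	fmt.Println(snapshot(
		plainAdapter{name: "no-probe"},
		cliAdapter{name: "example", bin: "example-agent-cli", args: []string{"auth", "status"}},
	))
}
```

The type assertion keeps the capability optional: adapters without a cheap local check simply report an unknown status instead of being forced to implement a probe.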
Frontend
Regenerated API types for the new agents endpoint/DTOs.
Updated ProjectSettingsForm to:
Added/updated tests for the new settings behavior.
Why
Before this change, the UI did not have a daemon-backed view of which agents are actually installed and authenticated on the local machine, and changing the orchestrator agent config did not have a clear replacement flow.
This PR makes agent selection more grounded in local runtime state and reduces the chance of downtime during orchestrator switches.
Risks / Caveats
A large part of the file count comes from:
The CLI/runtime/session model is unchanged outside the new inventory/auth and orchestrator replacement paths.
Generated files are included intentionally:
Auth/login flows remain a review risk. If switching agents triggers the agent’s own login/bootstrap flow, that flow can spawn a fresh native session outside AO’s normal orchestrator startup path.
In that case, the replacement session may miss AO’s expected initialization, including the orchestrator system prompt.
The old orchestrator is intentionally preserved on replacement failure, but reviewers should pay close attention to whether replacement startup still guarantees AO system-prompt delivery.
Closes #275