Add Session API Design by markturansky · Pull Request #1 · markturansky/agent-boss

markturansky · 2026-03-06T14:26:35Z

Summary

Proposes a unified Session API that abstracts over different session providers (tmux, Paude, Ambient) while maintaining existing Agent Boss coordination capabilities.

Key Features

🎯 SessionProvider Interface: Standardized create/start/stop/delete/message operations
📋 Unified Session Model: Provider-agnostic status, configuration, and metadata
🔗 REST API: Complete HTTP endpoints that integrate with existing architecture
🔄 Backward Compatibility: Existing agent coordination unchanged
⚙️ Multi-Provider Support: tmux (local), Paude (containers), Ambient (cloud)
📝 Declarative Configuration: Space-specific overrides and agent templates

Implementation Plan

Phase 1: Core abstraction + TmuxProvider
Phase 2: PaudeProvider integration
Phase 3: Enhanced management + monitoring
Phase 4: AmbientProvider + cloud sessions

Benefits

Unified Management: Same API across local/container/cloud
Provider Choice: Optimal session type per use case
Clean Architecture: Session management separated from coordination
Future-Proof: Easy to add new providers (K8s, Lambda, etc.)

This design enables Agent Boss to scale from local development to production cloud deployment while keeping the coordination model that makes it effective.

🤖 Generated with Claude Code

Proposes unified session abstraction over tmux, Paude, and Ambient providers. Key features: - SessionProvider interface for create/start/stop/delete/message operations - REST API endpoints for session management - Provider-agnostic Session model with unified status/config - Backward compatibility with existing agent coordination - Configuration system with space-specific overrides - 4-phase implementation plan Enables consistent session management across local, container, and cloud environments while maintaining Agent Boss coordination capabilities. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

markturansky · 2026-04-04T03:12:01Z

Thanks for this PR! 🤖 Comment left by github-agent.

markturansky · 2026-04-04T03:12:08Z

Another comment from github-agent! 🚀

markturansky · 2026-04-04T03:16:14Z

Synchronous comment posted! Testing notification flow. 🔔

markturansky

Thank you for this thoughtful and comprehensive design document! The Session API proposal is well-structured and addresses a clear need for provider abstraction. I've reviewed the design thoroughly and have some feedback that I hope will be helpful.

Strengths

Excellent architecture decisions:

The provider abstraction is clean and well-designed, with clear separation of concerns
Backward compatibility strategy is well thought out
The phased implementation plan is realistic and incremental
Great use of examples and code snippets throughout

Comprehensive coverage:

All three provider types (tmux, Paude, Ambient) are well documented
REST API design follows good conventions
Security considerations are addressed

Design Feedback

1. Provider Interface & Error Handling

The SessionProvider interface looks solid, but consider adding explicit error types for better handling:

type SessionError struct {
    Code    ErrorCode
    Message string
    Cause   error
}

type ErrorCode string
const (
    ErrSessionNotFound    ErrorCode = "session_not_found"
    ErrSessionAlreadyExists ErrorCode = "session_already_exists"
    ErrProviderUnavailable ErrorCode = "provider_unavailable"
    ErrInvalidConfiguration ErrorCode = "invalid_configuration"
)

This would help API clients handle different failure modes appropriately and make provider implementations more consistent.

2. Session Lifecycle State Transitions

The SessionStatus enum is good, but the document doesn't specify valid state transitions. Consider documenting:

Can a stopped session be restarted, or must it be deleted and recreated?
What happens if a delete is called on a running session—does it auto-stop first?
How do you handle a session that crashes (transitions from running to error)?

A state transition diagram would be valuable here.

3. API Design Considerations

Endpoint consistency: You have both /api/v1/sessions/{id} and /spaces/{space}/sessions/{name} endpoints. Consider:

Should the space-scoped endpoints be under /api/v1/spaces/{space}/sessions for consistency?
The dual addressing (by id vs by space/name) is useful but could be confusing. Maybe clarify when to use which.

Missing endpoints:

Batch operations (start/stop multiple sessions)
Session template CRUD endpoints (based on your config examples)
Provider health/capability discovery (GET /api/v1/providers)

4. Messaging Model

The SendMessage operation is interesting but underdeveloped:

What's the message format? Just plain strings, or structured data?
How do responses work? Is messaging one-way or request/response?
Should there be a message queue or pub/sub model for reliability?
How do you handle message delivery failures across different providers?

Consider whether the coordination layer (existing broadcast functionality) should be separate from the provider-specific messaging.

5. Resource Limits & Validation

The ResourceLimits are provider-specific but presented as universal. Consider:

How does tmux handle CPU/memory limits? (It can't natively)
Should there be provider capability negotiation?
Validation: what happens if you request 100 CPUs on a machine with 4?

Maybe add a ValidateConfig(config SessionConfig) error method to the provider interface.

6. Configuration Complexity

The configuration model is very flexible but potentially complex:

Multiple layers of defaults (provider → space → agent)
How are conflicts resolved? (e.g., space says tmux, agent template says paude)
Consider adding examples of the final resolved configuration

The configuration merging/precedence rules should be explicitly documented.

7. Testing Strategy

The document doesn't address testing. Consider adding a section on:

Mock provider implementation for testing coordination logic
Integration tests for each provider
Contract tests to ensure all providers implement the interface correctly
How to test provider-specific features vs common features

8. Observability & Monitoring

Great start with GetLogs and IsHealthy, but consider:

Metrics: session count by status/provider, session duration, failure rates
Tracing: distributed tracing across session creation → message delivery
Events: session lifecycle events for audit logs
Should there be a webhook/notification system for session status changes?

9. Provider Implementation Details

Tmux:

How do you prevent session name collisions?
What happens if a tmux session exists but Agent Boss doesn't know about it?
Session persistence across Agent Boss restarts—how is state recovered?

Paude:

Container cleanup on failures?
How do you handle container image pulls (slow/failing)?
Volume mount conflicts?

Ambient:

API rate limiting and retry logic?
Credential rotation?
How to handle API version changes?

10. Migration & Rollback

The migration strategy is mentioned but light on details:

How do you migrate existing tmux sessions to the new API without disruption?
What's the rollback plan if Phase 2 or 3 has issues?
Can old and new systems run simultaneously during migration?

Minor Issues

Line 52: SessionConfig.Image - should this be ImageURI to be more specific?
Line 96: SupportedFeatures() []string - consider returning a structured feature set rather than strings
Line 221: The tmux Start comment says "Session starts on creation" but the interface has separate Create and Start methods—clarify this impedance mismatch
Line 386: Agent templates shown but not explained how they're stored/loaded

Questions

How do you handle provider-specific features that don't fit the common interface (e.g., tmux window splitting, Ambient's advanced networking)?
Is there a plan for session affinity or scheduling (e.g., "run this agent on the same host as that one")?
How does this interact with the existing agent registration system? Do agents need to be aware of sessions?
What's the expected session lifetime? Minutes? Hours? Days? This affects cleanup strategies.

Recommendations

Add a testing section to the implementation plan
Document state transitions and lifecycle more explicitly
Consider adding a provider registry/discovery mechanism
Add error handling and retry strategy documentation
Include failure mode analysis (what happens when a provider becomes unavailable?)
Add metrics/observability to the design

Summary

This is a solid foundation for a session abstraction layer. The design is well-structured and the problem is clearly understood. The main areas for enhancement are:

More detailed error handling and state management
Explicit testing strategy
Provider capability negotiation
Configuration precedence rules
Observability and operational concerns

I'm excited to see this implemented! Please let me know if you'd like me to elaborate on any of these points or if you have questions about the feedback.

markturansky commented Apr 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Session API Design#1

Add Session API Design#1
markturansky wants to merge 1 commit intomainfrom
feat/session-api-design

markturansky commented Mar 6, 2026

Uh oh!

markturansky commented Apr 4, 2026

Uh oh!

markturansky commented Apr 4, 2026

Uh oh!

markturansky commented Apr 4, 2026

Uh oh!

markturansky left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

markturansky commented Mar 6, 2026

Summary

Key Features

Implementation Plan

Benefits

Uh oh!

markturansky commented Apr 4, 2026

Uh oh!

markturansky commented Apr 4, 2026

Uh oh!

markturansky commented Apr 4, 2026

Uh oh!

markturansky left a comment

Choose a reason for hiding this comment

Strengths

Design Feedback

1. Provider Interface & Error Handling

2. Session Lifecycle State Transitions

3. API Design Considerations

4. Messaging Model

5. Resource Limits & Validation

6. Configuration Complexity

7. Testing Strategy

8. Observability & Monitoring

9. Provider Implementation Details

10. Migration & Rollback

Minor Issues

Questions

Recommendations

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant