feat: add agent lifecycle management - provisioning to decommission #923
… registry-authoritative trust

MSRC [112466]: PluginInstaller._resolve_dependencies() hardcoded verify=False, allowing unsigned malicious dependencies to bypass Ed25519 signature verification even when the caller explicitly passed verify=True. Fix propagates the verify flag through the entire dependency chain and hardens install() to fail closed (reject unsigned plugins and untrusted authors when verify=True).

S360 WI 1140377: TrustHandshake._verify_response() used self-reported trust scores and capabilities from the handshake response instead of the registry's authoritative values. A registered agent could inflate its trust score or claim capabilities it does not possess. Fix uses registry-authoritative trust_score and capabilities for all threshold checks and result construction. Also adds DID binding enforcement to prevent response DID substitution attacks.

Affected files:
- packages/agent-marketplace/src/agent_marketplace/installer.py
- packages/agent-mesh/src/agentmesh/marketplace/_marketplace_impl.py
- packages/agent-mesh/src/agentmesh/trust/handshake.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…add contribution quality gate

_marketplace_impl.py: Revert fail-closed verification to original behavior (verify only when trusted_keys AND signature are present). The MSRC [112466] bug only existed in agent-marketplace/installer.py; the _marketplace_impl.py already correctly passes verify=True in _resolve_dependencies. The fail-closed hardening broke 5 existing tests that install unsigned plugins with no trusted_keys configured.

copilot-instructions.md: Add External Contribution Quality Gate section to filter low-quality integration proposals from unknown/low-traction projects.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…icrosoft#921)

New package addressing the #1 market gap: shadow AI agent discovery.
- 3 scanners (process, config, GitHub), inventory with dedup, reconciler, risk scoring, CLI
- 52 tests passing, full docs + tutorial 29
- Security-first: read-only, secret redaction, no content storage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses gap #2: enterprises need birth-to-death management of agent identities with automated credential rotation and orphan detection.

New module: packages/agent-mesh/src/agentmesh/lifecycle/

Core capabilities:
- ManagedAgent model with lifecycle state machine (8 states)
- LifecycleManager: request, approve/reject, activate, suspend, resume, decommission
- LifecyclePolicy: configurable approval workflows, heartbeat intervals, credential TTLs
- CredentialRotator: automatic short-lived credential rotation with overlap periods
- OrphanDetector: find silent agents (missed heartbeats), unowned agents, stale agents
- Full audit trail: every state transition recorded with actor, timestamp, details
- State machine enforcement: invalid transitions are rejected
- Persistence: JSON-backed storage for lifecycle state

Also includes:
- Tutorial 30 (agent lifecycle management) with state machine diagram
- Updated tutorials README with discovery & lifecycle learning path
- 24 tests passing (manager, credentials, orphan detection, persistence)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🤖 AI Agent: docs-sync-checker - Issues Found
📝 Documentation Sync Report
Issues Found
Suggestions
Please address these issues to ensure the documentation is complete and up-to-date. Let me know if you need further clarification or assistance.
🤖 AI Agent: breaking-change-detector - Summary
🔍 API Compatibility Report
Summary
This pull request introduces new functionality related to agent lifecycle management and discovery. It includes new packages (
Findings
Migration Guide
✅ No migration steps are required as no breaking changes were detected.
Notes
If you have further questions or need additional analysis, feel free to ask!
🤖 AI Agent: code-reviewer
Pull Request Review: Agent Lifecycle Management and Discovery
This pull request introduces significant new functionality for agent lifecycle management (provisioning to decommissioning) and agent discovery. Below is a detailed review of the changes, focusing on the specified areas of concern.
🔴 CRITICAL: Security Issues
- Credential Rotation and Revocation
  - The `CredentialPolicy` includes `auto_rotate` and `revoke_on_decommission`. However, there is no explicit verification that credentials are securely revoked from all systems where they might have been distributed. If an agent's credentials are cached or not properly invalidated, this could lead to unauthorized access.
  - Actionable Fix: Ensure that credential revocation propagates to all systems where the credentials might be in use. Add tests to verify that revoked credentials cannot be used post-decommission.
- Orphan Detection
  - The `OrphanDetector` relies on heartbeats to identify orphaned agents. However, there is no mention of cryptographic verification of these heartbeats. An attacker could spoof heartbeats to prevent an agent from being marked as orphaned.
  - Actionable Fix: Ensure all heartbeats are signed using the agent's private key and verified against its public key (e.g., SPIFFE/SVID). Add tests to validate this behavior.
- Shadow Agent Risk Scoring
  - The `RiskScorer` assigns points for missing identity (e.g., no DID/SPIFFE). However, there is no mechanism to verify the authenticity of the discovered agents' identities. This could lead to false negatives if an attacker spoofs a valid identity.
  - Actionable Fix: Add a mechanism to verify the authenticity of discovered identities, such as checking against a trusted certificate authority or registry.
- Process Scanner Redaction
  - The `ProcessScanner` claims to redact sensitive information (e.g., API keys, tokens, JWTs) from command-line arguments. However, there is no evidence of robust testing for edge cases (e.g., obfuscated or encoded secrets).
  - Actionable Fix: Add comprehensive tests to ensure all sensitive information is reliably detected and redacted, including edge cases like base64-encoded secrets.
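To make the Process Scanner Redaction point concrete, here is a minimal sketch of a redactor that also inspects base64-encoded arguments, with the kind of edge case a test would cover. The function name `redact`, the regular expressions, and the `[REDACTED]` marker are all hypothetical; the actual `ProcessScanner` rules are not shown in this PR, so this should be read as an illustration of the test idea rather than the package's behavior.

```python
import base64
import binascii
import re

REDACTED = "[REDACTED]"

# Hypothetical patterns for illustration; not the ProcessScanner's real rules.
_SECRET_PATTERNS = [
    # key=value style secrets: keep the key, mask the value
    re.compile(r"(?i)((?:api[_-]?key|token|secret|password)[=:\s]+)(\S+)"),
    # JWT-shaped strings (three base64url segments starting with "eyJ")
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\b"),
]
# Long base64-looking runs that might hide an encoded secret
_B64_CANDIDATE = re.compile(r"\b[A-Za-z0-9+/]{16,}={0,2}")


def _redact_plain(text):
    """Mask secrets that appear in clear text."""
    for pattern in _SECRET_PATTERNS:
        if pattern.groups:
            text = pattern.sub(lambda m: m.group(1) + REDACTED, text)
        else:
            text = pattern.sub(REDACTED, text)
    return text


def redact(cmdline):
    """Mask clear-text secrets, then base64 blobs whose decoded form holds one."""

    def _check_b64(match):
        blob = match.group(0)
        try:
            decoded = base64.b64decode(blob, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            return blob  # not valid base64 text, leave untouched
        # Redact the whole blob if its decoded content contains a secret
        return REDACTED if _redact_plain(decoded) != decoded else blob

    return _B64_CANDIDATE.sub(_check_b64, _redact_plain(cmdline))


if __name__ == "__main__":
    encoded = base64.b64encode(b"token=abc123secretvalue").decode()
    print(redact("python agent.py --api-key=sk-12345 --verbose"))
    print(redact(f"run --config {encoded}"))
```

A sketch like this only catches a single layer of standard base64; double-encoded, hex-encoded, or split secrets would need their own cases, which is the kind of gap the requested tests are meant to surface.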
🟡 WARNING: Potential Breaking Changes
- Lifecycle Manager API
  - The `LifecycleManager` introduces new methods (`request_provisioning`, `approve`, `activate`, etc.) that may not be backward-compatible with existing agent management workflows.
  - Actionable Fix: Clearly document these changes in the release notes and provide migration guides for users of the previous API.
- Agent Discovery CLI
  - The new CLI commands (`agent-discovery scan`, `inventory`, `reconcile`) introduce a new interface for users. This could lead to confusion if not properly documented.
  - Actionable Fix: Ensure the CLI commands are thoroughly documented, and consider providing examples for common use cases.
💡 Suggestions for Improvement
- Thread Safety
  - The `LifecycleManager` and `AgentInventory` classes appear to use file-based storage for state persistence. If these classes are used in a multi-threaded or multi-process environment, there could be race conditions.
  - Suggestion: Use file locks or a thread-safe database (e.g., SQLite) to ensure consistent state updates.
- Type Safety
  - The `LifecyclePolicy` and `CredentialPolicy` classes use type annotations, but there is no evidence of runtime validation (e.g., Pydantic models).
  - Suggestion: Use Pydantic models for these classes to enforce type safety and validate input at runtime.
- Audit Trail Immutability
  - The audit trail is described as "immutably recorded," but there is no mention of how immutability is enforced (e.g., append-only logs, cryptographic signatures).
  - Suggestion: Use a tamper-evident mechanism (e.g., Merkle trees or blockchain) to ensure the integrity of the audit trail.
- Dependency Risk
  - The `agent-discovery` package introduces new dependencies (e.g., for GitHub scanning). These dependencies could introduce supply chain risks.
  - Suggestion: Perform a security audit of the new dependencies and consider making them optional.
- Backward Compatibility
  - The addition of new tutorials and CLI commands is a positive step, but it may overwhelm new users.
  - Suggestion: Provide a "Getting Started" guide that focuses on the most common use cases.
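As an illustration of the Audit Trail Immutability suggestion, a hash chain is one lightweight tamper-evident structure, simpler than the Merkle tree named above. The sketch below is an assumption about how it could look, not the PR's implementation: `append_event`, `verify_log`, and the event fields are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, event):
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_log(log):
    """Return True only if every entry still matches its recorded hash and link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_event(log, {"agent": "agent-1", "actor": "alice", "transition": "approve"})
    append_event(log, {"agent": "agent-1", "actor": "alice", "transition": "activate"})
    print(verify_log(log))            # chain is intact
    log[0]["event"]["actor"] = "mallory"
    print(verify_log(log))            # edit to an earlier entry is detected
```

A chain like this detects edits to earlier entries, but someone who can rewrite the whole file can also recompute every hash. The latest hash would need to be signed or anchored somewhere the writer cannot modify for the guarantee to hold.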
✅ Positive Aspects
- Comprehensive Documentation
  - The new tutorials (29 and 30) are well-written and provide clear, step-by-step instructions for using the new features.
- Test Coverage
  - The pull request mentions 24 new tests, which is a good indicator of quality. However, it would be helpful to see a summary of the test coverage.
- Modular Design
  - The `agent-discovery` package uses a plugin architecture for scanners, making it easy to extend.
- OWASP Agentic Top 10 Compliance
  - The focus on credential rotation, orphan detection, and audit trails aligns well with OWASP's recommendations for secure agent management.
Summary of Recommendations
Critical Fixes
- Implement robust credential revocation mechanisms.
- Cryptographically verify heartbeats to prevent spoofing.
- Authenticate discovered agent identities to prevent false negatives.
- Add comprehensive tests for redacting sensitive information in the `ProcessScanner`.
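For the heartbeat-spoofing item in the list above, the following sketch shows the shape of a verified heartbeat with a freshness window. It deliberately swaps in HMAC-SHA256 with a shared key from the standard library, not the asymmetric signature (agent private key, SPIFFE/SVID) that the review recommends, so that the example stays self-contained. `sign_heartbeat`, `verify_heartbeat`, and `MAX_SKEW_SECONDS` are hypothetical names that do not come from the PR.

```python
import hashlib
import hmac
import json

MAX_SKEW_SECONDS = 30  # hypothetical freshness window to limit replay


def _message(agent_id, timestamp):
    """Canonical byte encoding of the fields covered by the MAC."""
    return json.dumps({"agent_id": agent_id, "ts": timestamp}, sort_keys=True).encode("utf-8")


def sign_heartbeat(agent_id, timestamp, key):
    """Build a heartbeat carrying an HMAC over the agent id and timestamp."""
    sig = hmac.new(key, _message(agent_id, timestamp), hashlib.sha256).hexdigest()
    return {"agent_id": agent_id, "ts": timestamp, "sig": sig}


def verify_heartbeat(heartbeat, key, now):
    """Accept only heartbeats with a valid MAC and a recent timestamp."""
    expected = hmac.new(
        key, _message(heartbeat["agent_id"], heartbeat["ts"]), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched
    if not hmac.compare_digest(expected, heartbeat.get("sig", "")):
        return False
    return abs(now - heartbeat["ts"]) <= MAX_SKEW_SECONDS


if __name__ == "__main__":
    hb = sign_heartbeat("agent-1", 1000.0, b"shared-key")
    print(verify_heartbeat(hb, b"shared-key", now=1010.0))  # valid and fresh
    print(verify_heartbeat(hb, b"wrong-key", now=1010.0))   # forged key
    print(verify_heartbeat(hb, b"shared-key", now=2000.0))  # stale replay
```

With a shared key, the verifier could forge heartbeats too. An asymmetric scheme, as the review suggests, removes that weakness at the cost of a third-party cryptography dependency.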
Warnings
- Document API changes and provide migration guides.
- Ensure CLI commands are well-documented and user-friendly.
Suggestions
- Use thread-safe mechanisms for state persistence.
- Enforce type safety with Pydantic models.
- Implement tamper-evident audit trails.
- Audit new dependencies for supply chain risks.
- Create a "Getting Started" guide for new users.
Final Assessment
This pull request introduces valuable features for agent lifecycle management and discovery, addressing critical gaps in the Agent Governance Toolkit. However, several security issues and potential breaking changes need to be addressed before merging. Once these concerns are resolved, the new functionality will significantly enhance the toolkit's capabilities.
🤖 AI Agent: test-generator - `packages/agent-discovery/src/agent_discovery/__init__.py`
🧪 Test Coverage Analysis
🤖 AI Agent: security-scanner - Security Analysis of the Pull Request

This pull request introduces agent lifecycle management and shadow AI discovery functionalities to the Agent Governance Toolkit (AGT). While these features are critical for enhancing the governance of AI agents, they also introduce potential security risks. Below is a detailed analysis of the changes, categorized by the security concerns outlined in the prompt.

1. Prompt Injection Defense Bypass
No evidence of prompt injection vulnerabilities was identified in the changes. The new features focus on agent lifecycle management and discovery, which do not directly involve user-supplied prompts or natural language processing.
Rating: 🔵 LOW

2. Policy Engine Circumvention
The lifecycle management system introduces a policy engine for agent provisioning, approval, and decommissioning. However, there are potential risks:

Finding 1: Lack of validation for
| Finding | Rating | Attack Vector | Recommendation |
|---|---|---|---|
| Lack of validation for `actor` in provisioning and approval | 🔴 CRITICAL | Unauthorized users could provision or approve agents by impersonating an admin. | Validate `actor` against authorized users and use secure authentication mechanisms. |
| Lack of SPIFFE/SVID validation in activation | 🔴 CRITICAL | Malicious agents could obtain credentials without valid cryptographic identities. | Enforce SPIFFE/SVID validation during activation. |
| Insufficient redaction of sensitive data | 🟠 HIGH | Sensitive credentials could be exposed if not properly redacted from command-line arguments. | Add comprehensive redaction tests and use a dedicated redaction library. |
| Potential TOCTOU in credential rotation | 🟠 HIGH | Attackers could exploit gaps in the credential rotation process to use expired credentials. | Implement locking mechanisms and atomic operations for credential updates. |
| Lack of sandboxing in `agent-discovery` | 🟠 HIGH | Malicious agents could inject code into processes or files, exploiting the scanner's lack of isolation. | Run the scanner in a sandboxed environment with restricted permissions. |
| Dependency management risks | 🟡 MEDIUM | Dependency confusion or typosquatting attacks could compromise the system. | Use dependency pinning and regularly audit dependencies for vulnerabilities. |
Final Recommendation
The pull request introduces valuable features but contains critical security risks that must be addressed before merging. Specifically:
- Implement validation for `actor` fields in provisioning and approval workflows.
- Enforce SPIFFE/SVID validation during agent activation.
- Address redaction, TOCTOU, and sandboxing issues to mitigate potential vulnerabilities.
After addressing these issues, the pull request can be re-evaluated for security compliance.
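The `actor` validation recommended above could start as an allow-list check performed before any approval is recorded. The sketch below assumes a hypothetical `ApprovalGate` class and `UnauthorizedActorError` exception, neither of which comes from the PR. It also assumes the `actor` value has already been authenticated by the caller, for example taken from a verified session and not from request input.

```python
from dataclasses import dataclass, field


class UnauthorizedActorError(PermissionError):
    """Raised when an actor outside the approver allow-list attempts an approval."""


@dataclass
class ApprovalGate:
    # Hypothetical allow-list of identities permitted to approve provisioning
    approvers: frozenset
    # Every attempt is recorded, including denied ones
    audit: list = field(default_factory=list)

    def approve(self, agent_id, actor):
        """Record the approval only if the actor is an authorized approver."""
        if actor not in self.approvers:
            self.audit.append((agent_id, actor, "denied"))
            raise UnauthorizedActorError(f"{actor!r} may not approve {agent_id!r}")
        self.audit.append((agent_id, actor, "approved"))
        return True


if __name__ == "__main__":
    gate = ApprovalGate(approvers=frozenset({"alice@example.com"}))
    gate.approve("agent-1", "alice@example.com")
    try:
        gate.approve("agent-2", "mallory@example.com")
    except UnauthorizedActorError as exc:
        print("rejected:", exc)
    print(gate.audit)
```

An allow-list alone is authorization, not authentication: if `actor` is a free-form string supplied by the caller, impersonation remains possible, which is why the finding pairs it with secure authentication mechanisms.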
Shipped in v3.1.0! ✅ This is already getting great feedback from the CPG pilot team who needed credential rotation + orphan detection for their supply chain agents.
Addresses gap #2: agent lifecycle management (provisioning, credential rotation, orphan detection, decommissioning). 24 tests, tutorial 30, full audit trail.