feat: Zero2Agent full architecture overhaul + comprehensive tests #82
Open
t0ugh-sys wants to merge 8 commits into
Round 12: extract _read_messages() helper from JsonlTeamInboxStore. drain() and peek() had identical 10-line JSONL reading blocks. Now both delegate to _read_messages(), cutting ~15 lines. 300/300 tests passing.
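The refactor described above can be illustrated with a minimal sketch. Only the names `_read_messages()`, `drain()`, and `peek()` come from the commit message; the class `InboxStoreSketch`, its `append()` method, the file layout, and the choice to skip corrupt lines are assumptions made for illustration, not the actual `JsonlTeamInboxStore` code.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


class InboxStoreSketch:
    """Hypothetical stand-in for JsonlTeamInboxStore (one JSON object per line)."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def append(self, message: Dict[str, Any]) -> None:
        # Hypothetical writer, included only so the sketch can be exercised.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(message) + "\n")

    def _read_messages(self) -> List[Dict[str, Any]]:
        # Shared JSONL reader used by both peek() and drain().
        if not self.path.exists():
            return []
        messages: List[Dict[str, Any]] = []
        with self.path.open("r", encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    messages.append(json.loads(line))
                except json.JSONDecodeError:
                    continue  # assumption: corrupt lines are skipped
        return messages

    def peek(self) -> List[Dict[str, Any]]:
        # Read without consuming.
        return self._read_messages()

    def drain(self) -> List[Dict[str, Any]]:
        # Read, then truncate the inbox file.
        messages = self._read_messages()
        if self.path.exists():
            self.path.write_text("", encoding="utf-8")
        return messages
```

With this shape, `peek()` and `drain()` differ only in whether the file is truncated afterwards, which is what makes the shared helper worthwhile.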
…s/team_runtime/cli
- __init__.py: convert 28 eager submodule imports to lazy __getattr__ pattern, reducing import startup cost significantly
- skills.py: remove dead _skill_doc_path and _legacy_skill_doc_path wrappers
- team_runtime.py: remove unused plan_approval_response enum value
- cli.py: remove unused Dict import
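For readers unfamiliar with the lazy `__getattr__` pattern (PEP 562), here is a minimal, self-contained sketch. The mapping `_LAZY_SUBMODULES`, the helper `make_lazy_getattr`, and the demo module `demo_pkg` are hypothetical; standard-library modules stand in for the package's real submodules, which are not shown in the commit.

```python
import importlib
import types
from typing import Callable, Dict

# Hypothetical map: public attribute name -> module imported on first access.
_LAZY_SUBMODULES: Dict[str, str] = {"json_mod": "json", "csv_mod": "csv"}


def make_lazy_getattr(namespace: dict, lazy_map: Dict[str, str]) -> Callable:
    """Build a module-level __getattr__ that imports on first attribute access."""

    def __getattr__(name: str):
        try:
            target = lazy_map[name]
        except KeyError:
            raise AttributeError(f"module has no attribute {name!r}") from None
        module = importlib.import_module(target)
        namespace[name] = module  # cache so __getattr__ is not called again
        return module

    return __getattr__


# Stand-in for a package: module __getattr__ is looked up in the module dict.
pkg = types.ModuleType("demo_pkg")
pkg.__getattr__ = make_lazy_getattr(pkg.__dict__, _LAZY_SUBMODULES)
```

In a real `__init__.py` the same idea would usually be written as a plain module-level `def __getattr__(name)` that calls `importlib.import_module(f".{name}", __name__)`; the factory here exists only so the sketch runs in a single file.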
Inspired by Zero2Agent (onefly.top/zero2Agent) best practices:
- hooks.py: Wire PreToolUse/PostToolUse hooks into tool dispatch (was completely dead code, now integrated into _dispatch_tool_calls)
- policies.py: Add LoopDetector (repeated tool+args detection) and TokenBudget (cumulative token usage guard)
- tool_use_loop.py: Add Reflexion pattern: inject self-critique into state_summary after tool failures so decider can recover
- tool_use_loop.py: Add 80% context threshold auto-compact (was only triggering at 100%)
- background.py: Replace subprocess.run with Popen for streaming output; add read_output() and kill_task() methods

All 300 tests pass.
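The two policy guards named in this commit could look roughly like the sketch below. Only the class names `LoopDetector` and `TokenBudget` appear in the commit message; the fields (`max_repeats`, `window`, `limit`), the method names `record()` and `add()`, and the default values are assumptions.

```python
import json
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, Dict


@dataclass
class LoopDetector:
    """Flags a tool call repeated with identical arguments (assumed semantics)."""

    max_repeats: int = 3  # assumed default
    window: int = 10      # assumed default: how many recent calls to remember
    _recent: Deque[str] = field(default_factory=deque, init=False, repr=False)

    def record(self, tool: str, args: Dict[str, Any]) -> bool:
        """Record a call; return True once it has occurred max_repeats times in the window."""
        # sort_keys makes the key independent of argument ordering.
        key = f"{tool}:{json.dumps(args, sort_keys=True, default=str)}"
        self._recent.append(key)
        while len(self._recent) > self.window:
            self._recent.popleft()
        return self._recent.count(key) >= self.max_repeats


@dataclass
class TokenBudget:
    """Cumulative token usage guard (assumed semantics)."""

    limit: int
    used: int = 0

    def add(self, tokens: int) -> bool:
        """Accumulate usage; return True while the total stays within the limit."""
        self.used += tokens
        return self.used <= self.limit
```

A tool loop would typically call `record()` before dispatching each tool call and `add()` after each model response, stopping or changing strategy when either guard trips.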
Security:
- hooks.py: Default to shlex.split instead of shell=True (new shell=False default, explicit shell=True opt-in)
- search_tools.py: Expand SSRF blocklist with private IP ranges (127.x, 10.x, 192.168.x, 172.16-31.x, 0.0.0.0)

Error handling:
- core/agent.py: Log observer callback exceptions instead of silently swallowing them
- github_tools.py: JSON parse failures now return ok=False instead of masking as ok=True (3 instances)
- session.py: Log tail-window fast-path failures at DEBUG level

Dead code:
- tools/__init__.py: Remove unused builtin_tool_specs_map()
- runtime.py: Remove unused import json

All 300 tests pass.
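As an illustration of the SSRF blocklist item, the listed ranges can be expressed with the standard `ipaddress` module. The function name `is_blocked_url` and the constant `_BLOCKED_NETWORKS` are hypothetical, and this sketch only checks literal IP addresses in the URL; it does not resolve hostnames or handle IPv6, which the actual `search_tools.py` may or may not do.

```python
import ipaddress
from urllib.parse import urlparse

# The ranges named in the commit message, written as CIDR networks.
_BLOCKED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),     # 127.x loopback
    ipaddress.ip_network("10.0.0.0/8"),      # 10.x private
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.x private
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.x through 172.31.x private
    ipaddress.ip_network("0.0.0.0/32"),      # 0.0.0.0
]


def is_blocked_url(url: str) -> bool:
    """Return True if the URL's host is a literal IP inside a blocked range."""
    host = urlparse(url).hostname
    if host is None:
        return True  # assumption: reject URLs without a parseable host
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP. DNS names are out of scope for this sketch.
        return False
    return any(addr in net for net in _BLOCKED_NETWORKS)
```

Using CIDR networks rather than string prefixes avoids mistakes such as treating `172.32.0.1` as private or missing `172.20.x.x`.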
…action
Based on Zero2Agent interview best practices:
- agent_protocol.py: Enhanced JSON repair for malformed LLM output (brace-matching extraction, trailing comma removal, single-quote replacement, regex fallback)
- tools/base.py: Add dry_run flag to ToolContext
- tools/file_tools.py: write_file and apply_patch support dry-run preview mode (returns what would change without executing)
- team_runtime.py: Ping-pong detection in teammate loop: tracks consecutive messages from same sender, blocks at threshold (default 5) to prevent infinite inter-agent loops
- tool_use_loop.py: Key constraint extraction from goal text: regex patterns detect rules, versions, deadlines, limits, paths and inject them into state_summary to prevent context drift

All 300 tests pass.
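The ping-pong detection item can be sketched as a small streak counter. The class name `PingPongGuard` and the method `allow()` are hypothetical; only the behaviour (count consecutive messages from the same sender, block at a threshold that defaults to 5) is taken from the commit message.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PingPongGuard:
    """Blocks a sender once it has sent `threshold` consecutive messages."""

    threshold: int = 5  # default from the commit message
    _last_sender: Optional[str] = field(default=None, init=False, repr=False)
    _streak: int = field(default=0, init=False, repr=False)

    def allow(self, sender: str) -> bool:
        """Record a message; return False when the streak reaches the threshold."""
        if sender == self._last_sender:
            self._streak += 1
        else:
            # A different sender resets the streak.
            self._last_sender = sender
            self._streak = 1
        return self._streak < self.threshold
```

Whether the real implementation resets the counter after blocking, or keys the streak per recipient, is not stated in the commit, so this sketch keeps the simplest interpretation.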
…ions, coordinator mode
Zero2Agent s06/s10/s11 improvements:
- compression.py: add time_based_micro_compact: clear ALL tool results after 30min inactivity gap (prompt cache expired, old results waste context)
- compression.py: add created_at timestamp to TranscriptEntry for gap detection
- tool_use_loop.py: integrate time-based microcompact in compact flow
- team_runtime.py: add plan_approval_request/response to TeamMessageType
- team_runtime.py: add request_plan_approval, approve_plan, reject_plan methods
- subagents.py: add TaskNotification dataclass with XML serialization
- subagents.py: add build_notification() to SubAgentResult with usage stats
- subagents.py: track duration_ms in run_once
- subagents.py: add model override field to SubAgentSpec
- prompts.py: add COORDINATOR_SYSTEM_PROMPT + COORDINATOR_TOOLS_SPEC
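A rough sketch of the time-based micro-compact idea follows. The `Entry` dataclass is a hypothetical stand-in for `TranscriptEntry` (only the `created_at` field is confirmed by the commit), and the function name, the placeholder text, and the rule that the gap is measured from the most recent entry are assumptions.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence

INACTIVITY_GAP_SECONDS = 30 * 60                # 30 minutes, per the commit message
CLEARED_PLACEHOLDER = "[tool result cleared]"   # hypothetical placeholder text


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for TranscriptEntry."""

    role: str
    content: str
    created_at: float  # seconds since the epoch


def time_based_micro_compact_sketch(
    entries: Sequence[Entry],
    now: float,
    gap_seconds: float = INACTIVITY_GAP_SECONDS,
) -> List[Entry]:
    """Clear every tool result once the transcript has been idle for gap_seconds."""
    if not entries:
        return []
    last_activity = max(e.created_at for e in entries)
    if now - last_activity < gap_seconds:
        return list(entries)  # still active: leave the transcript untouched
    return [
        replace(e, content=CLEARED_PLACEHOLDER) if e.role == "tool" else e
        for e in entries
    ]
```

The rationale given in the commit is that after such a gap the prompt cache has expired anyway, so keeping old tool output no longer saves recomputation and only consumes context.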
…rovements
- token_estimation.py: fix estimate_messages: old code computed total_chars but never used it, always returned input_tokens regardless of content. Now stores total_chars from calibration and uses chars-per-token ratio.
- token_estimation.py: add total_chars parameter to update_from_response
- tests/test_zero2agent_improvements.py: 15 new tests covering:
  - time_based_micro_compact (6 tests)
  - TaskNotification XML/dict/build_notification (4 tests)
  - Plan approval protocol (3 tests)
  - Coordinator prompt + tools spec (2 tests)
- Closed issues #74, #75 (already fixed in previous rounds)
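The calibration fix can be sketched as follows. The method names `estimate_messages` and `update_from_response` and the `total_chars` parameter come from the commit message; the class name `CalibratedCounter`, the message shape, the fallback ratio of 4 characters per token, and the rounding are assumptions rather than the actual `HybridTokenCounter` code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CalibratedCounter:
    """Hypothetical sketch of a counter calibrated from real API usage."""

    default_chars_per_token: float = 4.0  # assumed fallback before calibration
    input_tokens: Optional[int] = None
    total_chars: Optional[int] = None

    def update_from_response(self, input_tokens: int, total_chars: int) -> None:
        # Store both values so a chars-per-token ratio can be derived later.
        self.input_tokens = input_tokens
        self.total_chars = total_chars

    def chars_per_token(self) -> float:
        if self.input_tokens and self.total_chars:
            return self.total_chars / self.input_tokens
        return self.default_chars_per_token

    def estimate_messages(self, messages: List[Dict[str, str]]) -> int:
        # Scale the current content length by the calibrated ratio, instead of
        # returning the last observed input_tokens regardless of content.
        chars = sum(len(m.get("content", "")) for m in messages)
        return round(chars / self.chars_per_token())
```

The key difference from the buggy behaviour described above is that the estimate now changes with the size of `messages` rather than echoing the last calibration value.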
- tests/test_compression.py: 28 tests covering micro_compact_messages, micro_compact_entries, group_messages_by_rounds, partial_compact, CompactConfig, CompactManager, archive, TranscriptEntry
- tests/test_background.py: 9 tests covering BackgroundTaskInfo, BackgroundCommandRunner spawn/drain/snapshot/kill/read_output
- tests/test_todo.py: 12 tests covering TodoItem, TodoManager write/validation/snapshot, render_todo_lines, TodoSnapshot
- Close issues #74, #75 (already fixed in previous rounds)
- Fix issue #77: HybridTokenCounter.estimate_messages calibration bug

Total: 365 tests (300 original + 65 new)
Summary
Based on a comprehensive analysis of the Zero2Agent tutorial (12 Coding Agent articles, agent basics, framework survey, and interview questions), this PR implements all missing architectural patterns and adds comprehensive test coverage.
Changes (9 commits, 365 tests)
Architecture Improvements
Key Features Added
Bug Fixes
Test Coverage
Issues Closed