Context Optimization: Subagent Delegation for Heavy-Query Commands
This is a suggestion based on a modification we've done to our fork, that might be usefull for future versions.
Problem
AdLoop's MCP server exposes 43+ tools. When a user runs a slash command like /audit-account, the command executes 7-13 MCP queries inline, dumping 15,000-30,000 tokens of raw API responses into the main conversation context.
With 191 total tools registered across all connected MCP servers, the base context overhead is already significant. Running 1-2 heavy commands in a single session can exhaust the context window, forcing premature compaction or preventing multi-step workflows.
Concrete example: /audit-account runs get_campaign_performance, get_keyword_performance, get_search_terms, get_ad_performance, attribution_check, and get_tracking_events — all inline. Each returns tabular data that's useful for analysis but doesn't need to persist in the conversation after the report is formatted.
Solution: Subagent Delegation Pattern
Delegate query execution to a subagent (Agent(model="sonnet")) that:
- Executes all MCP queries internally
- Analyzes results
- Returns a structured JSON summary (~500-800 tokens)
The main thread then:
- Spot-checks 1-2 key metrics against the API directly (~200 tokens)
- Formats the JSON into a user-friendly markdown report
Net result: ~95-97% context reduction per command in the main thread.
Architecture
┌──────────────────────────┐
│ Main Thread │ ~800-1000 tokens per command
│ │
│ 1. Agent(prompt)─────────┼──▶ Subagent (sonnet, 20-25 turns)
│ │ Executes 7-13 MCP queries
│ │ Returns structured JSON
│ │
│ 2. Spot-check API ───────┼──▶ 1-2 direct queries (~200 tok)
│ │ Validates key metrics
│ │
│ 3. Format report ────────┤ JSON → markdown
│ │
│ 4. Fallback ─────────────┼──▶ If subagent fails → inline (original)
└──────────────────────────┘
Implementation Guide
Step 1: Create an Agent Definition File
Create .claude/agents/<agent-name>.md with structured frontmatter that documents the agent's role, tools, process, and output schema:
---
model: sonnet
maxTurns: 25
permissionMode: auto
tools: Read, Bash, Grep, Glob
---
You are an Ads + GA4 data analyst. You execute MCP queries and return structured JSON summaries.
## Process
1. Execute queries in order
2. Analyze results
3. Return JSON matching the output schema
## Output Schema
{health_score: 0-100, campaigns: {total, active, ...}, keywords: {...}, recommendations: [...]}
Important: These .claude/agents/*.md files are documentation, not a subagent_type registry. The Agent tool only accepts predefined types (general-purpose, Explore, etc.). You embed the full agent prompt inline in the command file.
Step 2: Rewrite the Command
Transform the command from inline execution to subagent delegation. Here's a complete before/after:
BEFORE: Inline queries (original)
# Audit Account
Run a full account audit. $ARGUMENTS
## Execution
1. Run `get_campaign_performance` (last 30 days)
2. Run `get_keyword_performance` (last 30 days)
3. Run `get_search_terms` (last 30 days)
4. Run `get_ad_performance` (last 30 days)
5. Run `attribution_check` (last 30 days)
6. Run `get_tracking_events` (last 28 days)
Analyze results and present a report with:
- Health score
- Campaign table
- Keyword issues
- Recommendations
This injects 15-30K tokens of raw data into the conversation.
AFTER: Subagent delegation
# Audit Account (subagent-optimized)
Run a full account audit using a subagent to keep raw data out of the conversation. $ARGUMENTS
## Step 1: Delegate to subagent
Invoke the `Agent` tool with these parameters:
- **model**: sonnet
- **prompt**: (full task prompt below)
**Task prompt for the subagent:**
\```
You are a Google Ads + GA4 data analyst. Execute the following queries and return a structured JSON summary.
## Queries to execute (in order):
1. `mcp__adloop__health_check` — verify connectivity
2. `mcp__adloop__get_campaign_performance` (last 30 days)
3. `mcp__adloop__get_keyword_performance` (last 30 days)
4. `mcp__adloop__get_search_terms` (last 30 days)
5. `mcp__adloop__get_ad_performance` (last 30 days)
6. `mcp__adloop__attribution_check` (last 30 days)
7. `mcp__adloop__get_tracking_events` (last 28 days)
## Analysis to perform:
- Calculate health score (0-100) based on: campaign status, QS distribution, conversion tracking, ad coverage
- Identify critical campaigns (high spend, low/no conversions)
- Find keyword issues (low QS, zero-conv high clicks, broad match risk)
- Detect wasted search terms and expansion opportunities
- Check conversion tracking health (Ads vs GA4 discrepancy)
- Verify ad coverage (single-ad ad groups, incomplete RSAs)
## Output format:
Return a JSON object:
- health_score: number 0-100
- campaigns: {total, active, total_spend, total_conversions, avg_cpa, critical: [], bidding_issues: []}
- keywords: {total, low_quality: [], zero_conv_high_clicks: [], broad_risk: []}
- search_terms: {waste_candidates: [], expansion_opportunities: []}
- tracking: {ads_conversions, ga4_conversions, discrepancy_pct, missing_events: []}
- ad_coverage: {single_ad_groups, incomplete_rsas, missing_sitelinks}
- recommendations: [{priority, title, action, tool, estimated_impact}]
- warnings: []
Limit arrays to top 10-15 items. Round currency to 2 decimals, percentages to 1 decimal.
IMPORTANT: Return ONLY the JSON object, no additional commentary.
\```
## Step 2: Validate with spot-check
After receiving the JSON, verify 1 key metric:
- Run `mcp__adloop__get_campaign_performance` for the top-spending campaign
- Compare CPA with subagent's report
- If discrepancy > 10%, add ⚠️ warning
## Step 3: Present findings
Format into a user-friendly report:
\```
## Account Audit — Health Score: X/100
### Critical Issues
[campaigns with issues]
### Campaign Performance
| Campaign | Status | Spend | Conversions | CPA |
|----------|--------|-------|-------------|-----|
### Keyword Issues
[low QS, zero conv, broad risk]
### Recommendations (prioritized)
1. [action] — [estimated impact]
\```
## Fallback
If the subagent fails or times out:
1. Inform the user: "Subagent failed, running inline analysis..."
2. Execute queries directly in main context (original behavior)
Step 3: Spot-Check Validation Pattern
The spot-check is critical to prevent hallucinated metrics. After the subagent returns JSON:
1. Pick 1-2 key metrics (e.g., top campaign CPA, top keyword cost)
2. Run 1 lightweight API query directly
3. Compare — if >10% discrepancy, flag with ⚠️
4. Cost: ~200 tokens for the verification query
This gives you the quality guarantee of inline execution with the context savings of delegation.
Measured Results
We validated this pattern with 5 commands against live production data:
| Command |
Queries |
Raw Data (est.) |
Subagent JSON |
Spot-check |
Total New |
Reduction |
| audit-account |
7 |
~15K tok |
~600 tok |
~200 tok |
~800 tok |
~95% |
| audit-account-gsc |
13 |
~30K tok |
~800 tok |
~200 tok |
~1K tok |
~97% |
| seo-content-opportunities |
5-8 |
~20K tok |
~700 tok |
~200 tok |
~900 tok |
~96% |
| seo-paid-cannibalization |
4 |
~15K tok |
~600 tok |
~200 tok |
~800 tok |
~95% |
| gtm (audit mode) |
7-9 |
~20K tok |
~700 tok |
~200 tok |
~900 tok |
~96% |
Quality Validation
Each command was spot-checked against direct API queries:
| Metric Checked |
Max Deviation |
Verdict |
| Campaign CPA |
<2% |
✅ Exact |
| Organic search positions |
<3% |
✅ Within tolerance |
| Keyword costs |
<5% |
✅ Within tolerance |
| GTM tag inventory |
<3% |
✅ 1 item missed in 33 |
No regression in insight quality. Health scores, recommendations, and action items are identical to inline execution.
Design Decisions
Why model: sonnet for subagents?
The analysis tasks (read queries, compute metrics, format JSON) don't require Opus-level reasoning. Sonnet handles them reliably at lower cost and latency. The main thread uses whatever model the user has selected.
Why prompt-based schema enforcement?
The Agent tool supports a schema parameter for structured output, but it requires a specific JSON Schema format and adds complexity. Since these are single-shot analysis tasks, describing the expected JSON format in the prompt is simpler and equally effective in practice.
Why not delegate ALL commands?
Commands that require interactive decisions (e.g., "which tags should I create?", "confirm this bid change") can't be delegated — the subagent can't ask the user questions. We use a hybrid approach:
- Read-heavy commands (audit, analysis, reports) → subagent delegation
- Interactive commands (setup, fix, create) → inline execution
Why the fallback?
If the subagent times out or fails, the command falls back to inline execution (the original behavior). This ensures the command never becomes a blocker. In practice, we haven't seen a fallback triggered in testing.
How to Adopt
- Identify heavy-query commands in your fork — any command that runs 3+ MCP queries inline
- Create agent definition files in
.claude/agents/ (for documentation/reference)
- Rewrite commands using the before/after template above
- Add spot-check validation — pick 1-2 key metrics to verify
- Test with live data — compare subagent output vs inline execution
- Add fallback — keep the original inline path as fallback
Files to Create/Modify
.claude/
├── agents/
│ ├── ads-query-analyst.md # Agent definition (documentation)
│ ├── schemas.md # Shared JSON schemas
│ └── ...
└── commands/
├── audit-account.md # Rewritten with delegation
├── audit-account-gsc.md # Rewritten with delegation
└── ...
Limitations
- No iterative follow-up: The subagent returns a snapshot. You can't ask "tell me more about that campaign" without re-running the subagent or querying inline.
- Token cost is similar: Total tokens consumed (main + subagent) are roughly the same as inline. The savings are in the main thread's context window, not in total API cost.
- Subagents can miss items: In large inventories (e.g., 33+ GTM tags), a subagent may occasionally omit minor items. The spot-check catches this.
- Cross-server tool access: Subagents can use all MCP tools available in the session, but if a server isn't connected (e.g., GTM not active), the subagent must handle the missing tools gracefully.
Tags for Reference
- Pre-change:
pre-subagent-refactor
- Post-validation:
post-subagent-refactor
Would you be open to integrating this pattern into the upstream AdLoop commands? I'm happy to submit a PR with the changes if there's interest, or refine the approach based on feedback.
Context Optimization: Subagent Delegation for Heavy-Query Commands
This is a suggestion based on a modification we've done to our fork, that might be usefull for future versions.
Problem
AdLoop's MCP server exposes 43+ tools. When a user runs a slash command like
/audit-account, the command executes 7-13 MCP queries inline, dumping 15,000-30,000 tokens of raw API responses into the main conversation context.With 191 total tools registered across all connected MCP servers, the base context overhead is already significant. Running 1-2 heavy commands in a single session can exhaust the context window, forcing premature compaction or preventing multi-step workflows.
Concrete example:
/audit-accountrunsget_campaign_performance,get_keyword_performance,get_search_terms,get_ad_performance,attribution_check, andget_tracking_events— all inline. Each returns tabular data that's useful for analysis but doesn't need to persist in the conversation after the report is formatted.Solution: Subagent Delegation Pattern
Delegate query execution to a subagent (
Agent(model="sonnet")) that:The main thread then:
Net result: ~95-97% context reduction per command in the main thread.
Architecture
Implementation Guide
Step 1: Create an Agent Definition File
Create
.claude/agents/<agent-name>.mdwith structured frontmatter that documents the agent's role, tools, process, and output schema:Important: These
.claude/agents/*.mdfiles are documentation, not asubagent_typeregistry. TheAgenttool only accepts predefined types (general-purpose,Explore, etc.). You embed the full agent prompt inline in the command file.Step 2: Rewrite the Command
Transform the command from inline execution to subagent delegation. Here's a complete before/after:
BEFORE: Inline queries (original)
This injects 15-30K tokens of raw data into the conversation.
AFTER: Subagent delegation
Step 3: Spot-Check Validation Pattern
The spot-check is critical to prevent hallucinated metrics. After the subagent returns JSON:
This gives you the quality guarantee of inline execution with the context savings of delegation.
Measured Results
We validated this pattern with 5 commands against live production data:
Quality Validation
Each command was spot-checked against direct API queries:
No regression in insight quality. Health scores, recommendations, and action items are identical to inline execution.
Design Decisions
Why
model: sonnetfor subagents?The analysis tasks (read queries, compute metrics, format JSON) don't require Opus-level reasoning. Sonnet handles them reliably at lower cost and latency. The main thread uses whatever model the user has selected.
Why prompt-based schema enforcement?
The
Agenttool supports aschemaparameter for structured output, but it requires a specific JSON Schema format and adds complexity. Since these are single-shot analysis tasks, describing the expected JSON format in the prompt is simpler and equally effective in practice.Why not delegate ALL commands?
Commands that require interactive decisions (e.g., "which tags should I create?", "confirm this bid change") can't be delegated — the subagent can't ask the user questions. We use a hybrid approach:
Why the fallback?
If the subagent times out or fails, the command falls back to inline execution (the original behavior). This ensures the command never becomes a blocker. In practice, we haven't seen a fallback triggered in testing.
How to Adopt
.claude/agents/(for documentation/reference)Files to Create/Modify
Limitations
Tags for Reference
pre-subagent-refactorpost-subagent-refactorWould you be open to integrating this pattern into the upstream AdLoop commands? I'm happy to submit a PR with the changes if there's interest, or refine the approach based on feedback.