Suggestion: Context optimization — subagent delegation for heavy-query commands

# Context Optimization: Subagent Delegation for Heavy-Query Commands

This is a suggestion based on a modification we've done to our fork, that might be usefull for future versions.

## Problem

AdLoop's MCP server exposes 43+ tools. When a user runs a slash command like `/audit-account`, the command executes 7-13 MCP queries inline, dumping 15,000-30,000 tokens of raw API responses into the main conversation context.

With 191 total tools registered across all connected MCP servers, the base context overhead is already significant. Running 1-2 heavy commands in a single session can exhaust the context window, forcing premature compaction or preventing multi-step workflows.

**Concrete example:** `/audit-account` runs `get_campaign_performance`, `get_keyword_performance`, `get_search_terms`, `get_ad_performance`, `attribution_check`, and `get_tracking_events` — all inline. Each returns tabular data that's useful for analysis but doesn't need to persist in the conversation after the report is formatted.

## Solution: Subagent Delegation Pattern

Delegate query execution to a subagent (`Agent(model="sonnet")`) that:
1. Executes all MCP queries internally
2. Analyzes results
3. Returns a **structured JSON summary** (~500-800 tokens)

The main thread then:
1. Spot-checks 1-2 key metrics against the API directly (~200 tokens)
2. Formats the JSON into a user-friendly markdown report

**Net result: ~95-97% context reduction per command in the main thread.**

### Architecture

```
┌──────────────────────────┐
│     Main Thread           │  ~800-1000 tokens per command
│                           │
│  1. Agent(prompt)─────────┼──▶ Subagent (sonnet, 20-25 turns)
│                           │     Executes 7-13 MCP queries
│                           │     Returns structured JSON
│                           │
│  2. Spot-check API ───────┼──▶ 1-2 direct queries (~200 tok)
│                           │     Validates key metrics
│                           │
│  3. Format report ────────┤     JSON → markdown
│                           │
│  4. Fallback ─────────────┼──▶ If subagent fails → inline (original)
└──────────────────────────┘
```

## Implementation Guide

### Step 1: Create an Agent Definition File

Create `.claude/agents/<agent-name>.md` with structured frontmatter that documents the agent's role, tools, process, and output schema:

```markdown
---
model: sonnet
maxTurns: 25
permissionMode: auto
tools: Read, Bash, Grep, Glob
---

You are an Ads + GA4 data analyst. You execute MCP queries and return structured JSON summaries.

## Process
1. Execute queries in order
2. Analyze results
3. Return JSON matching the output schema

## Output Schema
{health_score: 0-100, campaigns: {total, active, ...}, keywords: {...}, recommendations: [...]}
```

**Important:** These `.claude/agents/*.md` files are **documentation**, not a `subagent_type` registry. The `Agent` tool only accepts predefined types (`general-purpose`, `Explore`, etc.). You embed the full agent prompt inline in the command file.

### Step 2: Rewrite the Command

Transform the command from inline execution to subagent delegation. Here's a complete before/after:

#### BEFORE: Inline queries (original)

```markdown
# Audit Account

Run a full account audit. $ARGUMENTS

## Execution

1. Run `get_campaign_performance` (last 30 days)
2. Run `get_keyword_performance` (last 30 days)
3. Run `get_search_terms` (last 30 days)
4. Run `get_ad_performance` (last 30 days)
5. Run `attribution_check` (last 30 days)
6. Run `get_tracking_events` (last 28 days)

Analyze results and present a report with:
- Health score
- Campaign table
- Keyword issues
- Recommendations
```

This injects 15-30K tokens of raw data into the conversation.

#### AFTER: Subagent delegation

```markdown
# Audit Account (subagent-optimized)

Run a full account audit using a subagent to keep raw data out of the conversation. $ARGUMENTS

## Step 1: Delegate to subagent

Invoke the `Agent` tool with these parameters:
- **model**: sonnet
- **prompt**: (full task prompt below)

**Task prompt for the subagent:**

\```
You are a Google Ads + GA4 data analyst. Execute the following queries and return a structured JSON summary.

## Queries to execute (in order):
1. `mcp__adloop__health_check` — verify connectivity
2. `mcp__adloop__get_campaign_performance` (last 30 days)
3. `mcp__adloop__get_keyword_performance` (last 30 days)
4. `mcp__adloop__get_search_terms` (last 30 days)
5. `mcp__adloop__get_ad_performance` (last 30 days)
6. `mcp__adloop__attribution_check` (last 30 days)
7. `mcp__adloop__get_tracking_events` (last 28 days)

## Analysis to perform:
- Calculate health score (0-100) based on: campaign status, QS distribution, conversion tracking, ad coverage
- Identify critical campaigns (high spend, low/no conversions)
- Find keyword issues (low QS, zero-conv high clicks, broad match risk)
- Detect wasted search terms and expansion opportunities
- Check conversion tracking health (Ads vs GA4 discrepancy)
- Verify ad coverage (single-ad ad groups, incomplete RSAs)

## Output format:

Return a JSON object:
- health_score: number 0-100
- campaigns: {total, active, total_spend, total_conversions, avg_cpa, critical: [], bidding_issues: []}
- keywords: {total, low_quality: [], zero_conv_high_clicks: [], broad_risk: []}
- search_terms: {waste_candidates: [], expansion_opportunities: []}
- tracking: {ads_conversions, ga4_conversions, discrepancy_pct, missing_events: []}
- ad_coverage: {single_ad_groups, incomplete_rsas, missing_sitelinks}
- recommendations: [{priority, title, action, tool, estimated_impact}]
- warnings: []

Limit arrays to top 10-15 items. Round currency to 2 decimals, percentages to 1 decimal.

IMPORTANT: Return ONLY the JSON object, no additional commentary.
\```

## Step 2: Validate with spot-check

After receiving the JSON, verify 1 key metric:
- Run `mcp__adloop__get_campaign_performance` for the top-spending campaign
- Compare CPA with subagent's report
- If discrepancy > 10%, add ⚠️ warning

## Step 3: Present findings

Format into a user-friendly report:

\```
## Account Audit — Health Score: X/100

### Critical Issues
[campaigns with issues]

### Campaign Performance
| Campaign | Status | Spend | Conversions | CPA |
|----------|--------|-------|-------------|-----|

### Keyword Issues
[low QS, zero conv, broad risk]

### Recommendations (prioritized)
1. [action] — [estimated impact]
\```

## Fallback

If the subagent fails or times out:
1. Inform the user: "Subagent failed, running inline analysis..."
2. Execute queries directly in main context (original behavior)
```

### Step 3: Spot-Check Validation Pattern

The spot-check is critical to prevent hallucinated metrics. After the subagent returns JSON:

```
1. Pick 1-2 key metrics (e.g., top campaign CPA, top keyword cost)
2. Run 1 lightweight API query directly
3. Compare — if >10% discrepancy, flag with ⚠️
4. Cost: ~200 tokens for the verification query
```

This gives you the quality guarantee of inline execution with the context savings of delegation.

## Measured Results

We validated this pattern with 5 commands against live production data:

| Command | Queries | Raw Data (est.) | Subagent JSON | Spot-check | Total New | Reduction |
|---------|---------|-----------------|---------------|------------|-----------|-----------|
| audit-account | 7 | ~15K tok | ~600 tok | ~200 tok | ~800 tok | ~95% |
| audit-account-gsc | 13 | ~30K tok | ~800 tok | ~200 tok | ~1K tok | ~97% |
| seo-content-opportunities | 5-8 | ~20K tok | ~700 tok | ~200 tok | ~900 tok | ~96% |
| seo-paid-cannibalization | 4 | ~15K tok | ~600 tok | ~200 tok | ~800 tok | ~95% |
| gtm (audit mode) | 7-9 | ~20K tok | ~700 tok | ~200 tok | ~900 tok | ~96% |

### Quality Validation

Each command was spot-checked against direct API queries:

| Metric Checked | Max Deviation | Verdict |
|---------------|---------------|---------|
| Campaign CPA | <2% | ✅ Exact |
| Organic search positions | <3% | ✅ Within tolerance |
| Keyword costs | <5% | ✅ Within tolerance |
| GTM tag inventory | <3% | ✅ 1 item missed in 33 |

**No regression in insight quality.** Health scores, recommendations, and action items are identical to inline execution.

## Design Decisions

### Why `model: sonnet` for subagents?

The analysis tasks (read queries, compute metrics, format JSON) don't require Opus-level reasoning. Sonnet handles them reliably at lower cost and latency. The main thread uses whatever model the user has selected.

### Why prompt-based schema enforcement?

The `Agent` tool supports a `schema` parameter for structured output, but it requires a specific JSON Schema format and adds complexity. Since these are single-shot analysis tasks, describing the expected JSON format in the prompt is simpler and equally effective in practice.

### Why not delegate ALL commands?

Commands that require **interactive decisions** (e.g., "which tags should I create?", "confirm this bid change") can't be delegated — the subagent can't ask the user questions. We use a **hybrid approach**:

- **Read-heavy commands** (audit, analysis, reports) → subagent delegation
- **Interactive commands** (setup, fix, create) → inline execution

### Why the fallback?

If the subagent times out or fails, the command falls back to inline execution (the original behavior). This ensures the command never becomes a blocker. In practice, we haven't seen a fallback triggered in testing.

## How to Adopt

1. **Identify heavy-query commands** in your fork — any command that runs 3+ MCP queries inline
2. **Create agent definition files** in `.claude/agents/` (for documentation/reference)
3. **Rewrite commands** using the before/after template above
4. **Add spot-check validation** — pick 1-2 key metrics to verify
5. **Test with live data** — compare subagent output vs inline execution
6. **Add fallback** — keep the original inline path as fallback

### Files to Create/Modify

```
.claude/
├── agents/
│   ├── ads-query-analyst.md    # Agent definition (documentation)
│   ├── schemas.md              # Shared JSON schemas
│   └── ...
└── commands/
    ├── audit-account.md        # Rewritten with delegation
    ├── audit-account-gsc.md    # Rewritten with delegation
    └── ...
```

## Limitations

1. **No iterative follow-up**: The subagent returns a snapshot. You can't ask "tell me more about that campaign" without re-running the subagent or querying inline.
2. **Token cost is similar**: Total tokens consumed (main + subagent) are roughly the same as inline. The savings are in the main thread's context window, not in total API cost.
3. **Subagents can miss items**: In large inventories (e.g., 33+ GTM tags), a subagent may occasionally omit minor items. The spot-check catches this.
4. **Cross-server tool access**: Subagents can use all MCP tools available in the session, but if a server isn't connected (e.g., GTM not active), the subagent must handle the missing tools gracefully.

## Tags for Reference

- Pre-change: `pre-subagent-refactor`
- Post-validation: `post-subagent-refactor`

---

**Would you be open to integrating this pattern into the upstream AdLoop commands?** I'm happy to submit a PR with the changes if there's interest, or refine the approach based on feedback.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suggestion: Context optimization — subagent delegation for heavy-query commands #45

Context Optimization: Subagent Delegation for Heavy-Query Commands

Problem

Solution: Subagent Delegation Pattern

Architecture

Implementation Guide

Step 1: Create an Agent Definition File

Step 2: Rewrite the Command

BEFORE: Inline queries (original)

AFTER: Subagent delegation

Step 3: Spot-Check Validation Pattern

Measured Results

Quality Validation

Design Decisions

Why `model: sonnet` for subagents?

Why prompt-based schema enforcement?

Why not delegate ALL commands?

Why the fallback?

How to Adopt

Files to Create/Modify

Limitations

Tags for Reference

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Command	Queries	Raw Data (est.)	Subagent JSON	Spot-check	Total New	Reduction
audit-account	7	~15K tok	~600 tok	~200 tok	~800 tok	~95%
audit-account-gsc	13	~30K tok	~800 tok	~200 tok	~1K tok	~97%
seo-content-opportunities	5-8	~20K tok	~700 tok	~200 tok	~900 tok	~96%
seo-paid-cannibalization	4	~15K tok	~600 tok	~200 tok	~800 tok	~95%
gtm (audit mode)	7-9	~20K tok	~700 tok	~200 tok	~900 tok	~96%

Metric Checked	Max Deviation	Verdict
Campaign CPA	<2%	✅ Exact
Organic search positions	<3%	✅ Within tolerance
Keyword costs	<5%	✅ Within tolerance
GTM tag inventory	<3%	✅ 1 item missed in 33

Suggestion: Context optimization — subagent delegation for heavy-query commands #45

Description

Context Optimization: Subagent Delegation for Heavy-Query Commands

Problem

Solution: Subagent Delegation Pattern

Architecture

Implementation Guide

Step 1: Create an Agent Definition File

Step 2: Rewrite the Command

BEFORE: Inline queries (original)

AFTER: Subagent delegation

Step 3: Spot-Check Validation Pattern

Measured Results

Quality Validation

Design Decisions

Why model: sonnet for subagents?

Why prompt-based schema enforcement?

Why not delegate ALL commands?

Why the fallback?

How to Adopt

Files to Create/Modify

Limitations

Tags for Reference

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Why `model: sonnet` for subagents?