[MOCK][DO NOT MERGE] Future work. Desktop Mode Agents. by AllTerrainDeveloper · Pull Request #240 · WordPress/desktop-mode

AllTerrainDeveloper · 2026-05-18T12:25:18Z

The pitch

A WordPress site is a tool the user operates. Agents make it a tool that operates with the user: durable, addressable workers that live on the site, take orders by chat, drag, hook, or HTTP, and use the same APIs a human admin would.

This PR ships the navigation surface and visual contract. Everything described below is what we build on top of it, in the order described, behind this UI.

What an Agent is — three layers, by design

We split an Agent across three existing WordPress primitives instead of inventing a new one. This is the architectural decision that lets every other piece compose cleanly.

Layer 1 — Identity: a WordPress user

Each Agent is a real row in wp_users with three constraints:

A role (administrator, editor, author, contributor). Capabilities are real WP capabilities; an Agent that runs wp_insert_post() is gated by current_user_can( 'edit_posts' ) exactly like a human.
No login. The authenticate filter rejects them, password resets are disabled, cookie auth refuses them. They exist only as actors the site invokes on its own behalf.
An identity surface (avatar, display name, attribution on revisions, comments, audit logs). When an Agent edits a post, the post's _edit_lock and revision author show the Agent — just like a human collaborator.

Layer 2 — Behavior: the `wp_guideline` CPT (portable) @artpi

The Agent's wp_guideline post IS its behavior. Every field that shapes how the Agent thinks, every toggle that changes what it can do, every list of supplemental knowledge it can reach for — all of it reads from and writes to a single wp_guideline post. There is no parallel Desktop-Mode table for prompts, no separate options row for the tool list, no shadow registry for skills. One post per Agent, and that post is the agent definition.

This is the CPT the broader Automattic agent ecosystem (Dolly, Push MD, the in-tree Guidelines experiment) already uses, which is why this works.

Concretely, every behavior-shaping change in the Desktop Mode UI lands as a write to this one post:

What the user does in the UI	What happens in storage
Edits the system prompt	`wp_update_post()` on the guideline's `post_content`
Ticks a checkbox to enable an ability (`wordpress/list-posts`)	`add_post_meta( $guideline_id, '_agent_abilities', 'wordpress/list-posts' )`
Unticks a checkbox to disable an ability	`delete_post_meta( $guideline_id, '_agent_abilities', 'wordpress/list-posts' )`
Attaches a skill (`writing/headline-style`)	Link the child `wp_guideline` to the parent — Dolly's existing relationship model, no new schema
Detaches a skill	Remove that link
Edits a skill's body	`wp_update_post()` on the child guideline (the skill is its own post; the agent just references it)

Nothing about the agent's brain lives outside wp_guideline. Pull up the agent's guideline post in any tool that speaks WP REST (Gutenberg, wp-cli, Push MD, an external script) and you have the entire behavior surface — prompt, tool toggles, attached skills — editable in one place, revisable through the standard editorial UI, auditable via the standard revisions table.

And because skills are themselves wp_guideline posts, this composes recursively: a skill can be used by many agents; an agent can mix-and-match skills from many authors; a single skill update propagates to every agent referencing it. Same pattern Dolly ships.

This layer is fully portable. Nothing in any of these fields is Desktop-Mode-specific — every value is something Claude Code, Codex, Cursor, or any other agent runtime understands natively. pushmd pull the site and the agent's brain materialises into the consumer's skills/ folder verbatim, tool toggles and all.

Layer 3 — Bindings: user meta on the Agent (site-specific)

Everything about how this site invokes the Agent lives as user meta on the Agent's wp_users row. The fields here are intentionally outside wp_guideline because they would be meaningless to consumers that aren't Desktop Mode:

Triggers: which WP hook the Agent subscribes to (save_post, wp_insert_comment, …), which REST endpoint it exposes and under what auth, which drop payloads its tile accepts, which agents it chains to.
Runtime overrides: per-Agent model selection that beats the platform default; per-Agent rate limits; per-Agent debug flags.

This is the layer Claude Code in a terminal doesn't have an opinion about — it invokes agents directly, no hook subscription, no REST gateway. Keeping bindings out of the guideline means the guideline travels, the bindings don't: clone the site, get the agent's brain; configure how your site invokes it separately. Same agent definition, different invocation policy per environment.

The split, in one line

wp_users row = who. wp_guideline post = what it does (prompt + tools + skills, all of it). user meta = how this site reaches it.

Updates to bindings don't touch behavior; updates to behavior don't touch identity; rename the identity and the other two don't move. And critically: if a user clicks anything in the agent's Define / Tools / Skills UI, the only thing that changes on disk is one wp_guideline post.

Why this split matters: free ecosystem compatibility

Because behavior lives in wp_guideline, every Agent we ship is automatically discoverable by any AI client that already speaks this CPT — including Claude Code, Codex, and anything in the Automattic agent ecosystem. The mechanism (already in production via pushmd.blog): your WordPress site becomes a Git remote, every guideline materialises as wp_guideline/skills/{slug}/SKILL.md with an AGENTS.md alias, and a local git clone of the site drops a working skills/ folder into the consumer's checkout. Claude Code reads it. Codex reads it. Cursor reads it. No bespoke integration, no separate sync layer, no second source of truth.

The agent you build inside Desktop Mode shows up in your terminal the moment you pushmd pull. The agent your developer hand-writes as a .md file in their checkout shows up in Desktop Mode the moment they git push. Same artifact, two front-ends.

How you talk to an Agent — five triggers, one mental model

Every interaction with an Agent is a trigger the user configures up front. The five triggers we plan to ship:

Trigger	Looks like	Use case
Drag & drop	Drop a media tile (or post, user, comment) onto the Agent's tile	"Remove the background from these 12 images."
Chat	Double-click the Agent → conversation window	"Audit this post and tighten the headings."
Hook	Subscribe to a WP action (`save_post`, `wp_insert_comment`, …)	"Moderate every new comment automatically."
REST endpoint	Authenticated or anonymous `POST /agents/v1/<slug>`	"Trigger a newsletter send from our build pipeline."
Agent-to-agent	One Agent's output feeds another's input	"When SEO scoring finishes, hand the verdict to the publishing Agent."

All five collapse to the same loop: a message arrives → the Agent's system prompt + the message become an LLM call → the model picks tools off the allowlist → tools run as the Agent's user → the result is the trigger's return value. Drag-and-drop is a chat with a media payload. A hook subscription is a chat where the message is the hook args. An endpoint is a chat where the body is the message. Same engine, different intake.

Trigger configuration is user meta on the Agent (Layer 3), not part of the guideline. Two reasons: (1) triggers are site-specific — the same "Moderate Comments" guideline can sit on one site that listens to wp_insert_comment and another that only exposes the REST endpoint, with no fork of the underlying agent definition; (2) triggers are a Desktop-Mode concept that wouldn't round-trip cleanly through pushmd / Claude Code / Codex anyway, so they don't belong in the layer those tools consume.

Tools = the WordPress Abilities API

WordPress 6.9 introduced wp_register_ability() — Core's first-party way to expose typed, schema-described actions to AI tooling. Agents read that registry at runtime: every ability becomes a candidate tool with its declared parameters schema converted to the OpenAI / Anthropic / Gemini function-calling shape. The user picks which ones each Agent gets, and the picks live as post meta on the guideline.

Three properties that fall out of this:

Ecosystem leverage. Any plugin that registers an ability becomes Agent-compatible. WooCommerce abilities → commerce Agents. Yoast abilities → SEO Agents. No bespoke integration code per plugin.
Capability gating for free. Each ability already carries its own permission_callback. The Agent runs as itself (its WP user), so the same checks that protect a human editor protect the Agent.
Tool palette UX. The checkbox list is a view over wp_get_abilities(). As Core grows that registry, every Agent's picker grows with it.

For abilities not yet exposed by Core or plugins, the existing desktop_mode_register_ai_tool() registry plugs the gap and feeds the same picker.

This Abilities-to-tools mapping is the piece the broader ecosystem doesn't have yet. It's what turns "an agent that knows things" (skills, instructions) into "an agent that does things" (tool calls against a typed schema).

LLM provider — bring your own

Agents need a model. We already ship a provider registry (desktop_mode_register_ai_provider) that supports OpenAI's Responses API today and is structured for Anthropic, Gemini, and any vendor a plugin author wires up. Two implications for Agents:

No LLM key, no Agent creation. The "Create Agent" button greys out with a notice pointing at the OS Settings AI panel. The section still renders — it's a hint, not a hard gate — and existing Agents still appear so users can audit them before installing a key.
Per-Agent model overrides. Some Agents are cheap classification jobs (Moderate Comments → Haiku-tier); some need real reasoning (Audit Post → Sonnet-tier). Model choice is a binding (Layer 3, user meta) rather than part of the portable guideline — what model you pay for on this site has no business travelling with the agent definition.

Drag-and-drop is the North Star

This is the feature that proves the desktop metaphor isn't a skin. "Drag this image onto the Remove BG agent" is a sentence non-technical users say out loud and expect to work. The cross-window drag bridge needed for Media-into-Gutenberg is the same machinery needed for tile-into-Agent — so every Agent we ship pulls the bridge closer to finished.

What it requires on the Agent side: an accepted-payload manifest in the drag-trigger binding (MIME types, entity kinds, post types) and a drop handler that converts the dropped payload into the chat message format. The framework already knows how to draw the ghost, hit-test windows, and route the drop — Agents are just one more drop target type.

Security model — boring on purpose

A new actor that can take HTTP requests, listen to hooks, and call tools is a security surface. The boring-on-purpose answer:

Agents are users. Every action lands in WordPress's existing audit trail with the Agent's user ID as the actor. No new auth model, no parallel ACL.
Capabilities are inherited, not granted. Selecting a tool from the abilities picker does not elevate the Agent. If the Agent's role can't run the underlying ability, the call fails the same way a human's would.
Trigger gating is explicit. Each trigger carries its own auth (capability for chat, REST permission for endpoints, hook firing user for actions). The Agent never runs un-gated.
No outbound creds in payloads. Tool results are sanitised before going back into the LLM context. The model sees what the editor would see in the wp-admin UI, not what wp-cli would see in the database.
Behavior is auditable. Because instructions live as wp_guideline posts, every prompt change is a real revision in wp_postmeta, attributable to a real user, reviewable in the standard editorial UI.

Why ship the UX mock first

This PR is intentionally a navigation point and a visual contract, nothing else. Four hard-coded Agents, a read-only Define / Tools / Triggers right pane, a "+ Create agent" button that says Coming soon. No data model writes, no login block, no real LLM call, no real trigger plumbing.

The reason is concrete: the surface area below is large, and the right argument about which slice to build first is the one that holds up to looking at the actual screen. Shipping the screen unlocks that argument without committing the team to any single backend shape. Now that the storage layer has a clear answer (wp_guideline CPT, ecosystem-compatible from day one), the next slice is obvious — but we still want the screen real before the wiring goes in.

Every architectural choice in this PR is reversible. The files added or touched can be deleted in one commit if the direction changes. What survives is the lesson: this is what it should look like.

The order we'll build it

Behavior layer — adopt wp_guideline. Each Agent gets a guideline post storing prompt + tool allowlist + skill links. This is the load-bearing decision and it goes first because every later layer references it.
Identity layer — synthetic users. Agent role(s) added, authenticate filter rejects synthetic users, password reset disabled, REST cookie auth refuses them, audit-trail attribution works end to end. The user row carries a single piece of meta linking to its guideline.
Abilities bridge. desktop_mode_ai_tools consumer that harvests wp_get_abilities() into LLM-shaped tools, capability-gated, deduped against the existing tool registry, the allowlist meta on the guideline filters down to the per-call tool set.
Push MD compatibility audit. Once 1 + 2 + 3 are in, validate that the Agent guidelines materialise correctly under wp_guideline/skills/{slug}/SKILL.md via pushmd. This is the moment Desktop Mode Agents become natively discoverable by Claude Code, Codex, and the rest of the ecosystem. We don't need to ship pushmd — we just need to not break the shape it expects.
Bindings layer — user meta scaffold. Trigger configuration, model override, rate limits land as structured user meta on the Agent's wp_users row. Nothing wired yet — just the shape, so steps 6–10 can drop in without rework.
Chat trigger. Multi-instance native window per Agent, conversation history per ( user × agent ), streaming via the existing AI Copilot endpoint, tool dispatch on the client.
Hook trigger. Subscription registry derived from the bindings meta, hook → Agent invocation, optional filter-return propagation.
REST endpoint trigger. Per-Agent route registration with auth choice (capability / nonce / anonymous), request body becomes the chat message.
Drag trigger. Agent tiles become drop targets; the cross-window drag bridge already in flight (the North Star) carries the payload.
Agent-to-agent trigger. desktop_mode_agent_completed action, chained invocations, loop detection.
Marketplace. Packaged Agent definitions third parties ship as plugins (or as Push MD guideline collections), same way block patterns work today.

Each step ships behind feature flags so plugin authors can test against trunk without us cutting a stable release until the contract settles.

Where this leads

The endgame is a WordPress site where the workflow looks like this:

The author finishes a draft, drags the post onto the Audit Agent. The Agent reviews structure, drops a note in the Reviewer queue, hands the draft to the Optimize SEO Agent on success, which proposes edits, asks for one human OK, then hands the draft to the Schedule Agent, which picks a publication time based on traffic analytics and queues the Send to mail list Agent for the moment of publication.

None of those steps need a custom plugin. They're four Agents the user composed by ticking abilities and wiring triggers, using one screen — the one this PR introduces. And because the guidelines live in wp_guideline, the same author can pushmd pull from their laptop and edit the Optimize SEO Agent's system prompt in a code editor, then push it back. Or have Claude Code in their terminal call the same Audit Agent through its REST endpoint while writing a different post. One source of truth, every surface.

What this PR is not

Not a feature flag for production users. The Agents section appears in every My WordPress window. That's deliberate — we want to learn from the question "what is this?" before we ship the working version.
Not a public API. No docs/hooks-reference.md entries, no docs/examples/agents.md. The contract is the screen. We add the API surface when the backend lands.
Not the final design. Define / Tools / Triggers is a starting point, not a finished interaction. Expect iteration on the trigger configuration UI, the abilities picker density, the chat affordance, and the empty state for "no LLM configured."

github-actions · 2026-05-18T12:28:13Z

✅ WordPress Plugin Check Report

✅ Status: Passed

📊 Report

All checks passed! No errors or warnings found.

🤖 Generated by WordPress Plugin Check Action • Learn more about Plugin Check

- Introduced a new Agents entity in the My WordPress section, including an inline SVG icon for visual consistency. - Implemented a mock renderer for the Agents section, allowing for a UI preview without backend integration. - Created mock data for four fictional agents, each with defined abilities and triggers. - Added tests to ensure the integrity of the mock data and the rendering functionality. - Updated entity registration to include the new Agents kind and adjusted related types accordingly.

- Implement unit tests for the Agents REST API in `agentsRest.php`, covering endpoints for listing, creating, updating, and deleting agents, as well as permission checks. - Create integration tests for the Agents renderer in `agents-renderer.test.ts`, ensuring proper rendering of agent data and UI interactions. - Add tests for the Agents REST adapter in `agents-rest.test.ts`, validating fetch requests and response handling. - Introduce tests for the Agents send-to functionality in `agents-send-to.test.ts`, verifying caching, menu interactions, and event dispatching.

…ging for window manager actions

Reading WordPress/desktop-mode PR #240 (Future work. Desktop Mode Agents.) revealed two architectural facts that retire v3.8.0: 1. The Anthropic provider is GENERIC infrastructure — it contains zero Signal & Noise content. It belongs in desktop-mode itself, not in our plugin. PR #240's §"LLM provider — bring your own" explicitly names Anthropic as the kind of provider "a plugin author wires up," but the better path is upstream contribution since the work has no SN-specific surface area. 2. The 26 manual `desktop_mode_register_ai_tool()` registrations we planned for Tasks 5-7 will be obsoleted by step 3 of PR #240's Agents framework, which auto-harvests `wp_register_ability()` registrations into LLM-shaped tools. Our 12 theme abilities (theme v9.1.1) + 17 plugin abilities (plugin v3.7.3) are already future-compatible — no plugin-side work needed for them to surface in the Agents framework when it lands. What this commit changes: - Deletes inc/ai-copilot/ (the 3 anthropic-* files + .gitkeep scaffold from Tasks 1-4, originally committed in 6425ab9, d3d89cc, 92e39cc, a1275b2) - Deletes tests/anthropic-provider.php (71 assertions of provider coverage, ported to the upstream PR's PHPUnit tests) - Removes the conditional require_once block from signal-and-noise-tools.php — back to its pre-Task-1 state - Annotates the v3.8.0 spec + plan with CANCELLED headers pointing to the upstream contribution path What stays: - Theme v9.1.1 + plugin v3.7.3 production state — unchanged - The 12 launcher commands in inc/desktop-mode-integration.php from commit b3430cc (display-only ⌘K entries, harmless) - All 9 legacy test suites still pass — 550 assertions across admin-tabs, ai-bootstrap, bot-detection, cron-dashboard, cron-history, health-checks, insights, theme-ability-commands, webhooks Upstream contribution work continues in the fork at juanlentino/desktop-mode (cloned to ../desktop-mode/). The provider code itself is preserved in git history at the SHAs listed above and will be ported to the PR with desktop-mode's `desktop_mode_ai_*` function-prefix conventions modeled on includes/ai-copilot/openai.php. Reference: WordPress/desktop-mode#240 WordPress/desktop-mode#271

AllTerrainDeveloper added the DO NOT MERGE label May 18, 2026

WordPress deleted a comment from claude Bot May 21, 2026

AllTerrainDeveloper force-pushed the my-agents branch from 3b2cf2d to 53212e6 Compare May 21, 2026 18:09

AllTerrainDeveloper added 3 commits May 22, 2026 20:37

fix(agent-run-window): improve entity tile click handling and add log…

13f830f

…ging for window manager actions

AllTerrainDeveloper force-pushed the my-agents branch from 53212e6 to 13f830f Compare May 22, 2026 18:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MOCK][DO NOT MERGE] Future work. Desktop Mode Agents.#240

[MOCK][DO NOT MERGE] Future work. Desktop Mode Agents.#240
AllTerrainDeveloper wants to merge 3 commits into
trunkfrom
my-agents

AllTerrainDeveloper commented May 18, 2026 •

edited by github-actions Bot

Loading

Uh oh!

github-actions Bot commented May 18, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

AllTerrainDeveloper commented May 18, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The pitch

What an Agent is — three layers, by design

Layer 1 — Identity: a WordPress user

Layer 2 — Behavior: the wp_guideline CPT (portable) @artpi

Layer 3 — Bindings: user meta on the Agent (site-specific)

The split, in one line

Why this split matters: free ecosystem compatibility

How you talk to an Agent — five triggers, one mental model

Tools = the WordPress Abilities API

LLM provider — bring your own

Drag-and-drop is the North Star

Security model — boring on purpose

Why ship the UX mock first

The order we'll build it

Where this leads

What this PR is not

Uh oh!

github-actions Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ WordPress Plugin Check Report

📊 Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

AllTerrainDeveloper commented May 18, 2026 •

edited by github-actions Bot

Loading

Layer 2 — Behavior: the `wp_guideline` CPT (portable) @artpi

github-actions Bot commented May 18, 2026 •

edited

Loading