[MOCK][DO NOT MERGE] Future work. Desktop Mode Agents.#240
Open
AllTerrainDeveloper wants to merge 3 commits into
✅ WordPress Plugin Check Report
📊 Report: All checks passed! No errors or warnings found.
🤖 Generated by WordPress Plugin Check Action
3b2cf2d to 53212e6
- Introduced a new Agents entity in the My WordPress section, including an inline SVG icon for visual consistency.
- Implemented a mock renderer for the Agents section, allowing for a UI preview without backend integration.
- Created mock data for four fictional agents, each with defined abilities and triggers.
- Added tests to ensure the integrity of the mock data and the rendering functionality.
- Updated entity registration to include the new Agents kind and adjusted related types accordingly.
- Implement unit tests for the Agents REST API in `agentsRest.php`, covering endpoints for listing, creating, updating, and deleting agents, as well as permission checks.
- Create integration tests for the Agents renderer in `agents-renderer.test.ts`, ensuring proper rendering of agent data and UI interactions.
- Add tests for the Agents REST adapter in `agents-rest.test.ts`, validating fetch requests and response handling.
- Introduce tests for the Agents send-to functionality in `agents-send-to.test.ts`, verifying caching, menu interactions, and event dispatching.
…ging for window manager actions
53212e6 to 13f830f
juanlentino added a commit to juanlentino/signal-and-noise-tools that referenced this pull request on May 24, 2026
Reading WordPress/desktop-mode PR #240 (Future work. Desktop Mode Agents.)
revealed two architectural facts that retire v3.8.0:
1. The Anthropic provider is GENERIC infrastructure — it contains zero
Signal & Noise content. It belongs in desktop-mode itself, not in
our plugin. PR #240's §"LLM provider — bring your own" explicitly
names Anthropic as the kind of provider "a plugin author wires up,"
but the better path is upstream contribution since the work has no
SN-specific surface area.
2. The 26 manual `desktop_mode_register_ai_tool()` registrations we
planned for Tasks 5-7 will be obsoleted by step 3 of PR #240's
Agents framework, which auto-harvests `wp_register_ability()`
registrations into LLM-shaped tools. Our 12 theme abilities (theme
v9.1.1) + 17 plugin abilities (plugin v3.7.3) are already
future-compatible — no plugin-side work needed for them to surface
in the Agents framework when it lands.
What this commit changes:
- Deletes inc/ai-copilot/ (the 3 anthropic-* files + .gitkeep
scaffold from Tasks 1-4, originally committed in 6425ab9,
d3d89cc, 92e39cc, a1275b2)
- Deletes tests/anthropic-provider.php (71 assertions of provider
coverage, ported to the upstream PR's PHPUnit tests)
- Removes the conditional require_once block from
signal-and-noise-tools.php — back to its pre-Task-1 state
- Annotates the v3.8.0 spec + plan with CANCELLED headers pointing
to the upstream contribution path
What stays:
- Theme v9.1.1 + plugin v3.7.3 production state — unchanged
- The 12 launcher commands in inc/desktop-mode-integration.php from
commit b3430cc (display-only ⌘K entries, harmless)
- All 9 legacy test suites still pass — 550 assertions across
admin-tabs, ai-bootstrap, bot-detection, cron-dashboard,
cron-history, health-checks, insights, theme-ability-commands,
webhooks
Upstream contribution work continues in the fork at
juanlentino/desktop-mode (cloned to ../desktop-mode/). The provider
code itself is preserved in git history at the SHAs listed above and
will be ported to the PR with desktop-mode's `desktop_mode_ai_*`
function-prefix conventions modeled on includes/ai-copilot/openai.php.
Reference: WordPress/desktop-mode#240
WordPress/desktop-mode#271
The pitch
A WordPress site is a tool the user operates. Agents make it a tool that operates with the user: durable, addressable workers that live on the site, take orders by chat, drag, hook, or HTTP, and use the same APIs a human admin would.
This PR ships the navigation surface and visual contract. Everything described below is what we build on top of it, in the order described, behind this UI.
What an Agent is — three layers, by design
We split an Agent across three existing WordPress primitives instead of inventing a new one. This is the architectural decision that lets every other piece compose cleanly.
Layer 1 — Identity: a WordPress user
Each Agent is a real row in `wp_users` with three constraints:
- A standard role (`administrator`, `editor`, `author`, `contributor`). Capabilities are real WP capabilities; an Agent that runs `wp_insert_post()` is gated by `current_user_can( 'edit_posts' )` exactly like a human.
- The `authenticate` filter rejects them, password resets are disabled, cookie auth refuses them. They exist only as actors the site invokes on its own behalf.
- `_edit_lock` and revision author show the Agent, just like a human collaborator.
Layer 2 — Behavior: the `wp_guideline` CPT (portable) @artpi
The Agent's `wp_guideline` post IS its behavior. Every field that shapes how the Agent thinks, every toggle that changes what it can do, every list of supplemental knowledge it can reach for: all of it reads from and writes to a single `wp_guideline` post. There is no parallel Desktop-Mode table for prompts, no separate options row for the tool list, no shadow registry for skills. One post per Agent, and that post is the agent definition.
This is the CPT the broader Automattic agent ecosystem (Dolly, Push MD, the in-tree Guidelines experiment) already uses, which is why this works.
Concretely, every behavior-shaping change in the Desktop Mode UI lands as a write to this one post:
- System prompt: `wp_update_post()` on the guideline's `post_content`
- Tool enabled (`wordpress/list-posts`): `add_post_meta( $guideline_id, '_agent_abilities', 'wordpress/list-posts' )`
- Tool disabled: `delete_post_meta( $guideline_id, '_agent_abilities', 'wordpress/list-posts' )`
- Skill attached (`writing/headline-style`): a child `wp_guideline` linked to the parent (Dolly's existing relationship model, no new schema)
- Skill edited: `wp_update_post()` on the child guideline (the skill is its own post; the agent just references it)
Nothing about the agent's brain lives outside `wp_guideline`. Pull up the agent's guideline post in any tool that speaks WP REST (Gutenberg, wp-cli, Push MD, an external script) and you have the entire behavior surface (prompt, tool toggles, attached skills) editable in one place, revisable through the standard editorial UI, auditable via the standard revisions table.
And because skills are themselves `wp_guideline` posts, this composes recursively: a skill can be used by many agents; an agent can mix-and-match skills from many authors; a single skill update propagates to every agent referencing it. Same pattern Dolly ships.
This layer is fully portable. Nothing in any of these fields is Desktop-Mode-specific: every value is something Claude Code, Codex, Cursor, or any other agent runtime understands natively. `pushmd pull` the site and the agent's brain materialises into the consumer's `skills/` folder verbatim, tool toggles and all.
Layer 3 — Bindings: user meta on the Agent (site-specific)
Everything about how this site invokes the Agent lives as user meta on the Agent's `wp_users` row. The fields here are intentionally outside `wp_guideline` because they would be meaningless to consumers that aren't Desktop Mode: which hooks it listens to (`save_post`, `wp_insert_comment`, …), which REST endpoint it exposes and under what auth, which drop payloads its tile accepts, which agents it chains to.
This is the layer Claude Code in a terminal doesn't have an opinion about: it invokes agents directly, no hook subscription, no REST gateway. Keeping bindings out of the guideline means the guideline travels, the bindings don't: clone the site, get the agent's brain; configure how your site invokes it separately. Same agent definition, different invocation policy per environment.
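The three layers can be pictured as three separate records. What follows is a minimal TypeScript sketch, not the plugin's schema: the field names (`postContent`, `abilities`, `skillIds`, `guidelineId`, `hooks`, `restEnabled`) and the `toggleAbility` helper are hypothetical. Only `wp_users`, `wp_guideline`, and the `_agent_abilities` meta key come from the description above.

```typescript
// Illustrative shapes only; every field name here is an assumption.

// Layer 1: identity, a row in wp_users.
interface AgentUser {
  id: number;
  login: string;
  role: "administrator" | "editor" | "author" | "contributor";
}

// Layer 2: behavior, one wp_guideline post.
interface Guideline {
  id: number;
  postContent: string; // the system prompt (post_content)
  abilities: string[]; // values of the `_agent_abilities` post meta
  skillIds: number[]; // child wp_guideline posts attached as skills
}

// Layer 3: bindings, user meta on the Agent's wp_users row.
interface Bindings {
  guidelineId: number; // the meta value linking the user to its guideline
  hooks: string[]; // e.g. "save_post"
  restEnabled: boolean;
}

interface Agent {
  user: AgentUser;
  guideline: Guideline;
  bindings: Bindings;
}

// Toggling a tool produces a new guideline and nothing else: the user and
// bindings objects are passed through untouched. This loosely models the
// add_post_meta() / delete_post_meta() writes (with de-duplication added).
function toggleAbility(agent: Agent, ability: string, enabled: boolean): Agent {
  const others = agent.guideline.abilities.filter((a) => a !== ability);
  const abilities = enabled ? [...others, ability] : others;
  return { ...agent, guideline: { ...agent.guideline, abilities } };
}
```

The point of the sketch is the return value of `toggleAbility`: a tool toggle changes one field of one record, which is the property the split is meant to guarantee.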
The split, in one line
Updates to bindings don't touch behavior; updates to behavior don't touch identity; rename the identity and the other two don't move. And critically: if a user clicks anything in the agent's Define / Tools / Skills UI, the only thing that changes on disk is one `wp_guideline` post.
Why this split matters: free ecosystem compatibility
Because behavior lives in `wp_guideline`, every Agent we ship is automatically discoverable by any AI client that already speaks this CPT, including Claude Code, Codex, and anything in the Automattic agent ecosystem. The mechanism (already in production via pushmd.blog): your WordPress site becomes a Git remote, every guideline materialises as `wp_guideline/skills/{slug}/SKILL.md` with an `AGENTS.md` alias, and a local `git clone` of the site drops a working `skills/` folder into the consumer's checkout. Claude Code reads it. Codex reads it. Cursor reads it. No bespoke integration, no separate sync layer, no second source of truth.
The agent you build inside Desktop Mode shows up in your terminal the moment you `pushmd pull`. The agent your developer hand-writes as a `.md` file in their checkout shows up in Desktop Mode the moment they `git push`. Same artifact, two front-ends.
How you talk to an Agent — five triggers, one mental model
Every interaction with an Agent is a trigger the user configures up front. The five triggers we plan to ship:
- Hook subscription (`save_post`, `wp_insert_comment`, …)
- REST endpoint: `POST /agents/v1/<slug>`
All five collapse to the same loop: a message arrives → the Agent's system prompt + the message become an LLM call → the model picks tools off the allowlist → tools run as the Agent's user → the result is the trigger's return value. Drag-and-drop is a chat with a media payload. A hook subscription is a chat where the message is the hook args. An endpoint is a chat where the body is the message. Same engine, different intake.
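The "same engine, different intake" loop can be sketched as one function fed by a normalising step. This is a hedged illustration, not the planned implementation: the `Trigger` union covers only the four intakes named in the paragraph above, the `Model` type is a stand-in for the real LLM call, and all type and function names (`toMessage`, `runAgent`, `AgentConfig`) are hypothetical. Running tools as the Agent's WordPress user is not modelled here.

```typescript
// Sketch only: trigger kinds and field names are assumptions for illustration.
type Trigger =
  | { kind: "chat"; text: string }
  | { kind: "drop"; text: string; mediaId: number }
  | { kind: "hook"; hook: string; args: unknown[] }
  | { kind: "endpoint"; body: string };

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Stand-in for the LLM call: given a prompt and a message, pick tool calls.
type Model = (systemPrompt: string, message: string) => ToolCall[];
type Tool = (args: Record<string, unknown>) => string;

interface AgentConfig {
  systemPrompt: string;
  allowlist: string[];
}

// Different intake, same message format.
function toMessage(trigger: Trigger): string {
  switch (trigger.kind) {
    case "chat":
      return trigger.text;
    case "drop":
      return `${trigger.text} [media:${trigger.mediaId}]`;
    case "hook":
      return `${trigger.hook} ${JSON.stringify(trigger.args)}`;
    case "endpoint":
      return trigger.body;
  }
}

// One loop for every trigger: message -> model -> allowlisted tools -> result.
function runAgent(
  agent: AgentConfig,
  trigger: Trigger,
  model: Model,
  tools: Map<string, Tool>,
): string[] {
  const calls = model(agent.systemPrompt, toMessage(trigger));
  const results: string[] = [];
  for (const call of calls) {
    const tool = tools.get(call.name);
    // Calls outside the allowlist, or naming an unknown tool, are dropped.
    if (tool && agent.allowlist.includes(call.name)) {
      results.push(tool(call.args));
    }
  }
  return results;
}
```

Adding a new trigger in this shape means adding one case to `toMessage`; `runAgent` does not change.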
Trigger configuration is user meta on the Agent (Layer 3), not part of the guideline. Two reasons: (1) triggers are site-specific: the same "Moderate Comments" guideline can sit on one site that listens to `wp_insert_comment` and another that only exposes the REST endpoint, with no fork of the underlying agent definition; (2) triggers are a Desktop-Mode concept that wouldn't round-trip cleanly through pushmd / Claude Code / Codex anyway, so they don't belong in the layer those tools consume.
Tools = the WordPress Abilities API
WordPress 6.9 introduced `wp_register_ability()`, Core's first-party way to expose typed, schema-described actions to AI tooling. Agents read that registry at runtime: every ability becomes a candidate tool with its declared `parameters` schema converted to the OpenAI / Anthropic / Gemini function-calling shape. The user picks which ones each Agent gets, and the picks live as post meta on the guideline.
Three properties that fall out of this:
- Each ability declares a `permission_callback`. The Agent runs as itself (its WP user), so the same checks that protect a human editor protect the Agent.
- The picker reads `wp_get_abilities()`. As Core grows that registry, every Agent's picker grows with it.
For abilities not yet exposed by Core or plugins, the existing `desktop_mode_register_ai_tool()` registry plugs the gap and feeds the same picker.
This Abilities-to-tools mapping is the piece the broader ecosystem doesn't have yet. It's what turns "an agent that knows things" (skills, instructions) into "an agent that does things" (tool calls against a typed schema).
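The conversion from a registered ability to a function-calling tool could look roughly like the sketch below. The `Ability` interface is a simplified assumption (it is not Core's type), `FunctionTool` is a flat function-tool shape of the kind function-calling APIs accept, and `toToolName`, `abilityToTool`, and `toolsForAgent` are hypothetical helpers. The name rewrite assumes the provider only allows letters, digits, `_` and `-` in tool names, which ability names such as `wordpress/list-posts` would violate because of the slash.

```typescript
// Sketch only: the Ability shape is a simplified assumption, not Core's type.
interface Ability {
  name: string; // e.g. "wordpress/list-posts"
  description: string;
  parameters: Record<string, unknown>; // JSON Schema describing the input
}

// A flat function-tool description in the function-calling style.
interface FunctionTool {
  type: "function";
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

// Assumption: provider tool names allow only [a-zA-Z0-9_-], max 64 chars.
function toToolName(abilityName: string): string {
  return abilityName.replace(/[^a-zA-Z0-9_-]/g, "_").slice(0, 64);
}

// The schema passes through unchanged; only the envelope differs.
function abilityToTool(ability: Ability): FunctionTool {
  return {
    type: "function",
    name: toToolName(ability.name),
    description: ability.description,
    parameters: ability.parameters,
  };
}

// The guideline's allowlist (ability names) decides what the model sees.
function toolsForAgent(abilities: Ability[], allowlist: string[]): FunctionTool[] {
  return abilities
    .filter((ability) => allowlist.includes(ability.name))
    .map(abilityToTool);
}
```

A real implementation would also need the reverse mapping (tool name back to ability name) when the model returns a call; that step is omitted here.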
LLM provider — bring your own
Agents need a model. We already ship a provider registry (`desktop_mode_register_ai_provider`) that supports OpenAI's Responses API today and is structured for Anthropic, Gemini, and any vendor a plugin author wires up. Two implications for Agents:
Drag-and-drop is the North Star
This is the feature that proves the desktop metaphor isn't a skin. "Drag this image onto the Remove BG agent" is a sentence non-technical users say out loud and expect to work. The cross-window drag bridge needed for Media-into-Gutenberg is the same machinery needed for tile-into-Agent — so every Agent we ship pulls the bridge closer to finished.
What it requires on the Agent side: an accepted-payload manifest in the drag-trigger binding (MIME types, entity kinds, post types) and a drop handler that converts the dropped payload into the chat message format. The framework already knows how to draw the ghost, hit-test windows, and route the drop — Agents are just one more drop target type.
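As an illustration of the two Agent-side pieces, here is a hedged sketch of an accepted-payload manifest and a drop handler. The three manifest dimensions (MIME types, entity kinds, post types) come from the paragraph above; everything else is an assumption, including the `DropPayload` shape, the `image/*` wildcard convention, the rule that a payload is accepted when any one dimension matches, and the message format.

```typescript
// Sketch only: shapes, wildcard rule, and message format are assumptions.
interface DropManifest {
  mimeTypes: string[]; // e.g. "image/*" or "application/pdf"
  entityKinds: string[];
  postTypes: string[];
}

interface DropPayload {
  title: string;
  mimeType?: string;
  entityKind?: string;
  postType?: string;
}

// "image/*" matches any image subtype; anything else must match exactly.
function mimeMatches(pattern: string, mime: string): boolean {
  if (pattern.endsWith("/*")) {
    return mime.startsWith(pattern.slice(0, -1));
  }
  return pattern === mime;
}

// Assumption: a payload is accepted if any one dimension matches.
function accepts(manifest: DropManifest, payload: DropPayload): boolean {
  const mime = payload.mimeType;
  const kind = payload.entityKind;
  const postType = payload.postType;
  const mimeOk = mime !== undefined && manifest.mimeTypes.some((p) => mimeMatches(p, mime));
  const kindOk = kind !== undefined && manifest.entityKinds.includes(kind);
  const postTypeOk = postType !== undefined && manifest.postTypes.includes(postType);
  return mimeOk || kindOk || postTypeOk;
}

// Convert an accepted drop into a chat message; reject everything else.
function handleDrop(manifest: DropManifest, payload: DropPayload): string | null {
  if (!accepts(manifest, payload)) {
    return null;
  }
  const what = payload.mimeType ?? payload.entityKind ?? payload.postType ?? "unknown";
  return `Dropped "${payload.title}" (${what})`;
}
```

Returning `null` for a rejected payload lets the caller decide how to signal "not a valid drop target" without the handler knowing anything about ghosts or hit-testing.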
Security model — boring on purpose
A new actor that can take HTTP requests, listen to hooks, and call tools is a security surface. The boring-on-purpose answer:
- Because behavior lives in `wp_guideline` posts, every prompt change is a real revision in `wp_postmeta`, attributable to a real user, reviewable in the standard editorial UI.
Why ship the UX mock first
This PR is intentionally a navigation point and a visual contract, nothing else. Four hard-coded Agents, a read-only Define / Tools / Triggers right pane, a "+ Create agent" button that says Coming soon. No data model writes, no login block, no real LLM call, no real trigger plumbing.
The reason is concrete: the surface area below is large, and the right argument about which slice to build first is the one that holds up to looking at the actual screen. Shipping the screen unlocks that argument without committing the team to any single backend shape. Now that the storage layer has a clear answer (`wp_guideline` CPT, ecosystem-compatible from day one), the next slice is obvious, but we still want the screen real before the wiring goes in.
Every architectural choice in this PR is reversible. The files added or touched can be deleted in one commit if the direction changes. What survives is the lesson: this is what it should look like.
The order we'll build it
- `wp_guideline`. Each Agent gets a guideline post storing prompt + tool allowlist + skill links. This is the load-bearing decision and it goes first because every later layer references it.
- `authenticate` filter rejects synthetic users, password reset disabled, REST cookie auth refuses them, audit-trail attribution works end to end. The user row carries a single piece of meta linking to its guideline.
- `desktop_mode_ai_tools` consumer that harvests `wp_get_abilities()` into LLM-shaped tools, capability-gated, deduped against the existing tool registry, the allowlist meta on the guideline filters down to the per-call tool set.
- `wp_guideline/skills/{slug}/SKILL.md` via pushmd. This is the moment Desktop Mode Agents become natively discoverable by Claude Code, Codex, and the rest of the ecosystem. We don't need to ship pushmd; we just need to not break the shape it expects.
- `wp_users` row. Nothing wired yet, just the shape, so steps 6–10 can drop in without rework.
- ( user × agent ), streaming via the existing AI Copilot endpoint, tool dispatch on the client.
- `desktop_mode_agent_completed` action, chained invocations, loop detection.
Each step ships behind feature flags so plugin authors can test against trunk without us cutting a stable release until the contract settles.
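The tool-harvesting step above (harvest, capability-gate, dedupe, filter by allowlist) can be sketched as a small pipeline. This is an assumption-heavy illustration: `ToolDef`, `requiredCap`, and `perCallTools` are hypothetical names, the capability gate is reduced to a single required capability per tool (real abilities would be gated by their own permission checks), and on a name collision the sketch lets the existing registry entry win.

```typescript
// Sketch only: names and the collision rule are assumptions for illustration.
interface ToolDef {
  name: string;
  source: "registry" | "abilities";
  requiredCap: string; // simplified stand-in for a real permission check
}

function perCallTools(
  harvested: ToolDef[], // tools derived from registered abilities
  registered: ToolDef[], // tools already in the existing tool registry
  allowlist: string[], // tool names stored on the Agent's guideline
  agentCaps: Set<string>, // capabilities of the Agent's WordPress user
): ToolDef[] {
  // Dedupe by name; assumption: an existing registry entry takes precedence.
  const byName = new Map<string, ToolDef>();
  for (const tool of registered) {
    byName.set(tool.name, tool);
  }
  for (const tool of harvested) {
    if (!byName.has(tool.name)) {
      byName.set(tool.name, tool);
    }
  }
  // Keep only what the guideline allows and the Agent's user may run.
  return Array.from(byName.values()).filter(
    (tool) => allowlist.includes(tool.name) && agentCaps.has(tool.requiredCap),
  );
}
```

Filtering after the merge keeps the allowlist and the capability gate as the last word on what reaches the model, regardless of which registry a tool came from.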
Where this leads
The endgame is a WordPress site where the workflow looks like this:
None of those steps need a custom plugin. They're four Agents the user composed by ticking abilities and wiring triggers, using one screen: the one this PR introduces. And because the guidelines live in `wp_guideline`, the same author can `pushmd pull` from their laptop and edit the Optimize SEO Agent's system prompt in a code editor, then push it back. Or have Claude Code in their terminal call the same Audit Agent through its REST endpoint while writing a different post. One source of truth, every surface.
What this PR is not
- No `docs/hooks-reference.md` entries, no `docs/examples/agents.md`. The contract is the screen. We add the API surface when the backend lands.