Refactor AWT widget tree to playwright-mcp-style ref model #152

Draft
yzx9 wants to merge 5 commits into main from awt-refs

yzx9 commented Jun 22, 2026

Collaborator

No description provided.

yzx9 mentioned this pull request Jun 22, 2026
yzx9 added 2 commits June 22, 2026 21:43
Replace the positional action_id/snapshot_id scheme and the tree-connector
text format with a playwright-mcp-style accessibility snapshot.

- Each window and leaf widget carries a stable [refeN] handle, assigned
  via a new session-scoped ComponentIdentifier (sibling to WindowIdentifier,
  keyed on the live AWT Component in a WeakHashMap). Labels and
  intermediate containers are not ref-eligible, so display refs stay
  contiguous; the same element keeps its ref across snapshots and a
  removed element yields a clean "ref not found" error (no mis-targeting).
- Snapshots render as 2-space-indented YAML: `- role "name" [refeN]
  (actions): state`, with each widget's short action ids inline
  (click, selectItem, setState, setValue, ...).
- Actions are invoked as call_action(ref, action, params) on the latest
  snapshot, dispatching by short id. Removed Action.path,
  Snapshot.actions, buildActions(), and the path-based runAction.
- Per-component action metadata is captured pre-deactivation and
  serialized on each component, so deactivated snapshots can still
  describe their actions.
- ScrollbarNode gained a setValue action (AdjustmentEvent); fixed the
  IjTextWindow get_results_table/getResultsTable casing bug.
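
The ref-assignment rule from the first bullet can be sketched in Python. This is only an illustrative analog of the Java ComponentIdentifier, not the PR's code: `weakref.WeakKeyDictionary` stands in for `WeakHashMap`, and `FakeComponent`, `ref_for`, `resolve`, and the `e<N>` handle spelling are assumptions made for the example. The ref-eligibility filter (labels and intermediate containers are skipped) is left out.

```python
import itertools
import weakref


class FakeComponent:
    """Stand-in for a live AWT Component (hypothetical, for illustration)."""

    def __init__(self, name: str) -> None:
        self.name = name


class ComponentIdentifier:
    """Session-scoped ref allocator keyed weakly on the component object."""

    def __init__(self) -> None:
        # Weak keys: an entry disappears once its component is collected.
        self._refs = weakref.WeakKeyDictionary()
        self._counter = itertools.count(1)

    def ref_for(self, component: FakeComponent) -> str:
        """Return the existing ref for a component, or allocate the next one."""
        ref = self._refs.get(component)
        if ref is None:
            ref = f"e{next(self._counter)}"  # assumed handle spelling
            self._refs[component] = ref
        return ref

    def resolve(self, ref: str) -> FakeComponent:
        """Look a ref up again; an unknown ref is an error, never a guess."""
        for component, known in self._refs.items():
            if known == ref:
                return component
        raise LookupError(f"ref {ref} not found")
```

Because the counter only advances when a new component is seen, a component keeps its handle across snapshots, and a handle that no longer maps to a live component fails loudly instead of resolving to a neighbour.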

Scope: plugin layer only (Python copilotj/plugin/awt + Java
plugin/.../awt + API/CLI). The leader/agent tool layer is unchanged;
call_action stays dormant.

Java compiles; 160 Python tests pass (7 new format tests); ruff clean.

The AWT refactor (previous commit) replaced the integer action-id model
(snapshot_id + action_id indexing a top-level Snapshot.actions[]) with a
Playwright-MCP-style ref model: actions live per-component
(component.actions[]) and are addressed by the component's stable ref handle
plus the action's short id (Action.shortId(), e.g. "click", "setState").
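
As a rough sketch of the addressing scheme described above, the lookup goes ref first, then short id on that component. The `Component` dataclass, `ActionError`, the dict-shaped snapshot, and the handler signature are all assumptions for illustration; only the `call_action(ref, action, params)` shape and the "not found" wording come from the commit messages.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Component:
    """Hypothetical snapshot entry: one ref plus its own actions."""

    ref: str
    # short id -> handler; each handler receives the positional params list
    actions: Dict[str, Callable[[List[Any]], Any]] = field(default_factory=dict)


class ActionError(Exception):
    """Raised when a ref or an action short id cannot be resolved."""


def call_action(
    snapshot: Dict[str, Component],
    ref: str,
    action: str,
    params: Optional[List[Any]] = None,
) -> Any:
    """Dispatch by ref, then by the action's short id on that component."""
    component = snapshot.get(ref)
    if component is None:
        raise ActionError(f"Ref {ref} not found")
    handler = component.actions.get(action)
    if handler is None:
        known = ", ".join(sorted(component.actions))
        raise ActionError(f"Action {action!r} not available on {ref} (known: {known})")
    return handler(list(params or []))
```

Keeping the actions on the component means there is no top-level index to fall out of sync: a request either names a ref that exists in the latest snapshot or fails before any handler runs.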

EventHandler dispatches run_action by deserializing the payload into
SnapshotManager.ActionRequest via Jackson, so the old MCP tool silently sent
snapshot_id/action_id that mapped to nothing -> null ref/action -> runtime
"Ref null not found" error. No compile error, no git conflict.

- CallActionTool: inputSchema + payload now use ref (string) + action (string);
  description updated to the ref/short-id model.
- TakeSnapshotTool: description notes per-component actions + ref handles.
- mcp_smoke_test: find_action_ref walks windows[].children[] to resolve a
  (ref, action_short_id); call_action invoked with ref/action.
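
The walk that `find_action_ref` performs over `windows[].children[]` might look roughly like this. The JSON field names (`ref`, `actions`, `id`, `children`) and the helper `walk_components` are assumptions for the sketch; the real snapshot schema is not shown in this PR.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

Node = Dict[str, Any]


def walk_components(snapshot: Node) -> Iterator[Node]:
    """Yield every window and descendant component, depth first."""
    stack = list(reversed(snapshot.get("windows", [])))
    while stack:
        node = stack.pop()
        yield node
        # Reverse so the first child is visited first.
        stack.extend(reversed(node.get("children", [])))


def find_action_ref(snapshot: Node, action_short_id: str) -> Optional[Tuple[str, str]]:
    """Return (ref, action_short_id) for the first component offering the action."""
    for node in walk_components(snapshot):
        ref = node.get("ref")
        if ref is None:
            continue  # nodes without a ref cannot be targeted
        for action in node.get("actions", []):
            if action.get("id") == action_short_id:
                return ref, action_short_id
    return None
```

The returned pair can be passed straight to `call_action` as its `ref` and `action` arguments.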
yzx9 added 3 commits June 23, 2026 12:37
…e tests

MCP tool metadata (the surface clients actually consume):
- CallActionTool/TakeSnapshotTool descriptions now teach the
  snapshot->ref->action->re-snapshot loop; the `parameters` schema field
  explains positional args instead of the bare "Action parameters".
- McpModule.instructions mentions the UI loop.

Smoke test (scripts/mcp_smoke_test.py):
- Shared _walk_components walker (DRY) with window-title/label scoping.
- New checks: ref stability, stale-ref (close window -> "not found"),
  round-trip (flip/verify/restore), unknown-action error, plus tool-metadata
  assertions that catch a description regression.
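
A ref-stability and stale-ref check of the kind listed above could be written against two snapshot dicts, as in this sketch. The field names (`name`, `role`, `ref`, `children`) and the helpers `collect_refs`, `check_ref_stability`, and `stale_refs` are hypothetical; the actual smoke test talks to a running Fiji and may be structured differently.

```python
from typing import Any, Dict, Iterator, Set, Tuple

Node = Dict[str, Any]
Key = Tuple[str, str, str]  # (window title, role, name)


def _walk(node: Node) -> Iterator[Node]:
    """Yield a node and all of its descendants."""
    yield node
    for child in node.get("children", []):
        yield from _walk(child)


def collect_refs(snapshot: Node) -> Dict[Key, str]:
    """Map each ref-carrying component to its ref, scoped by window title."""
    refs: Dict[Key, str] = {}
    for window in snapshot.get("windows", []):
        title = window.get("name", "")
        for node in _walk(window):
            if "ref" in node:
                refs[(title, node.get("role", ""), node.get("name", ""))] = node["ref"]
    return refs


def check_ref_stability(before: Node, after: Node) -> None:
    """Components present in both snapshots must keep the same ref."""
    old, new = collect_refs(before), collect_refs(after)
    moved = {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}
    assert not moved, f"refs changed between snapshots: {moved}"


def stale_refs(before: Node, after: Node) -> Set[str]:
    """Refs that existed before but are gone now, e.g. after closing a window."""
    return set(collect_refs(before).values()) - set(collect_refs(after).values())
```

Every ref returned by `stale_refs` is a candidate for the stale-ref check: invoking an action on it should produce the "not found" error rather than hit another widget.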

CI tests (no Fiji needed):
- test_mcp_smoke_helpers.py: fixture-JSON coverage for the walker/collectors.
- test_action_response.py: locks in the TypedActionResponse contract.

Fix call_action response-type regression (from /review + Codex):
- Action.Response carries the fully-qualified action type again
  (ComponentNode.getActions() + Snapshot.resolveActionType), not the short
  id, so the Python bridge's TypedActionResponse validates once more.
- TypedActionResponse gains the missing ScrollbarSetValueResponse and
  ListSelectResponse branches.
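
To illustrate why the response must carry the fully-qualified type rather than the short id, here is a minimal stdlib sketch of a discriminated union keyed on `type`. It is not the bridge's TypedActionResponse: the type strings, the payload fields (`value`, `index`), and `parse_action_response` are invented for the example, and only the two response class names appear in this PR.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type, Union

# Hypothetical fully-qualified type strings; the real values are not shown here.
SCROLLBAR_SET_VALUE = "example.awt.ScrollbarNode.SetValue"
LIST_SELECT = "example.awt.ListNode.Select"


@dataclass(frozen=True)
class ScrollbarSetValueResponse:
    type: str
    value: int  # assumed field


@dataclass(frozen=True)
class ListSelectResponse:
    type: str
    index: int  # assumed field


TypedResponse = Union[ScrollbarSetValueResponse, ListSelectResponse]

_BRANCHES: Dict[str, Type[Any]] = {
    SCROLLBAR_SET_VALUE: ScrollbarSetValueResponse,
    LIST_SELECT: ListSelectResponse,
}


def parse_action_response(payload: Dict[str, Any]) -> TypedResponse:
    """Pick the response class from the fully-qualified ``type`` discriminator."""
    kind = payload.get("type")
    branch = _BRANCHES.get(kind)
    if branch is None:
        # A short id such as "setValue" lands here: it is not a known discriminator.
        raise ValueError(f"unknown action response type: {kind!r}")
    return branch(**payload)
```

With this shape, a response tagged with a short id fails validation immediately, which matches the regression described above, and a missing branch shows up as the same error until it is added to the table.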

Fix null-param NPE in ListNode/ChoiceNode runAction error messages.

AWT EDT threading violation (off-EDT mutation across run_action/run_macro/
run_script/capture) deferred to a holistic pass; RFC at
.plans/0003-awt-edt-threading.md.

fiji://windows and fiji://environment returned the same JSON as the existing
take_snapshot and fiji_environment tools, so the resource layer is redundant.
Remove WindowsResource/EnvironmentResource and their McpModule registration;
the smoke test now expects no resources. take_snapshot still returns JSON in
this step. MCP surface is now 9 tools, 0 resources, 2 prompts.

Captures the deferred design for making the Java MCP take_snapshot emit the
same compact YAML-like text tree the Python path produces via
Response._describe(), instead of raw JSON. Covers the SnapshotFormatter
design, the MCP-only rendering seam (EventHandler keeps JSON for the bridge),
the exact YAML grammar mirrored from test_awt_snapshot.py, the smoke-test
YAML-parsing consequence, two-phase window flattening, and JUnit 5
verification. Status: RFC deferred.
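
For orientation, the text tree in question follows the `- role "name" [ref] (actions): state` line shape with 2-space indentation described in the first commit. A minimal Python sketch of such a renderer is below. It is not Response._describe() or the planned SnapshotFormatter: the node field names, the rule that absent parts are simply omitted, and the sample data are assumptions, and the ref token is passed through verbatim rather than formatted.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]


def format_node(node: Node, depth: int = 0) -> List[str]:
    """Render one node and its children as indented ``- role "name" ...`` lines."""
    parts = [f"- {node['role']}"]
    if node.get("name"):
        parts.append('"' + node["name"] + '"')
    if node.get("ref"):
        parts.append("[" + node["ref"] + "]")  # ref token passed through verbatim
    if node.get("actions"):
        parts.append("(" + ", ".join(node["actions"]) + ")")
    line = "  " * depth + " ".join(parts)
    if node.get("state"):
        line += ": " + node["state"]
    lines = [line]
    for child in node.get("children", []):
        lines.extend(format_node(child, depth + 1))
    return lines


def format_snapshot(windows: List[Node]) -> str:
    """Render all windows, one tree after another."""
    return "\n".join(line for window in windows for line in format_node(window))


# Made-up sample data, only to show the resulting layout.
sample = [
    {
        "role": "window",
        "name": "Options",
        "ref": "e1",
        "children": [
            {"role": "label", "name": "Mode"},
            {"role": "checkbox", "name": "Invert", "ref": "e2",
             "actions": ["setState"], "state": "checked"},
        ],
    }
]
rendered = format_snapshot(sample)
```

For the sample above, `rendered` contains three lines: the window, then the label and the checkbox indented by two spaces, with the checkbox line ending in `(setState): checked`.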
yzx9 added the enhancement, javascript, and java labels and removed the javascript label Jun 24, 2026
Labels

enhancement, java
