computer tool: schema omits half its input fields (scroll/drag/resize/select_menu/window_screenshot/wait_for/set_brightness/perform_action unusable) + element_at AppleScript reserved-word bug #350

Description

@1jehuang

Summary

Exhaustively exercising every computer tool action surfaced a class of bugs that makes a large fraction of the documented actions impossible for the model to invoke, plus one AppleScript bug. The handlers are implemented correctly; the problems are the function-calling contract (parameters_schema()) and one reserved-word collision.

Branch: feat/macos-computer-tool (running binary 0.24.3-computer).

Root cause 1 — schema omits half the input fields (blocks 8 actions)

ComputerInput deserializes many fields (dx, dy, to_x, to_y, w, h, menu_path, window_id, ax_action, contains, timeout_ms, region, level), and dispatch() requires them. But parameters_schema() uses "progressive disclosure" and only declares a subset of properties. Models emit only parameters present in the tool's JSON schema, so these fields can never be sent. Every dependent action dead-ends in its own .context("... requires X") guard:

| action | required field (undeclared) | observed error |
| --- | --- | --- |
| scroll | dx/dy | action='scroll' requires non-zero dx and/or dy |
| drag | to_x/to_y | action='drag' requires to_x and to_y |
| resize_window | w/h | resize_window requires w |
| select_menu | menu_path | select_menu requires menu_path |
| window_screenshot | window_id | window_screenshot requires window_id |
| wait_for | contains | wait_for requires contains |
| set_brightness | level | set_brightness requires level (0..1) |
| perform_action | ax_action | (unreachable) |
| ocr (region) | region | (region variant unreachable; full-screen ok) |

Progressive disclosure of action specs (the discover text) is fine, but input fields must be declared or the action is unusable.
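A regression check for this class of bug could compare the fields each action needs against the property names the schema declares. The following is a minimal sketch, not the project's code: `REQUIRED_FIELDS` and `undeclared_fields` are hypothetical names, the action/field pairs are copied from the table above, and the optional `region` field for ocr is left out for brevity.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the real dispatch requirements: each action
/// and the input fields it needs (taken from the table in this issue).
const REQUIRED_FIELDS: &[(&str, &[&str])] = &[
    ("scroll", &["dx", "dy"]),
    ("drag", &["to_x", "to_y"]),
    ("resize_window", &["w", "h"]),
    ("select_menu", &["menu_path"]),
    ("window_screenshot", &["window_id"]),
    ("wait_for", &["contains"]),
    ("set_brightness", &["level"]),
    ("perform_action", &["ax_action"]),
];

/// Returns every (action, field) pair whose field is missing from the
/// list of property names declared in the tool's JSON schema.
pub fn undeclared_fields(declared: &[&str]) -> Vec<(&'static str, &'static str)> {
    let declared: BTreeSet<&str> = declared.iter().copied().collect();
    let mut missing = Vec::new();
    for (action, fields) in REQUIRED_FIELDS {
        for field in *fields {
            if !declared.contains(*field) {
                missing.push((*action, *field));
            }
        }
    }
    missing
}
```

In a real test, the `declared` slice would be built from the keys of the `properties` object returned by parameters_schema(), and the test would assert that `undeclared_fields` returns an empty list.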

Root cause 2 — element_at uses AppleScript reserved word result

ax.rs::element_at assigns to a handler variable named result. result is reserved in AppleScript (holds the last command's value), so the script fails:

Deepest element at (1100,400) in System Settings:
(error: The variable result is not defined.)
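To keep this collision from coming back, the generated script could be scanned for assignments to `result` before it is run. The sketch below is a simple heuristic under stated assumptions: `result_assignments` is a hypothetical helper, it only catches the `set result to ...` form at the start of a line, and it ignores other forms such as `copy ... to result`.

```rust
/// Returns the 1-based line numbers on which an AppleScript source
/// assigns to `result` using the `set result to ...` form.
/// AppleScript keywords are case-insensitive, so the match is too.
pub fn result_assignments(script: &str) -> Vec<usize> {
    script
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            let lower = line.trim_start().to_ascii_lowercase();
            lower.starts_with("set result to ")
        })
        .map(|(index, _)| index + 1)
        .collect()
}
```

A unit test could run this over the script text that element_at builds and assert that the returned list is empty.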

Exhaustive test matrix (before fix)

Working: check_permissions, discover, screenshot, ocr (full screen), ui, cursor, move, click, double_click, right_click, type, key, key_down, key_up, find_element, list_apps, list_windows, activate_app, hide_app (n/a), focus_window, move_window, get_clipboard, set_clipboard, run_applescript, run_jxa, notify, system_state.

Broken: scroll, drag, resize_window, select_menu, window_screenshot, wait_for, set_brightness, perform_action, element_at (+ ocr region variant).

Fix

  1. Declare all input fields in parameters_schema() (keep action-spec text in discover). Bump the schema_is_compact bound to match the slightly larger but still small schema.
  2. Rename the element_at AppleScript local result -> bestHit.

PR incoming.

Labels: bug