browser_type and browser_click inside a canvas. #161

@anishsane

I ran into this while viewing a VNC session rendered inside a canvas element.

Because the entire VNC display is drawn into a single canvas element, this MCP cannot click the individual controls shown inside it. The same problem occurs when an image tag has <area> tags defined for it: we cannot click at a specific location within the image.

Since browser_click is element-based, could it support triggering a mouse click at specific coordinates within an element?
I tried VNC-mcp, which supports this, but it did not handle image scaling correctly.
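
To make the scaling concern concrete, here is a rough sketch of the coordinate mapping I have in mind. It assumes the server drives the browser through something like Playwright (an assumption on my part), and `canvasToPagePoint`, `Box`, `Size` and `Point` are hypothetical names, not part of any existing tool. It converts a point in the canvas's intrinsic pixel space (the remote framebuffer) into page coordinates using the element's on-screen bounding box.

```typescript
// Hypothetical helper types; none of these names come from an existing MCP tool.
interface Box { x: number; y: number; width: number; height: number; }
interface Size { width: number; height: number; }
interface Point { x: number; y: number; }

/**
 * Map a point given in the canvas's intrinsic pixel space (e.g. the remote
 * VNC framebuffer) to page coordinates, accounting for CSS scaling.
 *
 * box       - on-screen bounding box of the canvas element
 * intrinsic - the canvas's own width/height attributes
 * point     - target location in intrinsic pixels
 */
function canvasToPagePoint(box: Box, intrinsic: Size, point: Point): Point {
  if (intrinsic.width <= 0 || intrinsic.height <= 0) {
    throw new RangeError("intrinsic size must be positive");
  }
  if (point.x < 0 || point.y < 0 || point.x > intrinsic.width || point.y > intrinsic.height) {
    throw new RangeError("point lies outside the canvas");
  }
  // The element may be displayed larger or smaller than its intrinsic size.
  const scaleX = box.width / intrinsic.width;
  const scaleY = box.height / intrinsic.height;
  return { x: box.x + point.x * scaleX, y: box.y + point.y * scaleY };
}

// Possible use with Playwright (assumed backend, shown as comments only):
//   const canvas = page.locator("canvas");
//   const box = await canvas.boundingBox();
//   const intrinsic = await canvas.evaluate(
//     (c: HTMLCanvasElement) => ({ width: c.width, height: c.height }));
//   const p = canvasToPagePoint(box!, intrinsic, { x: 640, y: 360 });
//   await page.mouse.click(p.x, p.y);
```

The same mapping would work for an image with <area> regions if the image's natural size is used as the intrinsic size.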

Similarly, the canvas element does not support typing directly, so the AI agent falls back to individual press_key operations, which are very slow.
I am not sure whether this can be optimized to send the string faster. Perhaps the MCP server could receive the whole string at once and perform the press_key calls internally.
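
A minimal sketch of that idea, again assuming a Playwright-style keyboard object on the server side: the agent sends one string, and the server expands it into key presses itself, so there is a single tool call instead of one per character. `textToKeys`, `typeText` and `KeyPresser` are hypothetical names used only for illustration.

```typescript
// Minimal structural type. Playwright's `page.keyboard` has a compatible
// `press` method, but any object with this shape would do (assumption).
interface KeyPresser {
  press(key: string): Promise<void>;
}

// Characters that need a named key instead of a literal character.
const SPECIAL_KEYS: Record<string, string> = { "\n": "Enter", "\t": "Tab" };

/** Expand a string into the sequence of key names to press. */
function textToKeys(text: string): string[] {
  // Treat Windows line endings as a single Enter.
  const normalized = text.replace(/\r\n/g, "\n");
  return Array.from(normalized, (ch) => {
    const named = SPECIAL_KEYS[ch];
    return named !== undefined ? named : ch;
  });
}

/** Press every key for `text` in order, inside one server-side call. */
async function typeText(keyboard: KeyPresser, text: string): Promise<void> {
  for (const key of textToKeys(text)) {
    await keyboard.press(key);
  }
}

// Possible use (assumed backend, shown as comments only):
//   await page.locator("canvas").click();   // focus the canvas first
//   await typeText(page.keyboard, "ls -la\n");
```

If the backend is Playwright, its built-in `keyboard.type(text)` already sends per-character key events and might be enough on its own; I have not checked whether a VNC client in a canvas accepts those events for every character.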
