AskUI Vision Agent provides three predefined agent types for different automation targets. All agents share the same core API (act(), get(), locate()) but are optimized for their respective platforms. Each agent comes with its own system prompt tailored to its platform-specific tools and capabilities.
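Because every agent exposes act(), automation logic written against that method can be reused across targets. The following is a minimal sketch rather than part of the AskUI API: `SupportsAct`, `run_steps`, and `RecordingAgent` are hypothetical names introduced here, and the stub stands in for a real agent so the example runs without a display or device. It assumes only that act() accepts the instruction as its first positional argument, as in the examples on this page.

```python
from typing import Iterable, List, Protocol


class SupportsAct(Protocol):
    """Anything with an act(goal) method, e.g. one of the agents on this page."""

    def act(self, goal: str) -> None: ...


def run_steps(agent: SupportsAct, steps: Iterable[str]) -> int:
    """Send each natural-language instruction to the agent; return how many ran."""
    count = 0
    for step in steps:
        agent.act(step)
        count += 1
    return count


class RecordingAgent:
    """Hypothetical stand-in that records instructions instead of executing them."""

    def __init__(self) -> None:
        self.goals: List[str] = []

    def act(self, goal: str) -> None:
        self.goals.append(goal)


stub = RecordingAgent()
executed = run_steps(stub, ["Open the settings page", "Enable notifications"])
print(executed, stub.goals)
```

In real use, the stub would be replaced by an agent opened in a `with` block, and `run_steps` would stay unchanged.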
Use ComputerAgent for desktop automation on Windows, macOS, and Linux. It uses AskUI Agent OS to control the mouse and keyboard and to capture screenshots.

    from askui import ComputerAgent

    # The context manager opens and closes the agent session
    with ComputerAgent() as agent:
        # Describe the goal in natural language
        agent.act("Open the mail app and summarize all unread emails")

Default tools: screenshot, mouse_click, mouse_move, mouse_scroll, mouse_hold_down, mouse_release, type, keyboard_tap, keyboard_pressed, keyboard_release, get_mouse_position, get_system_info, list_displays, retrieve_active_display, set_active_display
Use AndroidAgent to automate Android devices via ADB. It supports tapping, swiping, typing, and shell commands.

    from askui import AndroidAgent

    with AndroidAgent(device=0) as agent:
        # Direct gestures
        agent.tap("Login button")
        agent.swipe(start=(500, 1000), end=(500, 300))
        # Goal described in natural language
        agent.act("Navigate to settings and enable notifications")

Requires the android dependency (pip install askui[android]) and a connected device (physical or emulator).

Default tools: screenshot, tap, type, swipe, drag_and_drop, key_tap_event, key_combination, shell, select_device_by_serial_number, select_display_by_unique_id, get_connected_devices_serial_numbers, get_connected_displays_infos, get_current_connected_device_infos
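Since AndroidAgent needs a connected device, it can help to confirm that ADB sees one before starting a session. The sketch below is an illustration, not part of AskUI: `parse_adb_devices` and `connected_devices` are hypothetical helpers, and the serial numbers in `sample` are made up. It relies only on the standard `adb devices` command, which lists one serial number and state per line. How the `device` argument of AndroidAgent maps to this list is not stated here.

```python
import subprocess
from typing import List


def parse_adb_devices(output: str) -> List[str]:
    """Return serial numbers of devices that `adb devices` reports as 'device'."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Ready devices appear as "<serial>\tdevice"; the header line and
        # other states (e.g. "unauthorized", "offline") are skipped.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def connected_devices() -> List[str]:
    """Run `adb devices` and parse its output; requires adb on PATH."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Made-up sample output, so the parser can be tried without a device attached
sample = "List of devices attached\nemulator-5554\tdevice\nR58M12345\tunauthorized\n\n"
print(parse_adb_devices(sample))
```

An empty list from `connected_devices()` suggests that no device is ready, for example because the emulator has not started or USB debugging has not been authorized.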
Use WebVisionAgent for web browser automation using Playwright. It extends ComputerAgent with web-specific tools for navigation, URL handling, and page title retrieval.

    from askui import WebVisionAgent

    with WebVisionAgent() as agent:
        # Navigate the browser to the page
        agent.tools.os.goto("https://example.com")
        agent.click("Sign In")
        agent.act("Fill out the contact form and submit")

Default tools: All ComputerAgent tools plus goto, back, forward, get_page_title, get_page_url
| Target | Agent | Backend |
|---|---|---|
| Desktop (Windows/macOS/Linux) | ComputerAgent | AskUI Agent OS (gRPC) |
| Android devices | AndroidAgent | ADB |
| Web browsers | WebVisionAgent | Playwright |
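The mapping in the table can also be expressed in code, for example when a script receives its target from configuration. This is a rough sketch under stated assumptions: the keys "desktop", "android", and "web" and the helpers `agent_name_for` and `load_agent_class` are hypothetical. Only the class names and the `askui` import path come from the examples on this page.

```python
import importlib

# Hypothetical target keys mapped to the agent class names from the table
AGENT_BY_TARGET = {
    "desktop": "ComputerAgent",
    "android": "AndroidAgent",
    "web": "WebVisionAgent",
}


def agent_name_for(target: str) -> str:
    """Return the agent class name for a target key (case-insensitive)."""
    key = target.strip().lower()
    if key not in AGENT_BY_TARGET:
        known = ", ".join(sorted(AGENT_BY_TARGET))
        raise ValueError(f"Unknown target {target!r}; expected one of: {known}")
    return AGENT_BY_TARGET[key]


def load_agent_class(target: str):
    """Import askui lazily and return the matching agent class."""
    module = importlib.import_module("askui")
    return getattr(module, agent_name_for(target))


print(agent_name_for("Android"))
```

Importing `askui` inside `load_agent_class` keeps the name lookup usable even where the package or an optional dependency is not installed.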