Qirabot Python SDK

Official Python SDK for Qirabot, an AI-powered device automation platform.

Automate mobile and web devices with natural language or structured actions. Let AI see the screen, click, type, extract data, and verify results.

Installation

pip install qirabot

Requires Python 3.10+.

Configuration

Sign up at qirabot.com and get your API key from the dashboard.
Set it as an environment variable:

export QIRA_API_KEY="qk_your_api_key"

import os
from qirabot import Qirabot

bot = Qirabot(os.environ["QIRA_API_KEY"])
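
Indexing `os.environ` directly raises a bare `KeyError` when the variable is unset. A minimal sketch of a friendlier lookup follows. `load_api_key` is a hypothetical helper, not part of the SDK, and it checks only that the variable is present and non-empty.

```python
import os
from collections.abc import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Qirabot API key from the environment or fail with a clear message."""
    key = env.get("QIRA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "QIRA_API_KEY is not set; export it before creating the client"
        )
    return key
```

The result can then be passed to the client as `Qirabot(load_api_key())`.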

Two Modes

The SDK offers two execution modes:

| | Interactive Mode | Submit Mode |
|---|---|---|
| How | `bot.tasks.session()` | `bot.tasks.submit()` |
| Connection | WebSocket: real-time step events | REST: fire and poll |
| Control | Run Python logic between steps | Define all actions upfront |
| Best for | Conditional workflows, debugging, data pipelines | Background jobs, CI/CD, simple automations |

Quick Start

Interactive Mode

Step-by-step control with real-time feedback via WebSocket:

import os
from qirabot import Qirabot

bot = Qirabot(os.environ["QIRA_API_KEY"])

with bot.tasks.session("device-id", name="wiki-extract") as s:
    s.navigate("https://en.wikipedia.org")
    s.type_text("Search Wikipedia input", "Artificial intelligence")
    s.click("Search button")

    if s.wait_for("Wikipedia article page is visible", timeout_ms=10000):
        summary = s.extract("Get the first paragraph of the article")
        print(f"Summary: {summary}")
    else:
        print("Page did not load in time")

Submit Mode

Fire-and-forget execution with polling:

import os
from qirabot import Qirabot, Action

bot = Qirabot(os.environ["QIRA_API_KEY"])

# Single AI instruction — let AI handle the entire workflow
task_id = bot.tasks.submit(
    "device-id",
    name="hn-top-stories",
    instruction="Go to news.ycombinator.com, extract the top 3 story titles and their scores",
)
result = bot.tasks.wait(task_id, timeout=120)
print(f"Status: {result.status}, Steps: {len(result.steps)}")

# Composed actions — precise control over each step
task_id = bot.tasks.submit("device-id", name="github-trending", actions=[
    Action.navigate("https://github.com/trending"),
    Action.wait_for("Trending repositories page is loaded"),
    Action.extract("Get the names and descriptions of the top 5 trending repositories", variable="repos"),
    Action.take_screenshot(),
])
result = bot.tasks.wait(task_id)
for step in result.steps:
    if step.output:
        print(step.output)
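
The loop above prints each step output as it goes. To keep the outputs for later processing, they can be gathered into a list. This sketch assumes only what the example above shows, namely that each step exposes an `output` attribute that may be empty. `collect_outputs` is a hypothetical helper, and the stand-in steps are made up for illustration.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_outputs(steps: Iterable[Any]) -> list[str]:
    """Return the non-empty outputs of the given steps, in order."""
    return [step.output for step in steps if getattr(step, "output", None)]


# Stand-in steps for illustration; real ones come from bot.tasks.wait(task_id).steps
demo_steps = [
    SimpleNamespace(output=None),
    SimpleNamespace(output="repo-a: first description"),
    SimpleNamespace(output=""),
    SimpleNamespace(output="repo-b: second description"),
]
print(collect_outputs(demo_steps))
```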

Screenshot Mode

Use `screenshot_mode` to control how screenshots are stored and returned:

| Mode | Description |
|---|---|
| `"cloud"` | Store to cloud, return URL path (default) |
| `"inline"` | No cloud storage, return binary data via WebSocket |
| `"none"` | No screenshots stored or returned |

# Submit mode
task_id = bot.tasks.submit(
    "device-id",
    instruction="Open the homepage",
    screenshot_mode="inline",
)

# Interactive mode
with bot.tasks.session("device-id", screenshot_mode="none") as s:
    s.click("Login button")

Annotated Screenshots

Set `annotate=True` to overlay red crosshair markers on screenshots at the coordinates where each action was performed. This is useful for debugging and visual verification.

# Submit mode
task_id = bot.tasks.submit(
    "device-id",
    instruction="Click the login button",
    annotate=True,
)

# Interactive mode
with bot.tasks.session("device-id", annotate=True) as s:
    s.click("Login button")

Post-Action Delay

Add a delay after actions to wait for slow-loading UIs or rate-limited pages.

Task-level: set `post_action_delay_ms` to apply a delay after every action in the task:

# Submit mode — 2s delay after every action
task_id = bot.tasks.submit(
    "device-id",
    instruction="Fill out the registration form",
    post_action_delay_ms=2000,
)

# Interactive mode
with bot.tasks.session("device-id", post_action_delay_ms=1500) as s:
    s.click("Login button")
    s.type_text("Username field", "hello")

Action-level: set `wait_time_ms` to override the task-level delay for a specific action:

# Interactive mode — use the wait_time_ms keyword argument
with bot.tasks.session("device-id", post_action_delay_ms=1000) as s:
    s.click("Submit button")                          # uses task-level 1s
    s.click("Next page", wait_time_ms=5000)           # uses 5s
    s.click("Confirm button")                         # uses task-level 1s

# Submit mode — set wait_time_ms in action params
task_id = bot.tasks.submit("device-id", post_action_delay_ms=1000, actions=[
    Action.click("Submit button"),                                        # uses task-level 1s
    Action(type="click", params={"locate": "Next page", "wait_time_ms": 5000}),  # uses 5s
    Action.click("Confirm button"),                                       # uses task-level 1s
])

Priority: action-level `wait_time_ms` > task-level `post_action_delay_ms` > client local config.

Note: Delays apply only to UI interaction actions (`click`, `type_text`, `navigate`, `scroll`, etc.). Non-UI actions such as `take_screenshot` and `wait_for` skip the delay automatically, and setting `wait_time_ms` on them is silently ignored.
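
The priority rule and the note above can be modelled as a small pure function. This is a local sketch of the stated behaviour, not SDK code: `effective_delay_ms`, the `client_default_ms` parameter, and the exact membership of `NON_UI_ACTIONS` are assumptions made for illustration.

```python
from typing import Optional

# Assumed set of non-UI actions; the text names only take_screenshot and wait_for
NON_UI_ACTIONS = frozenset({"take_screenshot", "wait_for"})


def effective_delay_ms(
    action_type: str,
    wait_time_ms: Optional[int] = None,
    post_action_delay_ms: Optional[int] = None,
    client_default_ms: int = 0,
) -> int:
    """Resolve the delay applied after one action.

    Order: action-level wait_time_ms, then task-level post_action_delay_ms,
    then the client default. Non-UI actions never wait.
    """
    if action_type in NON_UI_ACTIONS:
        return 0
    if wait_time_ms is not None:
        return wait_time_ms
    if post_action_delay_ms is not None:
        return post_action_delay_ms
    return client_default_ms
```

With a task-level delay of 1000 ms, `effective_delay_ms("click", post_action_delay_ms=1000)` resolves to 1000, while adding `wait_time_ms=5000` resolves to 5000, matching the comments in the examples above.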

Download Screenshots

Download screenshots from completed tasks (available only in `"cloud"` mode):

# Download a single step screenshot
bot.tasks.screenshot(task_id, step=1, path="step_1.png")

# Get screenshot bytes without saving to file
img_bytes = bot.tasks.screenshot(task_id, step=1)

# Download all screenshots as a ZIP archive
bot.tasks.screenshots(task_id, path="task_images.zip")
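
Once the archive is on disk, the standard library can unpack it. The sketch below uses only `zipfile` and `pathlib`. `unpack_screenshots` is a hypothetical helper, and the names of the files inside the archive are not documented here, so the function simply reports whatever it finds.

```python
import zipfile
from pathlib import Path


def unpack_screenshots(zip_path: str, out_dir: str) -> list[str]:
    """Extract every file in the archive into out_dir and return the sorted member names."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        # Skip directory entries so only files are reported
        return sorted(n for n in archive.namelist() if not n.endswith("/"))
```

For example, `unpack_screenshots("task_images.zip", "task_images")` could follow the download call above.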

Device Management

List connected devices:

# List all devices
devices = bot.devices.list()
for d in devices:
    print(f"{d.name} ({d.id}): {d.platform}, online={d.online}")

# List online devices only
active = bot.devices.list_active()
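
A common next step is choosing a device to pass to `bot.tasks.session()` or `bot.tasks.submit()`. The sketch below filters on the `platform` and `online` fields printed above. `pick_online_device` is a hypothetical helper, and the platform strings in the demo data are assumptions, since the values the API returns are not listed here.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def pick_online_device(devices: Iterable[Any], platform: Optional[str] = None) -> Optional[Any]:
    """Return the first online device, optionally restricted to one platform."""
    for device in devices:
        if not device.online:
            continue
        if platform is None or device.platform == platform:
            return device
    return None


# Stand-in records for illustration; real ones come from bot.devices.list()
demo = [
    SimpleNamespace(id="d1", name="Phone A", platform="android", online=False),
    SimpleNamespace(id="d2", name="Browser B", platform="web", online=True),
]
chosen = pick_online_device(demo, platform="web")
print(chosen.id if chosen else "no device online")
```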

Sandbox Management

List, inspect, resume, and suspend cloud sandboxes:

# List all sandboxes
sandboxes = bot.sandboxes.list()
for sb in sandboxes:
    print(f"{sb.name} ({sb.id}): {sb.status}, device={sb.device_id}")

# Get sandbox status
sb = bot.sandboxes.get("sandbox-id")
print(f"Status: {sb.status}")

# Resume a suspended sandbox before running tasks
sb = bot.sandboxes.resume("sandbox-id")

# Suspend a sandbox to save resources
sb = bot.sandboxes.suspend("sandbox-id")
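
Because a suspended sandbox has to be resumed before tasks run on it, a small guard can combine `get` and `resume`. This is a sketch under stated assumptions: `resume_if_suspended` is a hypothetical helper, and the status string `"suspended"` is a guess, because the possible `status` values are not listed here.

```python
from typing import Any


def resume_if_suspended(sandboxes: Any, sandbox_id: str, suspended_status: str = "suspended") -> Any:
    """Fetch a sandbox and resume it only when its status matches suspended_status.

    `sandboxes` is expected to offer get() and resume() as bot.sandboxes does above.
    The default status string is an assumption; check the values your account returns.
    """
    sandbox = sandboxes.get(sandbox_id)
    if sandbox.status == suspended_status:
        return sandboxes.resume(sandbox_id)
    return sandbox
```

A call would look like `resume_if_suspended(bot.sandboxes, "sandbox-id")`.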

Actions

For the full list of actions and platform support, see the Actions Reference.

Events

Listen for real-time step and screenshot events in interactive mode:

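import os
from qirabot import StepEvent, ScreenshotEvent

# Create the output directory up front in case save() does not create it
os.makedirs("screenshots", exist_ok=True)

with bot.tasks.session("device-id") as s:
    # Print a one-line summary for every completed step
    s.on("step", lambda e: print(f"Step {e.number}: {e.action} -> {e.status}"))
    # Save each screenshot under screenshots/, named by step number
    s.on("screenshot", lambda e: e.save(f"screenshots/step_{e.number}.png"))

    s.click("Login button")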

StepEvent Fields

| Field | Type | Description |
|---|---|---|
| `number` | `int` | Step number |
| `action` | `str` | Action type (e.g. `click`, `type_text`, `ai_decision`) |
| `status` | `str` | `succeeded` or `failed` |
| `output` | `str \| None` | Action output (e.g. extracted text) |
| `decision` | `str \| None` | AI reasoning text |
| `error` | `str \| None` | Error message if failed |
| `params` | `dict \| None` | Action parameters (includes `x`, `y` coordinates when available) |
| `action_duration_time_ms` | `int` | Device-side action execution time |
| `step_duration_ms` | `int` | Total step wall-clock time |
| `input_tokens` | `int` | LLM input tokens consumed |
| `output_tokens` | `int` | LLM output tokens consumed |
| `thinking_tokens` | `int` | LLM thinking tokens consumed |
| `cache_read_tokens` | `int` | Prompt cache read tokens |
| `cache_write_tokens` | `int` | Prompt cache write tokens |

Computed properties:

- `total_tokens`: `input_tokens + output_tokens`
- `coordinate`: `(x, y)` tuple extracted from `params`, or `None`
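
The two computed properties can be illustrated with a small stand-in class. `StepLike` is hypothetical and models only the fields it needs. It is not the SDK's `StepEvent`, and the real implementation may differ in detail, for instance in how it handles a `params` dict that has only one of `x` and `y`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepLike:
    """Illustrative stand-in for StepEvent; only the fields used below are modelled."""
    input_tokens: int = 0
    output_tokens: int = 0
    thinking_tokens: int = 0
    params: Optional[dict] = None

    @property
    def total_tokens(self) -> int:
        # Per the description above: input plus output only
        return self.input_tokens + self.output_tokens

    @property
    def coordinate(self) -> Optional[tuple]:
        # (x, y) when both keys are present in params, otherwise None
        if self.params and "x" in self.params and "y" in self.params:
            return (self.params["x"], self.params["y"])
        return None


step = StepLike(input_tokens=120, output_tokens=30, thinking_tokens=50, params={"x": 10, "y": 20})
print(step.total_tokens, step.coordinate)  # 150 (10, 20)
```

Note that, as described, `total_tokens` does not include thinking or cache tokens, so those need to be added separately when a full count is wanted.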

ScreenshotEvent Fields

| Field | Type | Description |
|---|---|---|
| `number` | `int` | Step number this screenshot belongs to |
| `task_id` | `str` | Task ID |
| `data` | `bytes \| None` | Raw PNG bytes (inline mode), `None` for cloud mode |

Methods:

- `save(local_path)`: save the screenshot to a local file (works for both inline and cloud mode)
- `to_bytes()`: return raw PNG bytes (downloads from server in cloud mode)
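
In inline mode `data` holds the raw image bytes, so a quick sanity check is possible before writing them to disk. Every PNG file begins with the same 8-byte signature, and the sketch below tests for it. `is_png` is a hypothetical helper, not an SDK function.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data) -> bool:
    """Return True when data is a bytes-like object that starts with the PNG signature."""
    if not isinstance(data, (bytes, bytearray)):
        return False
    return bytes(data[:8]) == PNG_SIGNATURE
```

Inside a screenshot handler this could guard a write, for example `if is_png(e.data): ...`, with `None` (cloud mode) simply returning `False`.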

Tracking Coordinates and Token Usage

from qirabot import StepEvent

def on_step(event: StepEvent):
    icon = "+" if event.status == "succeeded" else "x"
    print(f"[{icon}] Step {event.number}: {event.action} ({event.step_duration_ms}ms)")
    if coord := event.coordinate:
        print(f"    coordinate: ({coord[0]}, {coord[1]})")
    if event.total_tokens > 0:
        print(f"    tokens: in={event.input_tokens} out={event.output_tokens} "
              f"thinking={event.thinking_tokens} "
              f"cache_read={event.cache_read_tokens} cache_write={event.cache_write_tokens}")

with bot.tasks.session("device-id") as s:
    s.on("step", on_step)
    s.ai("Search for 'Python SDK' and click the first result")
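
The handler above prints usage per step. To get totals for a whole session, the counters can be accumulated instead. This sketch assumes only the token fields listed in the StepEvent table. `TokenTally` is a hypothetical helper, the demo events are made up, and registering a callable object with `s.on("step", ...)` is assumed to work the same way as registering a function.

```python
from types import SimpleNamespace


class TokenTally:
    """Callable that sums token counters across step events."""

    FIELDS = ("input_tokens", "output_tokens", "thinking_tokens",
              "cache_read_tokens", "cache_write_tokens")

    def __init__(self) -> None:
        self.steps = 0
        self.totals = {name: 0 for name in self.FIELDS}

    def __call__(self, event) -> None:
        self.steps += 1
        for name in self.FIELDS:
            # Missing or None counters count as zero
            self.totals[name] += getattr(event, name, 0) or 0


tally = TokenTally()
# Stand-in events for illustration; in a session, register with s.on("step", tally)
tally(SimpleNamespace(input_tokens=100, output_tokens=20, thinking_tokens=5,
                      cache_read_tokens=0, cache_write_tokens=0))
tally(SimpleNamespace(input_tokens=50, output_tokens=10, thinking_tokens=0,
                      cache_read_tokens=40, cache_write_tokens=0))
print(tally.steps, tally.totals)
```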

Workflow Integration

Interactive mode keeps the device connection alive across steps. You can run your own Python code between any two device actions — read files, validate data, branch on results, write reports:

import json
import os
from qirabot import Qirabot

bot = Qirabot(os.environ["QIRA_API_KEY"])

# Test data — ready to run, no extra files needed
test_cases = [
    {"url": "https://github.com",       "expect_keyword": "GitHub"},
    {"url": "https://www.wikipedia.org", "expect_keyword": "Wikipedia"},
    {"url": "https://news.ycombinator.com", "expect_keyword": "Hacker News"},
]

results = []

with bot.tasks.session("device-id", name="heading-check") as s:
    for case in test_cases:
        # Device: navigate to the page
        s.navigate(case["url"])
        if not s.wait_for("Page has finished loading", timeout_ms=15000):
            results.append({"url": case["url"], "heading": "", "passed": False})
            print(f"FAIL {case['url']} -> page did not load")
            continue

        # Device: extract the page heading
        heading = s.extract("Get the main heading or site title text")

        # Your code: validate between steps (treat a missing result as empty text)
        passed = case["expect_keyword"].lower() in (heading or "").lower()
        results.append({"url": case["url"], "heading": heading, "passed": passed})
        print(f"{'PASS' if passed else 'FAIL'} {case['url']} -> {heading}")

        # Conditional: screenshot only on failure
        if not passed:
            s.take_screenshot(path=f"{case['expect_keyword']}_fail.png")

print(json.dumps(results, indent=2, ensure_ascii=False))
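
The `results` list built above is plain data, so a report can be derived from it without touching the device again. A minimal sketch follows. `summarize` is a hypothetical helper, and the sample entries are made up in the same shape the loop produces.

```python
def summarize(results: list[dict]) -> dict:
    """Count passed and failed entries and list the URLs that failed."""
    failed_urls = [r["url"] for r in results if not r["passed"]]
    return {
        "total": len(results),
        "passed": len(results) - len(failed_urls),
        "failed": len(failed_urls),
        "failed_urls": failed_urls,
    }


# Made-up results in the same shape as the loop above produces
sample = [
    {"url": "https://github.com", "heading": "GitHub", "passed": True},
    {"url": "https://example.com", "heading": "", "passed": False},
]
print(summarize(sample))
```

In a CI job, a non-zero `failed` count could be turned into a failing exit code with `sys.exit(1)`.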

Submit mode returns structured step results for post-processing:

import csv

task_id = bot.tasks.submit(
    "device-id",
    name="product-hunt-scrape",
    instruction="Go to producthunt.com, extract the top 5 products with their names and taglines",
)
result = bot.tasks.wait(task_id)

with open("steps.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["step", "action", "status", "duration_ms"])
    writer.writeheader()
    for step in result.steps:
        writer.writerow({
            "step": step.number,
            "action": step.action,
            "status": step.status,
            "duration_ms": step.step_duration_ms,
        })

Error Handling

from qirabot import ActionError, QirabotTimeoutError, DeviceOfflineError

try:
    with bot.tasks.session("device-id", name="error-demo") as s:
        s.navigate("https://httpstat.us/500")
        s.verify("Page shows a success message")
except ActionError as e:
    print(f"Action failed: {e}")
except DeviceOfflineError:
    print("Device is offline")
except QirabotTimeoutError:
    print("Operation timed out")
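
Some failures, such as a device that is briefly offline, may be worth retrying rather than reporting straight away. The sketch below is a generic retry wrapper with exponential backoff built on the standard library only. `retry` and its parameters are hypothetical, and which Qirabot exceptions are safe to retry is an assumption that depends on the workload.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, retrying on the given exception types.

    Waits base_delay_s, then 2x, 4x, ... between attempts and re-raises
    the last error once attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the SDK, a call might look like `retry(run_task, (DeviceOfflineError, QirabotTimeoutError))`, where `run_task` is your own function that opens a session and performs the actions.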

License

MIT
