qirabot/python-sdk

Qirabot Python SDK

The official Python SDK for Qirabot, an AI-powered device automation platform.

Automate mobile and web devices with natural language or structured actions. Let AI see the screen, click, type, extract data, and verify results.

Installation

pip install qirabot

Requires Python 3.10+.

Configuration

  1. Sign up at qirabot.com and get your API key from the dashboard.
  2. Set it as an environment variable:
export QIRA_API_KEY="qk_your_api_key"

Then read the key from the environment when you create the client:

import os
from qirabot import Qirabot

bot = Qirabot(os.environ["QIRA_API_KEY"])
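If the variable is missing, `os.environ["QIRA_API_KEY"]` raises a bare `KeyError`. A minimal sketch of a friendlier lookup follows; `load_api_key` is a hypothetical helper written for this example and is not part of the SDK.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Qirabot API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not provided by the SDK.
    """
    key = env.get("QIRA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "QIRA_API_KEY is not set; export it before creating the client"
        )
    return key
```

With this in place, the client could be created as `Qirabot(load_api_key())`.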

Two Modes

The SDK offers two execution modes:

| | Interactive Mode | Submit Mode |
| --- | --- | --- |
| How | bot.tasks.session() | bot.tasks.submit() |
| Connection | WebSocket: real-time step events | REST: fire and poll |
| Control | Run Python logic between steps | Define all actions upfront |
| Best for | Conditional workflows, debugging, data pipelines | Background jobs, CI/CD, simple automations |

Quick Start

Interactive Mode

Step-by-step control with real-time feedback via WebSocket:

import os
from qirabot import Qirabot

bot = Qirabot(os.environ["QIRA_API_KEY"])

with bot.tasks.session("device-id", name="wiki-extract") as s:
    s.navigate("https://en.wikipedia.org")
    s.type_text("Search Wikipedia input", "Artificial intelligence")
    s.click("Search button")

    if s.wait_for("Wikipedia article page is visible", timeout_ms=10000):
        summary = s.extract("Get the first paragraph of the article")
        print(f"Summary: {summary}")
    else:
        print("Page did not load in time")

Submit Mode

Fire-and-forget execution with polling:

import os
from qirabot import Qirabot, Action

bot = Qirabot(os.environ["QIRA_API_KEY"])

# Single AI instruction — let AI handle the entire workflow
task_id = bot.tasks.submit(
    "device-id",
    name="hn-top-stories",
    instruction="Go to news.ycombinator.com, extract the top 3 story titles and their scores",
)
result = bot.tasks.wait(task_id, timeout=120)
print(f"Status: {result.status}, Steps: {len(result.steps)}")

# Composed actions — precise control over each step
task_id = bot.tasks.submit("device-id", name="github-trending", actions=[
    Action.navigate("https://github.com/trending"),
    Action.wait_for("Trending repositories page is loaded"),
    Action.extract("Get the names and descriptions of the top 5 trending repositories", variable="repos"),
    Action.take_screenshot(),
])
result = bot.tasks.wait(task_id)
for step in result.steps:
    if step.output:
        print(step.output)

Screenshot Mode

Control how screenshots are stored and returned via screenshot_mode:

| Mode | Description |
| --- | --- |
| "cloud" | Store to cloud, return URL path (default) |
| "inline" | No cloud storage, return binary data via WebSocket |
| "none" | No screenshots stored or returned |

# Submit mode
task_id = bot.tasks.submit(
    "device-id",
    instruction="Open the homepage",
    screenshot_mode="inline",
)

# Interactive mode
with bot.tasks.session("device-id", screenshot_mode="none") as s:
    s.click("Login button")

Annotated Screenshots

Set annotate=True to overlay red crosshair markers on screenshots at the coordinates where each action was performed. Useful for debugging and visual verification.

# Submit mode
task_id = bot.tasks.submit(
    "device-id",
    instruction="Click the login button",
    annotate=True,
)

# Interactive mode
with bot.tasks.session("device-id", annotate=True) as s:
    s.click("Login button")

Post-Action Delay

Add a delay after actions to wait for slow-loading UIs or rate-limited pages.

Task-level — applies to all actions in the task via post_action_delay_ms:

# Submit mode — 2s delay after every action
task_id = bot.tasks.submit(
    "device-id",
    instruction="Fill out the registration form",
    post_action_delay_ms=2000,
)

# Interactive mode
with bot.tasks.session("device-id", post_action_delay_ms=1500) as s:
    s.click("Login button")
    s.type_text("Username field", "hello")

Action-level — override the task-level delay for a specific action:

# Interactive mode — use the wait_time_ms keyword argument
with bot.tasks.session("device-id", post_action_delay_ms=1000) as s:
    s.click("Submit button")                          # uses task-level 1s
    s.click("Next page", wait_time_ms=5000)           # uses 5s
    s.click("Confirm button")                         # uses task-level 1s

# Submit mode — set wait_time_ms in action params
task_id = bot.tasks.submit("device-id", post_action_delay_ms=1000, actions=[
    Action.click("Submit button"),                                        # uses task-level 1s
    Action(type="click", params={"locate": "Next page", "wait_time_ms": 5000}),  # uses 5s
    Action.click("Confirm button"),                                       # uses task-level 1s
])

Priority: action-level wait_time_ms > task-level post_action_delay_ms > client local config.

Note: Delays apply only to UI interaction actions (click, type_text, navigate, scroll, etc.). Non-UI actions such as take_screenshot and wait_for skip the delay automatically; setting wait_time_ms on them is silently ignored.
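The priority rule and the non-UI exception can be modeled as a small pure function. This is only a sketch of the documented behavior, not the SDK's implementation: `effective_delay_ms` and `NON_UI_ACTIONS` are hypothetical names, and the set lists only the two non-UI actions named above.

```python
from typing import Optional

# Assumption: only the two non-UI actions named in the note are listed here;
# the README says "etc.", so the real list is longer.
NON_UI_ACTIONS = {"take_screenshot", "wait_for"}


def effective_delay_ms(
    action_type: str,
    wait_time_ms: Optional[int] = None,
    post_action_delay_ms: Optional[int] = None,
    client_default_ms: int = 0,
) -> int:
    """Resolve the delay applied after one action (illustrative only)."""
    if action_type in NON_UI_ACTIONS:
        return 0  # non-UI actions skip the delay, even if wait_time_ms is set
    if wait_time_ms is not None:
        return wait_time_ms  # action-level override wins
    if post_action_delay_ms is not None:
        return post_action_delay_ms  # task-level setting
    return client_default_ms  # fall back to the client's local config


print(effective_delay_ms("click", post_action_delay_ms=1000))                     # 1000
print(effective_delay_ms("click", wait_time_ms=5000, post_action_delay_ms=1000))  # 5000
print(effective_delay_ms("take_screenshot", wait_time_ms=5000))                   # 0
```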

Download Screenshots

Download screenshots from completed tasks (only available with "cloud" mode):

# Download a single step screenshot
bot.tasks.screenshot(task_id, step=1, path="step_1.png")

# Get screenshot bytes without saving to file
img_bytes = bot.tasks.screenshot(task_id, step=1)

# Download all screenshots as a ZIP archive
bot.tasks.screenshots(task_id, path="task_images.zip")

Device Management

List connected devices:

# List all devices
devices = bot.devices.list()
for d in devices:
    print(f"{d.name} ({d.id}): {d.platform}, online={d.online}")

# List online devices only
active = bot.devices.list_active()
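Rather than hard-coding "device-id", a script might pick a device from the list. A minimal sketch, assuming only the id, platform, and online fields shown above; `pick_device_id` is a hypothetical helper, and the platform string "android" in the demo is an assumed value.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def pick_device_id(devices: Iterable, platform: Optional[str] = None) -> str:
    """Return the id of the first online device, optionally filtered by platform."""
    for d in devices:
        if d.online and (platform is None or d.platform == platform):
            return d.id
    raise LookupError(f"no online device found (platform={platform!r})")


# Stand-in objects so the sketch runs without an account.
demo_devices = [
    SimpleNamespace(id="dev-1", platform="web", online=False),
    SimpleNamespace(id="dev-2", platform="android", online=True),
]
print(pick_device_id(demo_devices))  # dev-2
```

In a real script the list would come from `bot.devices.list()`.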

Sandbox Management

List, inspect, resume, and suspend cloud sandboxes:

# List all sandboxes
sandboxes = bot.sandboxes.list()
for sb in sandboxes:
    print(f"{sb.name} ({sb.id}): {sb.status}, device={sb.device_id}")

# Get sandbox status
sb = bot.sandboxes.get("sandbox-id")
print(f"Status: {sb.status}")

# Resume a suspended sandbox before running tasks
sb = bot.sandboxes.resume("sandbox-id")

# Suspend a sandbox to save resources
sb = bot.sandboxes.suspend("sandbox-id")
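A script that runs tasks on a sandbox may want to resume it first and wait until it is ready. The sketch below uses only the get() and resume() calls shown above. `ensure_running` is a hypothetical helper, and the status string "running" is a guess, since this README does not list the possible status values. It also assumes that resume() can return before the sandbox is ready.

```python
import time


def ensure_running(sandboxes, sandbox_id, running_status="running",
                   timeout_s=60.0, poll_s=2.0):
    """Resume a sandbox if needed and wait until it reports running_status.

    `sandboxes` is any object with get() and resume(), e.g. bot.sandboxes.
    Assumption: "running" is a guessed status string; check sb.status on a
    live sandbox and pass the real value.
    """
    sb = sandboxes.get(sandbox_id)
    if sb.status == running_status:
        return sb
    sb = sandboxes.resume(sandbox_id)
    deadline = time.monotonic() + timeout_s
    while sb.status != running_status:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"sandbox {sandbox_id} is still {sb.status!r}")
        time.sleep(poll_s)  # poll get() until the status changes
        sb = sandboxes.get(sandbox_id)
    return sb
```

Usage would look like `ensure_running(bot.sandboxes, "sandbox-id")` before opening a session.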

Actions

For the full list of actions and platform support, see the Actions Reference.

Events

Listen for real-time step and screenshot events in interactive mode:

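import os
from qirabot import StepEvent, ScreenshotEvent

# Make sure the target folder exists before the first screenshot arrives
os.makedirs("screenshots", exist_ok=True)

with bot.tasks.session("device-id") as s:
    s.on("step", lambda e: print(f"Step {e.number}: {e.action} -> {e.status}"))
    s.on("screenshot", lambda e: e.save(f"screenshots/step_{e.number}.png"))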

    s.click("Login button")

StepEvent Fields

| Field | Type | Description |
| --- | --- | --- |
| number | int | Step number |
| action | str | Action type (e.g. click, type_text, ai_decision) |
| status | str | succeeded or failed |
| output | str \| None | Action output (e.g. extracted text) |
| decision | str \| None | AI reasoning text |
| error | str \| None | Error message if failed |
| params | dict \| None | Action parameters (includes x, y coordinates when available) |
| action_duration_time_ms | int | Device-side action execution time |
| step_duration_ms | int | Total step wall-clock time |
| input_tokens | int | LLM input tokens consumed |
| output_tokens | int | LLM output tokens consumed |
| thinking_tokens | int | LLM thinking tokens consumed |
| cache_read_tokens | int | Prompt cache read tokens |
| cache_write_tokens | int | Prompt cache write tokens |

Computed properties:

  • total_tokens: input_tokens + output_tokens
  • coordinate: (x, y) tuple extracted from params, or None
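To make the two computed properties concrete, here is a minimal stand-in class. `StepStats` is hypothetical and is not the SDK's StepEvent. It assumes that the coordinates sit under the "x" and "y" keys of params. Following the definition above, total_tokens excludes thinking and cache tokens.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StepStats:
    """Hypothetical stand-in for the StepEvent fields used below."""
    input_tokens: int = 0
    output_tokens: int = 0
    params: Optional[dict] = None

    @property
    def total_tokens(self) -> int:
        # Documented definition: input plus output tokens only.
        return self.input_tokens + self.output_tokens

    @property
    def coordinate(self) -> Optional[Tuple[int, int]]:
        # Assumption: coordinates live under the "x" and "y" keys of params.
        if not self.params or "x" not in self.params or "y" not in self.params:
            return None
        return (self.params["x"], self.params["y"])


print(StepStats(input_tokens=10, output_tokens=5).total_tokens)  # 15
print(StepStats(params={"x": 3, "y": 4}).coordinate)             # (3, 4)
print(StepStats(params={"locate": "Login button"}).coordinate)   # None
```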

ScreenshotEvent Fields

| Field | Type | Description |
| --- | --- | --- |
| number | int | Step number this screenshot belongs to |
| task_id | str | Task ID |
| data | bytes \| None | Raw PNG bytes (inline mode), None for cloud mode |

Methods:

  • save(local_path) — save the screenshot to a local file (works for both inline and cloud mode)
  • to_bytes() — return raw PNG bytes (downloads from server in cloud mode)

Tracking Coordinates and Token Usage

from qirabot import StepEvent

def on_step(event: StepEvent):
    icon = "+" if event.status == "succeeded" else "x"
    print(f"[{icon}] Step {event.number}: {event.action} ({event.step_duration_ms}ms)")
    if coord := event.coordinate:
        print(f"    coordinate: ({coord[0]}, {coord[1]})")
    if event.total_tokens > 0:
        print(f"    tokens: in={event.input_tokens} out={event.output_tokens} "
              f"thinking={event.thinking_tokens} "
              f"cache_read={event.cache_read_tokens} cache_write={event.cache_write_tokens}")

with bot.tasks.session("device-id") as s:
    s.on("step", on_step)
    s.ai("Search for 'Python SDK' and click the first result")
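Per-step logging can be complemented by a running total for the whole session. The sketch below defines a callable, hypothetically named `TokenTally`, that sums the documented token fields; it is exercised with stand-in objects, so it runs without a device.

```python
from collections import Counter
from types import SimpleNamespace

TOKEN_FIELDS = (
    "input_tokens", "output_tokens", "thinking_tokens",
    "cache_read_tokens", "cache_write_tokens",
)


class TokenTally:
    """Hypothetical helper: sums token counters across step events."""

    def __init__(self) -> None:
        self.totals = Counter()
        self.steps = 0

    def __call__(self, event) -> None:
        # Works with any object exposing the documented StepEvent fields.
        self.steps += 1
        for field in TOKEN_FIELDS:
            self.totals[field] += getattr(event, field, 0) or 0


# Stand-in events in place of real StepEvent objects.
tally = TokenTally()
tally(SimpleNamespace(input_tokens=120, output_tokens=30, thinking_tokens=0,
                      cache_read_tokens=64, cache_write_tokens=0))
tally(SimpleNamespace(input_tokens=80, output_tokens=20, thinking_tokens=10,
                      cache_read_tokens=0, cache_write_tokens=16))
print(tally.steps, dict(tally.totals))
```

Inside a session, the same object could be registered with `s.on("step", tally)` and inspected after the `with` block exits.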

Workflow Integration

Interactive mode keeps the device connection alive across steps. You can run your own Python code between any two device actions — read files, validate data, branch on results, write reports:

import json
import os
from qirabot import Qirabot

bot = Qirabot(os.environ["QIRA_API_KEY"])

# Test data — ready to run, no extra files needed
test_cases = [
    {"url": "https://github.com",       "expect_keyword": "GitHub"},
    {"url": "https://www.wikipedia.org", "expect_keyword": "Wikipedia"},
    {"url": "https://news.ycombinator.com", "expect_keyword": "Hacker News"},
]

results = []

with bot.tasks.session("device-id", name="heading-check") as s:
    for case in test_cases:
        # Device: navigate to the page
        s.navigate(case["url"])
        if not s.wait_for("Page has finished loading", timeout_ms=15000):
            results.append({"url": case["url"], "heading": "", "passed": False})
            print(f"FAIL {case['url']} -> page did not load")
            continue

        # Device: extract the page heading (fall back to "" if nothing is returned)
        heading = s.extract("Get the main heading or site title text") or ""

        # Your code: validate between steps
        passed = case["expect_keyword"].lower() in heading.lower()
        results.append({"url": case["url"], "heading": heading, "passed": passed})
        print(f"{'PASS' if passed else 'FAIL'} {case['url']} -> {heading}")

        # Conditional: screenshot only on failure
        if not passed:
            s.take_screenshot(path=f"{case['expect_keyword']}_fail.png")

print(json.dumps(results, indent=2, ensure_ascii=False))

Submit mode returns structured step results for post-processing:

import csv

task_id = bot.tasks.submit(
    "device-id",
    name="product-hunt-scrape",
    instruction="Go to producthunt.com, extract the top 5 products with their names and taglines",
)
result = bot.tasks.wait(task_id)

with open("steps.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["step", "action", "status", "duration_ms"])
    writer.writeheader()
    for step in result.steps:
        writer.writerow({
            "step": step.number,
            "action": step.action,
            "status": step.status,
            "duration_ms": step.step_duration_ms,
        })
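Beyond a per-step CSV, a short summary is often enough for a CI log. A minimal sketch, assuming each step exposes the status and step_duration_ms fields used above; `summarize_steps` is a hypothetical helper, demonstrated with stand-in step objects.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_steps(steps) -> dict:
    """Count steps per status and add up wall-clock time (hypothetical helper)."""
    steps = list(steps)
    by_status = Counter(step.status for step in steps)
    total_ms = sum(step.step_duration_ms or 0 for step in steps)
    return {"by_status": dict(by_status), "total_duration_ms": total_ms}


# Stand-in steps in place of result.steps.
demo_steps = [
    SimpleNamespace(status="succeeded", step_duration_ms=1200),
    SimpleNamespace(status="succeeded", step_duration_ms=800),
    SimpleNamespace(status="failed", step_duration_ms=300),
]
print(summarize_steps(demo_steps))
# {'by_status': {'succeeded': 2, 'failed': 1}, 'total_duration_ms': 2300}
```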

Error Handling

from qirabot import ActionError, QirabotTimeoutError, DeviceOfflineError

try:
    with bot.tasks.session("device-id", name="error-demo") as s:
        s.navigate("https://httpstat.us/500")
        s.verify("Page shows a success message")
except ActionError as e:
    print(f"Action failed: {e}")
except DeviceOfflineError:
    print("Device is offline")
except QirabotTimeoutError:
    print("Operation timed out")
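Transient failures such as an offline device or a timeout are sometimes worth retrying. The sketch below is a generic retry wrapper with exponential backoff. `run_with_retry` is a hypothetical helper, and whether a given Qirabot error is safe to retry is an assumption to check for each workflow.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay_s: float = 1.0,
) -> T:
    """Call fn, retrying on the given exception types with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

A call might look like `run_with_retry(run_task, retry_on=(DeviceOfflineError, QirabotTimeoutError))`, where `run_task` is your own function that opens a session and performs the actions.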

License

MIT
