
ByeFast — AI Agent Skill Card

Give this file to your LLM agent so it knows exactly how to use ByeFast.

What ByeFast Is

A headless browser engine exposed as a local HTTP API on port 8741. You use it to:

  • Load web pages and read their content as structured JSON
  • Find elements by natural-language intent (no CSS selectors needed)
  • Click, type, scroll, and interact with pages
  • Execute JavaScript in an isolated context
  • Save and restore full page state snapshots

Base URL

http://localhost:8741

Core Workflow

1. Spawn a page

POST /pages
Content-Type: application/json

{
  "url": "https://example.com",
  "idle_timeout_ms": 600000
}

Response:

{
  "page_id": "abc123...",
  "url": "https://example.com",
  "status": "spawned"
}

Save page_id. You need it for every subsequent call.
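
A minimal Python sketch of the spawn call, using only the standard library. The helper names `spawn_request` and `send` are illustrative and not part of ByeFast; the URL, port, and body fields come from the request shown above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8741"  # from the Base URL section


def spawn_request(url: str, idle_timeout_ms: int = 600_000) -> urllib.request.Request:
    """Build the POST /pages request shown above (nothing is sent yet)."""
    body = json.dumps({"url": url, "idle_timeout_ms": idle_timeout_ms}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/pages",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout_s: float = 10.0) -> dict:
    """Send a request and decode the JSON reply; needs a running ByeFast."""
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the engine running, `send(spawn_request("https://example.com"))["page_id"]` would yield the id to reuse in later calls.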


2. Load content into the page

Option A — Navigate (async, fetches URL automatically):

POST /pages/{page_id}/navigate
Content-Type: application/json

{"url": "https://example.com"}

Wait 2–3 seconds, then GET semantic-action-map (step 3). If its elements array is empty, fall back to Option B.

Option B — Inject HTML directly (synchronous, always works):

POST /pages/{page_id}/load-html
Content-Type: application/json

{"html": "<html><body><h1>Title</h1><a href='/about'>About</a></body></html>"}

Response: {"elements_parsed": 14, "status": "ok"}

Use Option B when you already have the HTML, or when navigate fails (bot-protected sites).
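
The navigate-then-fall-back logic might be sketched as below. `call` is a hypothetical transport function `(method, path, body) -> dict` that you supply (it is not a ByeFast API), and the 3-second default wait simply mirrors the guidance above.

```python
import time
from typing import Callable, Optional

# Hypothetical transport: call(method, path, body) -> decoded JSON dict.
Call = Callable[[str, str, Optional[dict]], dict]


def load_content(call: Call, page_id: str, url: str,
                 fallback_html: Optional[str] = None,
                 wait_s: float = 3.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Try Option A; fall back to Option B if the map has no elements.

    Returns "navigate", "load-html", or "empty" (nothing usable loaded).
    """
    call("POST", f"/pages/{page_id}/navigate", {"url": url})
    sleep(wait_s)  # navigate is async, so give the fetch time to finish
    page_map = call("GET", f"/pages/{page_id}/semantic-action-map", None)
    if page_map.get("elements"):
        return "navigate"
    if fallback_html is None:
        return "empty"
    call("POST", f"/pages/{page_id}/load-html", {"html": fallback_html})
    return "load-html"
```

The `sleep` parameter is injectable so the function can be exercised without real delays.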


3. Read the page — Semantic Action Map

GET /pages/{page_id}/semantic-action-map

Response:

{
  "page_id": "...",
  "url": "https://example.com",
  "title": "Example Domain",
  "elements": [
    {
      "node_id": 5,
      "tag": "button",
      "role": "button",
      "label": "Add to cart",
      "actions": ["click"],
      "disabled": false,
      "visible": true
    },
    {
      "node_id": 8,
      "tag": "input",
      "role": "textbox",
      "label": "Search",
      "current_value": "",
      "actions": ["type"],
      "disabled": false,
      "visible": true
    }
  ],
  "navigation_links": [
    {"text": "About", "href": "/about", "node_id": 12}
  ],
  "form_state": {
    "email": "",
    "password": ""
  }
}

This is your primary interface. Read this before deciding what to do next.
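
One way to consume the map in Python is to filter for elements that are visible, enabled, and support the action you want. The function names are illustrative; the field names (`elements`, `actions`, `visible`, `disabled`, `label`, `node_id`) are taken from the response above.

```python
from typing import Optional


def actionable(page_map: dict, action: str) -> list:
    """Elements that are visible, enabled, and support `action`."""
    return [
        el for el in page_map.get("elements", [])
        if action in el.get("actions", [])
        and el.get("visible", False)
        and not el.get("disabled", False)
    ]


def node_id_by_label(page_map: dict, label: str, action: str) -> Optional[int]:
    """Case-insensitive exact label match; None if nothing usable matches."""
    wanted = label.strip().lower()
    for el in actionable(page_map, action):
        if el.get("label", "").strip().lower() == wanted:
            return el["node_id"]
    return None
```

Exact label matching is a deliberately strict choice for this sketch; when it returns None, the intent-based lookup in step 4 is the fallback.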


4. Find elements by intent (no CSS needed)

POST /pages/{page_id}/vision/reindex

(Call once after loading HTML; later find calls reuse the index.)

POST /pages/{page_id}/vision/find
Content-Type: application/json

{"intent": "add to cart button"}

Response:

{
  "node_id": 5,
  "tag": "button",
  "text": "Add to Cart",
  "label": "button",
  "confidence": 0.82
}

Use the returned node_id for actions. If confidence < 0.3, the element wasn't found.

POST /pages/{page_id}/vision/find-interactive
Content-Type: application/json

{"intent": "email input field"}

Same response shape, but only interactive elements (inputs, buttons, links) are searched.
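
A small Python wrapper around the two find routes could look like this. `call` is a hypothetical transport `(method, path, body) -> dict` supplied by the caller, and treating a reply without a `confidence` field as "not found" is an assumption, since the card does not show a miss response.

```python
from typing import Callable, Optional

# Hypothetical transport: call(method, path, body) -> decoded JSON dict.
Call = Callable[[str, str, Optional[dict]], dict]


def find_node(call: Call, page_id: str, intent: str,
              interactive_only: bool = False,
              reindex: bool = False,
              min_confidence: float = 0.3) -> Optional[int]:
    """Return the node_id for `intent`, or None when confidence is too low."""
    if reindex:  # only needed once after new content has been loaded
        call("POST", f"/pages/{page_id}/vision/reindex", None)
    route = "find-interactive" if interactive_only else "find"
    hit = call("POST", f"/pages/{page_id}/vision/{route}", {"intent": intent})
    if hit.get("confidence", 0.0) < min_confidence:  # assumption: missing means miss
        return None
    return hit.get("node_id")
```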


5. Perform actions

POST /pages/{page_id}/actions
Content-Type: application/json

{"action": "click", "node_id": 5}

POST /pages/{page_id}/actions
Content-Type: application/json

{"action": "type", "node_id": 8, "value": "hello@example.com"}

POST /pages/{page_id}/actions
Content-Type: application/json

{"action": "scroll", "x": 0, "y": 500}

POST /pages/{page_id}/actions
Content-Type: application/json

{"action": "hover", "node_id": 5}

Supported action types: click, type, scroll, hover, select
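
In Python, the four request bodies shown above can be produced by small builder functions. The function names are illustrative; `select` is left out because the card does not show its payload.

```python
def actions_path(page_id: str) -> str:
    """Path of the actions endpoint for one page."""
    return f"/pages/{page_id}/actions"


def click(node_id: int) -> dict:
    return {"action": "click", "node_id": node_id}


def type_text(node_id: int, value: str) -> dict:
    return {"action": "type", "node_id": node_id, "value": value}


def scroll(x: int, y: int) -> dict:
    return {"action": "scroll", "x": x, "y": y}


def hover(node_id: int) -> dict:
    return {"action": "hover", "node_id": node_id}
```

Each dict is meant to be JSON-encoded and POSTed to `actions_path(page_id)`.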


6. Execute JavaScript

POST /pages/{page_id}/eval
Content-Type: application/json

{
  "script": "document.title()",
  "timeout_ms": 3000
}

Response:

{"result": "Example Domain", "success": true, "duration_ms": 0}

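Available DOM bindings in the JS context (note that title is called as a method here, unlike the standard DOM, where document.title is a property):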

document.title()                    // → "Page Title"
document.querySelector("h1")       // → { id, tagName, innerText }
document.querySelectorAll("a")     // → [{ id, tagName, innerText }, ...]
document.open()                     // start HTML write buffer
document.write("<html>...")         // append HTML
document.close()                    // flush buffer → live DOM
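
A hedged Python sketch for building eval requests, including a script that rewrites the DOM through the open/write/close bindings listed above. It assumes the eval context accepts several semicolon-separated statements in one script, which the card does not confirm; the function names are illustrative.

```python
import json


def eval_payload(script: str, timeout_ms: int = 3000) -> dict:
    """Body for POST /pages/{page_id}/eval."""
    return {"script": script, "timeout_ms": timeout_ms}


def write_html_script(html: str) -> str:
    """JS that replaces the DOM via the open/write/close bindings.

    json.dumps yields a double-quoted, ASCII-only literal that is also a
    valid JavaScript string, so quotes and newlines in `html` are escaped.
    Assumption: multiple statements per script are allowed.
    """
    return f"document.open(); document.write({json.dumps(html)}); document.close();"


def eval_result(response: dict):
    """Return the script result, or raise if the engine reported failure."""
    if not response.get("success", False):
        raise RuntimeError(f"eval failed: {response!r}")
    return response.get("result")
```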

7. Save and restore state (Web Time-Travel)

POST /pages/{page_id}/snapshots

Response: {"snapshot_id": "...", "compressed_bytes": 1234}

POST /pages/{page_id}/snapshots/{snapshot_id}

Restores the page to the exact saved state. Useful for backtracking.
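
The save-then-backtrack pattern might be wrapped like this in Python. `call` is a hypothetical transport `(method, path, body) -> dict`, and `risky` is any caller-supplied function that returns True on success; neither is part of ByeFast.

```python
from typing import Callable, Optional

# Hypothetical transport: call(method, path, body) -> decoded JSON dict.
Call = Callable[[str, str, Optional[dict]], dict]


def try_with_snapshot(call: Call, page_id: str, risky: Callable[[], bool]) -> bool:
    """Snapshot the page, run `risky`, and restore the snapshot if it fails."""
    snap = call("POST", f"/pages/{page_id}/snapshots", None)
    ok = risky()
    if not ok:
        # Roll back to the state saved just before the risky step.
        call("POST", f"/pages/{page_id}/snapshots/{snap['snapshot_id']}", None)
    return ok
```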


Decision Flow

Load page
    ↓
GET /semantic-action-map
    ↓
Is the target element visible?
    ├── YES → use node_id directly for action
    └── NO  → POST /vision/reindex → POST /vision/find {"intent": "..."}
                  ↓
              confidence ≥ 0.3?
                  ├── YES → use node_id for action
                  └── NO  → element not found, try different intent words
                             or POST /eval to inspect DOM manually
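
The flow above can be sketched as one Python function that checks the map first and then tries a list of intent phrasings. `call` is a hypothetical transport `(method, path, body) -> dict`; matching map entries by exact lowercase label is an assumption made for this sketch.

```python
from typing import Callable, List, Optional

# Hypothetical transport: call(method, path, body) -> decoded JSON dict.
Call = Callable[[str, str, Optional[dict]], dict]


def resolve_node(call: Call, page_id: str, label: str, intents: List[str],
                 min_confidence: float = 0.3) -> Optional[int]:
    """Map lookup first, then vision/find with each intent in turn."""
    page_map = call("GET", f"/pages/{page_id}/semantic-action-map", None)
    wanted = label.lower()
    for el in page_map.get("elements", []):
        if el.get("visible") and el.get("label", "").lower() == wanted:
            return el["node_id"]  # target is visible in the map
    call("POST", f"/pages/{page_id}/vision/reindex", None)
    for intent in intents:  # "try different intent words"
        hit = call("POST", f"/pages/{page_id}/vision/find", {"intent": intent})
        if hit.get("confidence", 0.0) >= min_confidence:
            return hit.get("node_id")
    return None  # caller may inspect the DOM via /eval instead
```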

Error Codes

HTTP   Meaning              Fix
404    Page ID not found    Spawn a new page
500    Script eval failed   Check JS syntax; document.write needs open() first
410    Page was destroyed   Spawn a new page
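
An agent could turn these codes into a recovery decision with a lookup like the one below. The recovery labels (`respawn`, `fix_script`, `raise`) are invented for this sketch; only the status codes and their meanings come from the list above.

```python
# Status code -> suggested recovery, following the error list above.
RECOVERY = {
    404: "respawn",     # page ID not found
    410: "respawn",     # page was destroyed
    500: "fix_script",  # script eval failed
}


def recovery_for(status: int) -> str:
    """Suggested recovery for an HTTP status; unknown codes are re-raised."""
    return RECOVERY.get(status, "raise")
```

When using `urllib`, the status is available as the `code` attribute of `urllib.error.HTTPError`.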

Tips for Agents

  • Read semantic-action-map first — never guess node_id values; always read them from the map or vision/find
  • Confidence threshold — treat confidence < 0.3 as "not found"; 0.5+ is reliable; 0.8+ is high confidence
  • Page stays alive — pages auto-suspend after idle timeout but resume instantly; don't destroy and respawn unless you need a clean state
  • Snapshot before risky actions — save a snapshot before submitting forms or clicking destructive buttons so you can backtrack
  • load-html > navigate — if you have the HTML, inject it directly; it's synchronous and always works
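
The confidence thresholds in the tips can be encoded as a small Python helper. The tier names are illustrative, and the label for the 0.3 to 0.5 band is an assumption, since the tips do not name that range.

```python
def confidence_tier(confidence: float) -> str:
    """Bucket a vision/find confidence score using the thresholds above."""
    if confidence < 0.3:
        return "not_found"
    if confidence < 0.5:
        return "weak"  # assumption: this band is not named in the tips
    if confidence < 0.8:
        return "reliable"
    return "high"
```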

Full Route Reference

POST   /pages                               Spawn page
GET    /pages                               List all pages
GET    /pages/{id}                          Page status
DELETE /pages/{id}                          Destroy page
POST   /pages/{id}/navigate                 Navigate to URL
POST   /pages/{id}/load-html               Inject HTML directly
POST   /pages/{id}/resume                  Resume suspended page
POST   /pages/{id}/clock/advance           Advance virtual clock

GET    /pages/{id}/semantic-action-map     Primary read interface
POST   /pages/{id}/actions                 Perform action
POST   /pages/{id}/eval                    Execute JavaScript
GET    /pages/{id}/console                 Console output log
GET    /pages/{id}/trace                   Full event log

POST   /pages/{id}/vision/reindex          Rebuild element index
POST   /pages/{id}/vision/find             Find element by intent
POST   /pages/{id}/vision/find-interactive Find interactive element

GET    /pages/{id}/capabilities            List capabilities
POST   /pages/{id}/capabilities/request   Request capability grants

POST   /pages/{id}/snapshots               Save state
GET    /pages/{id}/snapshots               List snapshots
POST   /pages/{id}/snapshots/{snap_id}    Restore state

POST   /pages/{id}/audit/session                        Start audit session
GET    /pages/{id}/audit/session/{sid}                  Session summary
DELETE /pages/{id}/audit/session/{sid}                  End session
POST   /pages/{id}/audit/session/{sid}/record           Record action
GET    /pages/{id}/audit/session/{sid}/prove/{aid}      Merkle proof
GET    /pages/{id}/audit/session/{sid}/log              Full audit log

GET    /compositor/pages                   List virtual pages
POST   /compositor/pages                   Create virtual page
GET    /compositor/pages/{vp_id}           Query virtual page
DELETE /compositor/pages/{vp_id}           Destroy virtual page
POST   /compositor/pages/{vp_id}/slots     Mount page into slot
DELETE /compositor/pages/{vp_id}/slots/{slot}  Unmount slot
POST   /compositor/pages/{vp_id}/actions   Cross-slot action

POST   /pages/{id}/evolve/repair/{api}    Repair broken polyfill
POST   /pages/{id}/evolve/selector        Evolve broken selector
GET    /pages/{id}/evolve/health          Evolution health report

GET    /health                             Engine health check
GET    /stats                             Active pages, storage, metrics