ByeFast — Technical Specification

Version: 0.1.0
Stability: Alpha
Language: Rust 2021 edition (MSRV 1.75)


1. Design Goals

ByeFast is not a general-purpose browser. It is an AI-agent execution substrate with the following non-negotiable properties:

  1. Determinism — Identical inputs must always produce identical outputs, regardless of wall-clock time, GC pressure, or network jitter.
  2. Minimal footprint — Each idle page must consume ≤ 8 MB RAM and no CPU time. The cold-start path from process launch to the first accepted TCP connection must complete in ≤ 4 ms.
  3. Semantic transparency — All page content must be expressible as typed Rust structs that an LLM can reason about directly. No pixel buffers, no screenshots as primary interface.
  4. Composability — Every engine is an independent crate communicating through a single typed channel. Engines can be replaced, mocked, or omitted.
  5. Safety — A crash, infinite loop, or memory exhaustion in one page must not affect any other page.

2. Workspace Structure

byefast/
├── Cargo.toml                 # Workspace manifest
├── crates/
│   ├── bye-event/             # EventBus and Event enum
│   ├── bye-kernel/            # Page lifecycle and scheduling
│   ├── bye-dom/               # DOM store and HTML parser
│   ├── bye-net/               # HTTP/1.1 and HTTP/2 network client
│   ├── bye-loader/            # Resource loading orchestration
│   ├── bye-js/                # JavaScript runtime
│   ├── bye-layout/            # Box-model layout engine
│   ├── bye-render/            # Semantic diff engine
│   ├── bye-storage/           # Persistent state
│   ├── bye-safe/              # Security and anti-bot layer
│   ├── bye-api/               # HTTP control surface
│   ├── bye-vision/            # Visual-semantic element matcher
│   ├── bye-compositor/        # Cross-origin page synthesis
│   ├── bye-audit/             # Cryptographic audit log
│   └── bye-evolve/            # Self-healing shim generator
├── bin/byefast/               # Binary entry point
└── tests/                     # Integration tests

Dependency Rules

  • bye-event has no workspace dependencies (foundation crate).
  • bye-kernel depends only on bye-event.
  • bye-dom depends only on bye-event.
  • bye-net depends on bye-event.
  • bye-loader depends on bye-event, bye-net.
  • bye-js depends on bye-event, bye-dom, bye-kernel.
  • bye-api depends on all engine crates.
  • No circular dependencies.

3. EventBus

3.1 Design

All inter-engine communication flows through a single EventBus instance, passed by Arc clone at startup. The bus is built on tokio::sync::broadcast with a channel capacity of 4096 events.

pub struct EventBus {
    tx: broadcast::Sender<TimestampedEvent>,
    metrics: Arc<EventMetrics>,
}

Subscribers call bus.subscribe_filtered(|e| matches!(e, Event::PageSpawned { .. })) to receive only events they care about. Each subscription is a cheap broadcast::Receiver — there is no per-subscriber heap allocation beyond the receiver handle.

3.2 Payload

Large data (HTML bodies, binary blobs, screenshots) is transported as Payload:

pub struct Payload(pub Bytes);

Payload wraps bytes::Bytes, a reference-counted view into a shared byte buffer (conceptually an Arc<[u8]> with an offset and length). Cloning a Payload costs one atomic increment, so HTML passes between the loader, JS runtime, and DOM parser without a memcpy.
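
The zero-copy behaviour can be illustrated with the standard library alone. The sketch below is not the ByeFast Payload type: it swaps bytes::Bytes for a plain Arc<[u8]>, and SharedPayload, from_vec, shares_buffer_with, and handle_count are hypothetical names chosen for the example.

```rust
use std::sync::Arc;

/// Stand-in for `Payload`, using `Arc<[u8]>` instead of `bytes::Bytes`.
#[derive(Clone)]
pub struct SharedPayload(pub Arc<[u8]>);

impl SharedPayload {
    /// Moves the bytes into a single reference-counted allocation.
    pub fn from_vec(data: Vec<u8>) -> Self {
        SharedPayload(Arc::from(data))
    }

    /// True when both handles point at the same heap buffer.
    pub fn shares_buffer_with(&self, other: &SharedPayload) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }

    /// Number of live handles to the buffer.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.0)
    }
}
```

Cloning a SharedPayload only bumps the reference count, so a handle given to the DOM parser and one kept by the loader read the same bytes.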

3.3 Event Enum

pub enum Event {
    // Page lifecycle
    PageSpawned   { page_id: PageId, origin: OriginId },
    PageSuspended { page_id: PageId },
    PageResumed   { page_id: PageId },
    PageDestroyed { page_id: PageId, reason: DestroyReason },

    // Network pipeline
    ResourceRequested { page_id: PageId, url: Arc<str>, priority: u8, resource_type: ResourceType },
    ResourceReady     { page_id: PageId, url: Arc<str>, data: Payload, content_type: Arc<str>, status: u16 },
    ResourceFailed    { page_id: PageId, url: Arc<str>, error: Arc<str> },
    GhostMapReady     { page_id: PageId, semantic_map: Payload },

    // DOM
    DomMutation     { page_id: PageId, patch: Payload },
    DomQueryRequest { page_id: PageId, query_id: Uuid, selector: Arc<str> },
    DomQueryResult  { page_id: PageId, query_id: Uuid, node_ids: Payload },
    ElementIntentRequest { page_id: PageId, query_id: Uuid, intent: Arc<str> },
    ElementIntentResult  { page_id: PageId, query_id: Uuid, node_id: Option<u32>, confidence: f32 },

    // JavaScript
    ScriptEvalRequest { page_id: PageId, eval_id: Uuid, script: Arc<str> },
    ScriptEvalResult  { page_id: PageId, eval_id: Uuid, result: Payload, success: bool },
    PolyfillRequested { page_id: PageId, api_name: Arc<str> },
    PolyfillLoaded    { page_id: PageId, api_name: Arc<str> },

    // Rendering
    SemanticDiff { page_id: PageId, diff: Payload },

    // Web Time-Travel
    StateSaveRequest    { page_id: PageId, snapshot_id: SnapshotId },
    StateSaved          { page_id: PageId, snapshot_id: SnapshotId, compressed_bytes: u64 },
    StateRestoreRequest { page_id: PageId, snapshot_id: SnapshotId },
    StateRestored       { page_id: PageId, snapshot_id: SnapshotId },

    // Agent actions
    AgentAction { page_id: PageId, action: Payload },

    // Security
    CapabilityGranted { page_id: PageId, capability: Arc<str> },
    CapabilityDenied  { page_id: PageId, capability: Arc<str>, reason: Arc<str> },
    SecurityViolation { page_id: PageId, description: Arc<str> },

    // Kernel metrics
    MemoryPressure   { page_id: PageId, used_bytes: usize, limit_bytes: usize },
    CpuQuotaExceeded { page_id: PageId, cpu_ms_used: u64, cpu_ms_quota: u64 },
    MetricsExport    { page_id: PageId, metrics: Payload },
    JsCrash          { message: String },
}

4. bye-kernel — Page Lifecycle

4.1 PageState Machine

                ┌──────────┐
     spawn()    │          │   freeze_clock()
   ──────────►  │ Running  │ ──────────────────► Executing JS
                │          │ ◄──────────────────
                └────┬─────┘  thaw_clock()
                     │  idle timeout (30s default)
                     ▼
                ┌──────────┐
                │Suspended │
                └────┬─────┘
                     │  resume() or eval request
                     ▼
                ┌──────────┐
                │ Running  │
                └────┬─────┘
                     │  destroy() or JsCrash
                     ▼
                ┌──────────┐
                │Destroyed │
                └──────────┘

4.2 VirtualClock

Each page owns a VirtualClock that tracks simulated time in milliseconds. The clock can be in two states:

  • Running: advances at wall-clock speed
  • Frozen: returns the same value for every query, regardless of elapsed real time

The kernel freezes the clock on ScriptEvalRequest and thaws it on ScriptEvalResult. This ensures Date.now() is stable for an entire JS evaluation cycle.

pub struct VirtualClock {
    base_real_ms: u64,      // wall-clock ms when clock started
    virtual_offset_ms: i64, // accumulated time-travel offset
    frozen_at_ms: Option<u64>,
}
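
A minimal sketch of how the two clock states could be implemented on top of these fields. The method names (new, now_ms, freeze, thaw) are assumptions, as is passing the wall-clock reading in as a parameter, starting virtual time at zero, and resuming from the frozen value on thaw; the spec does not state any of these.

```rust
pub struct VirtualClock {
    base_real_ms: u64,      // wall-clock ms when clock started
    virtual_offset_ms: i64, // accumulated offset applied to elapsed time
    frozen_at_ms: Option<u64>,
}

impl VirtualClock {
    pub fn new(real_now_ms: u64) -> Self {
        Self { base_real_ms: real_now_ms, virtual_offset_ms: 0, frozen_at_ms: None }
    }

    /// Virtual time in ms. While frozen, the captured value is returned.
    pub fn now_ms(&self, real_now_ms: u64) -> u64 {
        if let Some(frozen) = self.frozen_at_ms {
            return frozen;
        }
        let elapsed = real_now_ms.saturating_sub(self.base_real_ms) as i64;
        (elapsed + self.virtual_offset_ms).max(0) as u64
    }

    /// Capture the current virtual time; later queries return it unchanged.
    pub fn freeze(&mut self, real_now_ms: u64) {
        if self.frozen_at_ms.is_none() {
            self.frozen_at_ms = Some(self.now_ms(real_now_ms));
        }
    }

    /// Resume from the frozen value (assumption: time spent frozen is skipped).
    pub fn thaw(&mut self, real_now_ms: u64) {
        if let Some(frozen) = self.frozen_at_ms.take() {
            let elapsed = real_now_ms.saturating_sub(self.base_real_ms) as i64;
            self.virtual_offset_ms = frozen as i64 - elapsed;
        }
    }
}
```

Taking the wall-clock reading as an argument keeps the sketch testable without sleeping; a real implementation would more likely read the system clock internally.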

4.3 Scheduling

The kernel runs a background tokio task that ticks every 100 ms. On each tick it:

  1. Checks CPU usage per page against quotas (default: 100 ms of CPU time per second)
  2. Emits CpuQuotaExceeded if over budget
  3. Checks idle time per page; suspends pages idle for > 30s
  4. Emits MetricsExport with { memory_used_bytes, cpu_ms_total, uptime_ms, js_crash_count } per page
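
The per-tick checks above can be sketched as a pure function over one page's counters. PageUsage, TickAction, and on_tick are hypothetical names, the MetricsExport step is left out for brevity, and the assumption that CPU usage is tracked over a trailing one-second window is mine rather than the spec's.

```rust
pub const CPU_QUOTA_MS_PER_S: u64 = 100; // default quota from the list above
pub const IDLE_TIMEOUT_MS: u64 = 30_000; // default idle timeout (30 s)

/// Hypothetical per-page counters the scheduler would maintain.
pub struct PageUsage {
    pub cpu_ms_last_second: u64,
    pub idle_ms: u64,
    pub suspended: bool,
}

#[derive(Debug, PartialEq)]
pub enum TickAction {
    EmitCpuQuotaExceeded { cpu_ms_used: u64, cpu_ms_quota: u64 },
    Suspend,
}

/// Decide what the scheduler should do for one page on one tick.
pub fn on_tick(page: &PageUsage) -> Vec<TickAction> {
    let mut actions = Vec::new();
    if page.cpu_ms_last_second > CPU_QUOTA_MS_PER_S {
        actions.push(TickAction::EmitCpuQuotaExceeded {
            cpu_ms_used: page.cpu_ms_last_second,
            cpu_ms_quota: CPU_QUOTA_MS_PER_S,
        });
    }
    // Only running pages can be suspended, and only when strictly over the timeout.
    if !page.suspended && page.idle_ms > IDLE_TIMEOUT_MS {
        actions.push(TickAction::Suspend);
    }
    actions
}
```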

5. bye-dom — Document Object Model

5.1 Arena Storage

The DOM is a flat Vec<Option<Node>> that acts as an arena. NodeId is a u32 index into it. Deleted slots are set to None and recycled through a free-list (Vec<NodeId>).

pub struct Document {
    arena:     Vec<Option<Node>>,
    free_list: Vec<NodeId>,
    root:      NodeId,  // always 0
}

pub struct Node {
    pub node_type:   NodeType,
    pub name:        Arc<str>,       // tag name (lowercase) or "#text"
    pub attrs:       Vec<(Arc<str>, Arc<str>)>,
    pub text:        Option<Arc<str>>,
    pub parent:      NodeId,
    pub first_child: NodeId,
    pub last_child:  NodeId,
    pub next_sibling: NodeId,
    pub prev_sibling: NodeId,
}

Why a flat Vec? Random access is O(1) with no pointer chasing. Cache locality is excellent since all nodes are contiguous. Tree traversal is a tight loop over NodeId indices. The entire DOM of example.com fits in under 8 KB.
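
The slot-recycling scheme can be sketched generically. Arena, alloc, free, and get are hypothetical names for this illustration and are not taken from bye-dom; the real Document stores Node values and tree links on top of the same idea.

```rust
pub type NodeId = u32;

/// Flat arena: slots are `Some(value)` when live and `None` when deleted.
pub struct Arena<T> {
    slots: Vec<Option<T>>,
    free_list: Vec<NodeId>,
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Self { slots: Vec::new(), free_list: Vec::new() }
    }

    /// Reuse a freed slot if one exists, otherwise grow the Vec.
    pub fn alloc(&mut self, value: T) -> NodeId {
        if let Some(id) = self.free_list.pop() {
            self.slots[id as usize] = Some(value);
            id
        } else {
            self.slots.push(Some(value));
            (self.slots.len() - 1) as NodeId
        }
    }

    /// Clear a slot and remember its index for reuse.
    pub fn free(&mut self, id: NodeId) -> Option<T> {
        let removed = self.slots.get_mut(id as usize)?.take();
        if removed.is_some() {
            self.free_list.push(id);
        }
        removed
    }

    /// O(1) lookup by index; `None` for deleted or out-of-range ids.
    pub fn get(&self, id: NodeId) -> Option<&T> {
        self.slots.get(id as usize)?.as_ref()
    }
}
```

One consequence of recycling is that a stale NodeId can silently refer to a newer node, so callers holding ids across deletions need some way to detect reuse.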

5.2 Built-in HTML Parser

Document::from_html(html: &str) -> Document — a hand-written tokenizer + stack-based tree builder with no external dependencies.

Tokenizer output:

enum HtmlToken {
    Doctype,
    OpenTag  { name: String, attrs: Vec<(String, String)>, self_closing: bool },
    CloseTag { name: String },
    Text(String),
    Comment(String),
}

Parser rules:

  • <script> and <style> content is skipped; the open element is pushed onto the stack as the sentinel u32::MAX
  • Void elements (br, img, input, etc.) are never pushed to stack
  • Attribute values are decoded: &amp; &lt; &gt; &quot; &nbsp; &#39;
  • Tag names are lowercased
  • Malformed nesting is handled by searching back through stack for matching open tag
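
As an illustration of the attribute-decoding rule, here is a minimal single-pass decoder for the six listed entities. decode_entities is a hypothetical name, and mapping &nbsp; to U+00A0 rather than an ASCII space is an assumption; the spec lists the entity but not its replacement.

```rust
/// Decode the six entities named in the parser rules, in one left-to-right pass.
pub fn decode_entities(raw: &str) -> String {
    const ENTITIES: [(&str, &str); 6] = [
        ("&amp;", "&"),
        ("&lt;", "<"),
        ("&gt;", ">"),
        ("&quot;", "\""),
        ("&nbsp;", "\u{00A0}"), // assumption: non-breaking space
        ("&#39;", "'"),
    ];
    let mut out = String::with_capacity(raw.len());
    let mut rest = raw;
    while let Some(pos) = rest.find('&') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos..];
        match ENTITIES.iter().find(|(name, _)| rest.starts_with(*name)) {
            Some((name, replacement)) => {
                out.push_str(replacement);
                rest = &rest[name.len()..];
            }
            None => {
                // Unknown or bare ampersand: keep it literally.
                out.push('&');
                rest = &rest[1..];
            }
        }
    }
    out.push_str(rest);
    out
}
```

Decoding in a single pass avoids double-decoding: `&amp;lt;` becomes `&lt;`, not `<`.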

5.3 IntentEngine

IntentEngine::resolve(doc, intent) -> Option<(NodeId, f32)> finds the element that best matches a natural-language intent string.

Scoring factors (weighted sum → clamped to [0.0, 1.0]):

Factor                    Weight   Method
ARIA role match           0.35     role attribute or implicit role from tag
Visible text match        0.30     Token overlap with intent words
Label/placeholder match   0.20     aria-label, placeholder, title attributes
Tag semantics             0.15     Button/input/link bonus

Elements scoring below 0.2 are never returned.
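
The weighted sum and the 0.2 cutoff can be sketched as follows. FactorScores, intent_score, token_overlap, and best_match are hypothetical names, each factor is assumed to be pre-normalised to [0.0, 1.0], and the token-overlap definition (fraction of intent words found in the text) is one plausible reading of "token overlap", not the confirmed method.

```rust
pub const MIN_SCORE: f32 = 0.2;

/// Per-element factor scores, each assumed to lie in [0.0, 1.0].
pub struct FactorScores {
    pub role: f32,
    pub text: f32,
    pub label: f32,
    pub tag: f32,
}

/// Weighted sum using the documented weights, clamped to [0.0, 1.0].
pub fn intent_score(f: &FactorScores) -> f32 {
    let sum = 0.35 * f.role + 0.30 * f.text + 0.20 * f.label + 0.15 * f.tag;
    sum.clamp(0.0, 1.0)
}

/// Fraction of intent words that also appear in the visible text.
pub fn token_overlap(intent: &str, text: &str) -> f32 {
    let text_words: Vec<String> = text.split_whitespace().map(|w| w.to_lowercase()).collect();
    let intent_words: Vec<String> = intent.split_whitespace().map(|w| w.to_lowercase()).collect();
    if intent_words.is_empty() {
        return 0.0;
    }
    let hits = intent_words.iter().filter(|w| text_words.contains(*w)).count();
    hits as f32 / intent_words.len() as f32
}

/// Highest-scoring candidate, ignoring anything below the cutoff.
pub fn best_match(candidates: &[(u32, FactorScores)]) -> Option<(u32, f32)> {
    candidates
        .iter()
        .map(|(id, f)| (*id, intent_score(f)))
        .filter(|(_, score)| *score >= MIN_SCORE)
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```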


6. bye-net — Network Stack

6.1 HTTP Client

Built on reqwest 0.12 with:

  • TLS via rustls + WebPKI root CAs (bundled, no OS cert store dependency)
  • HTTP/1.1 and HTTP/2 via ALPN negotiation (not forced H2)
  • Brotli, gzip, deflate decompression
  • Per-origin semaphore concurrency limiting (default: 6 concurrent per origin)
  • 90-second idle connection pool timeout

6.2 Response Cache

In-memory LRU cache with a configurable TTL (default: 300 seconds). Only safe, idempotent methods (GET, HEAD, OPTIONS) are cached. Cache keys are the full URL string.

6.3 Request Deduplication

Concurrent identical safe requests are deduplicated via an Inflight map (DashMap<Arc<str>, Arc<Notify>>). The first request fetches; subsequent identical concurrent requests wait for the first to complete and share the result.


7. bye-loader — Resource Orchestration

7.1 Priority DAG

Each page has a directed acyclic graph of resources. Nodes are ResourceDescriptor { url, resource_type, priority }. Edges represent dependencies (e.g. JS depends on CSS which depends on HTML).

ResourceType: Document | Stylesheet | Script | Image | Font | Fetch | WebSocket | Other
Priority:     Critical(10) | High(7) | Medium(5) | Low(3) | Idle(1)

Resources are fetched in topological order. Resources at the same dependency depth are fetched concurrently up to the per-origin limit.
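
One way to derive the fetch order is Kahn's algorithm, grouping resources by dependency depth. In this sketch, Resource and fetch_batches are hypothetical names, priorities are the numeric values listed above, and ordering each batch by descending priority is an assumption about how ties within a depth are handled.

```rust
use std::collections::HashMap;

pub struct Resource {
    pub url: &'static str,
    pub priority: u8,
    pub depends_on: Vec<&'static str>,
}

/// Returns batches in topological order; every URL in a batch may be fetched
/// concurrently. Returns `None` for a cycle or an unknown dependency.
pub fn fetch_batches(resources: &[Resource]) -> Option<Vec<Vec<&'static str>>> {
    let priority: HashMap<&'static str, u8> =
        resources.iter().map(|r| (r.url, r.priority)).collect();
    let mut unmet: HashMap<&'static str, usize> = HashMap::new();
    let mut dependents: HashMap<&'static str, Vec<&'static str>> = HashMap::new();
    for r in resources {
        unmet.insert(r.url, r.depends_on.len());
        for dep in &r.depends_on {
            if !priority.contains_key(dep) {
                return None;
            }
            dependents.entry(*dep).or_default().push(r.url);
        }
    }

    let mut batch: Vec<&'static str> = resources
        .iter()
        .filter(|r| r.depends_on.is_empty())
        .map(|r| r.url)
        .collect();
    let mut batches = Vec::new();
    let mut done = 0;
    while !batch.is_empty() {
        // Higher priority first; URL as a deterministic tie-breaker.
        batch.sort_by(|a, b| priority[b].cmp(&priority[a]).then(a.cmp(b)));
        done += batch.len();
        let mut next = Vec::new();
        for url in &batch {
            for child in dependents.get(url).into_iter().flatten() {
                let left = unmet.get_mut(child)?;
                *left -= 1;
                if *left == 0 {
                    next.push(*child);
                }
            }
        }
        batches.push(batch);
        batch = next;
    }
    // Anything left unscheduled is part of a cycle.
    if done == resources.len() { Some(batches) } else { None }
}
```

For the example in the text (JS depends on CSS, which depends on HTML), this yields three single-resource batches; an image that depends only on the HTML would join the CSS batch.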

7.2 Ghost Mode

On load_page(page_id, url), the loader first attempts GET {origin}/.well-known/semantic-map.json. If the server responds with valid JSON matching the SemanticMap schema, the loader emits GhostMapReady and returns immediately — no HTML fetch, no JS execution.

Ghost Mode semantic map schema:

{
  "title": "string",
  "elements": [
    {
      "role": "button | link | textbox | ...",
      "label": "string",
      "action": "string (URL or method)",
      "interactable": true
    }
  ],
  "navigation": [{ "text": "string", "href": "string" }]
}

7.3 navigate Flow

POST /pages/{id}/navigate
  │
  ├── background tokio::spawn:
  │     1. loader.net().fetch(id, GET url)
  │     2. if OK: Document::from_html(body) → write to state.doms[id]
  │     3. loader.discover_resources(id, &html, &url)  // regex scan for sub-resources
  │     4. loader.load_page(id, &url)                  // fetch sub-resources via DAG
  │
  └── return immediately: { "status": "loading" }

Because navigate returns before the fetch completes, agents should poll semantic-action-map until the DOM is populated, or use load-html for synchronous injection.


8. bye-js — JavaScript Runtime

8.1 Engine Choice

boa_engine — a pure-Rust ECMAScript engine. Selected over QuickJS (requires C patch utility on Windows) and V8 (heavy C++ build dependency).

Trade-off: boa_engine is slower than V8/QuickJS for compute-heavy JS. ByeFast workloads are not compute-heavy — they call a few DOM APIs and return. This is an acceptable trade-off.

8.2 Thread Model

boa_engine::Context is !Send (uses Rc internally for its GC). Each page gets a dedicated OS thread that owns its Context for its lifetime. The async façade communicates with that thread via bounded std::sync::mpsc channels. Evaluations are submitted from async code via tokio::task::spawn_blocking, so the blocking channel round trip does not stall the runtime's worker threads.

Tokio async task              OS thread (per page)
─────────────────             ────────────────────
eval(script) ──── mpsc::send ──► run_in_context(script)
await result ◄─── mpsc::send ─── result
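
The request/reply pattern in the diagram can be sketched with the standard library only. This is not the bye-js implementation: PageJsHandle, EvalRequest, and run_in_context's body are hypothetical, an Rc<Cell<u64>> counter stands in for the !Send boa_engine::Context, and the tokio side is omitted (eval here simply blocks the caller).

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread;

/// One evaluation request plus a one-shot channel for the reply.
struct EvalRequest {
    script: String,
    reply: SyncSender<Result<String, String>>,
}

pub struct PageJsHandle {
    tx: SyncSender<EvalRequest>,
}

impl PageJsHandle {
    /// Start the page's dedicated thread; it exits when the handle is dropped.
    pub fn spawn() -> Self {
        let (tx, rx) = sync_channel::<EvalRequest>(16);
        thread::spawn(move || {
            // Stand-in for the interpreter: Rc makes it !Send, so it is
            // created on this thread and never leaves it.
            let evals_run = Rc::new(Cell::new(0u64));
            for req in rx {
                evals_run.set(evals_run.get() + 1);
                let result = run_in_context(&req.script, evals_run.get());
                let _ = req.reply.send(result);
            }
        });
        Self { tx }
    }

    /// Blocking round trip: send the script, wait for the result.
    pub fn eval(&self, script: &str) -> Result<String, String> {
        let (reply, result_rx) = sync_channel(1);
        self.tx
            .send(EvalRequest { script: script.to_owned(), reply })
            .map_err(|_| "JS thread has exited".to_string())?;
        result_rx
            .recv()
            .map_err(|_| "JS thread dropped the request".to_string())?
    }
}

/// Placeholder evaluator: reports the request number and script length.
fn run_in_context(script: &str, n: u64) -> Result<String, String> {
    if script.is_empty() {
        Err("empty script".to_string())
    } else {
        Ok(format!("eval #{n}: {} bytes", script.len()))
    }
}
```

Because only owned Strings cross the channel, the interpreter state never needs to be Send; the thread boundary is enforced by the type system rather than by convention.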

8.3 Deterministic Overrides

All time and randomness APIs are overridden at context creation:

// Overridden in JS at context init:
Date.now = () => __bye_clock_ms;           // frozen VirtualClock value
performance.now = () => __bye_clock_ms;
Math.random = () => __bye_rng_next();      // page-ID-seeded PRNG
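
On the Rust side, a page-seeded generator behind __bye_rng_next could look like the sketch below. The spec does not name the algorithm; SplitMix64 is used here purely as an example, PageRng is a hypothetical name, and treating the page ID as a u64 seed is an assumption.

```rust
/// Deterministic PRNG (SplitMix64) seeded from a page identifier.
pub struct PageRng {
    state: u64,
}

impl PageRng {
    pub fn from_page_id(page_id: u64) -> Self {
        Self { state: page_id }
    }

    /// Next 64-bit value; the same seed always yields the same sequence.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Value in [0.0, 1.0), matching the range of Math.random().
    pub fn next_f64(&mut self) -> f64 {
        // Keep the top 53 bits so the result is exactly representable.
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}
```

Two pages with the same ID replay the same random sequence, which is what makes repeated runs of the same script comparable.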

8.4 Polyfill Registry

HashMap<&'static str, &'static str>  // api_name → JS source

Registered polyfills:

  • fetch → polyfills/fetch.js
  • URL → polyfills/url.js
  • TextEncoder → polyfills/encoding.js
  • TextDecoder → polyfills/encoding.js
  • structuredClone → polyfills/structured_clone.js

When eval() returns a ReferenceError matching a polyfill name, the shim is injected and the script is retried once.
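
The inject-and-retry-once rule can be sketched independently of the JS engine by passing the evaluator in as a closure. missing_api and eval_with_polyfill are hypothetical names, and the error text format ("ReferenceError: X is not defined") is an assumption about how the engine reports the failure.

```rust
use std::collections::HashMap;

/// Extract the undefined identifier from a ReferenceError message, if any.
pub fn missing_api(error: &str) -> Option<&str> {
    error
        .strip_prefix("ReferenceError: ")?
        .strip_suffix(" is not defined")
}

/// Evaluate `script`; on a ReferenceError for a registered API, inject the
/// shim and retry exactly once.
pub fn eval_with_polyfill<F>(
    script: &str,
    registry: &HashMap<&'static str, &'static str>,
    mut eval: F,
) -> Result<String, String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let first = eval(script);
    let shim = match &first {
        Err(error) => missing_api(error).and_then(|api| registry.get(api).copied()),
        Ok(_) => None,
    };
    match shim {
        Some(source) => {
            eval(source)?; // inject the polyfill source
            eval(script)   // single retry; a second failure is returned as-is
        }
        None => first,
    }
}
```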

8.5 DOM Bindings

The document object is constructed as a boa_engine native object using ObjectInitializer. All closures access the DOM through a thread_local! { PAGE_DOM: RefCell<Option<Arc<RwLock<Document>>>> } — safe because each page has exactly one JS thread.

document.querySelector(selector)    → JsObject { id, tagName, innerText } | null
document.querySelectorAll(selector) → JsArray of node objects
document.open()                     → starts WRITE_BUFFER
document.write(html)                → appends to WRITE_BUFFER
document.close()                    → Document::from_html(buffer) → replaces PAGE_DOM
document.title()                    → string | ""

9. bye-vision — Visual-Semantic Matching

9.1 Feature Vectors

Each indexed element is represented as a FeatureVec([f32; 64]) built deterministically from:

  • Tag identity (one-hot encoding over common tags)
  • Role (from role attribute or implicit tag role)
  • Text hash (character n-gram fingerprint of visible text)
  • Depth in DOM tree (normalised)
  • Interactivity flags (is button, is input, has href, has onclick)
  • Attribute presence (id, class, aria-label, placeholder, title)

9.2 Index

VisualMatcher maintains a DashMap<PageId, Vec<IndexedElement>> where each IndexedElement stores { node_id, feature_vec, label, tag, text }.

reindex(page_id, &doc) walks the entire DOM arena and builds the index from scratch. Called automatically after load-html and can be triggered via API.

9.3 Search

find(page_id, intent) builds a query vector from the intent string using the same embedding process as indexing, then performs linear cosine similarity scan over all indexed elements. Returns the highest-scoring match above threshold 0.1.

pub struct MatchResult {
    pub node_id:    u32,
    pub tag:        String,
    pub text:       String,
    pub label:      String,
    pub confidence: f32,
}
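
The linear cosine scan can be sketched as below. cosine and find_best are hypothetical names, the index is simplified to (node_id, vector) pairs, and building the 64-dimensional vectors themselves is out of scope for the example.

```rust
pub const DIMS: usize = 64;

/// Cosine similarity; defined as 0.0 when either vector is all zeros.
pub fn cosine(a: &[f32; DIMS], b: &[f32; DIMS]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Scan every indexed element and return the best one above `threshold`.
pub fn find_best(
    query: &[f32; DIMS],
    index: &[(u32, [f32; DIMS])],
    threshold: f32,
) -> Option<(u32, f32)> {
    index
        .iter()
        .map(|(node_id, vec)| (*node_id, cosine(query, vec)))
        .filter(|(_, score)| *score > threshold)
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```

A linear scan is O(n) per query, which is reasonable at the few hundred elements per page assumed elsewhere in this document.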

10. bye-audit — Cryptographic Audit Log

10.1 Action Records

Each ActionRecord contains:

pub struct ActionRecord {
    pub id:          Uuid,
    pub page_id:     PageId,
    pub session_id:  Uuid,
    pub action_type: String,
    pub url:         String,
    pub node_id:     Option<u32>,
    pub value:       Option<String>,
    pub categories:  Vec<String>,
    pub success:     bool,
    pub timestamp_ms: u64,
    pub hash:        [u8; 32],   // SHA-256 of canonical JSON
}

10.2 Merkle Tree

On prove_action(session_id, action_id), all actions in the session are sorted by timestamp, hashed individually (SHA-256), and assembled into a binary Merkle tree. The function returns:

{
  "root_hex": "...",
  "action_id": "...",
  "merkle_proof": {
    "leaf_hex": "...",
    "path": ["sibling_hash_hex", ...]
  }
}

Verification: recompute the leaf hash from the action record, walk the path, check the root matches.
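
The proof construction and the verification walk can be sketched as follows. To stay dependency-free, the example swaps SHA-256 for a 64-bit FNV-1a toy hash, which is not cryptographically secure and is used only to show the tree mechanics. prove and verify are hypothetical names. Duplicating the last node on odd-sized levels and passing the leaf index to decide left/right ordering are assumptions, since the proof format above lists sibling hashes only.

```rust
type Digest = u64;

/// Toy stand-in for SHA-256 (FNV-1a, 64-bit). Not secure.
fn toy_hash(data: &[u8]) -> Digest {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in data {
        h ^= u64::from(*byte);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hash of an ordered (left, right) pair of child digests.
fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_be_bytes());
    buf[8..].copy_from_slice(&right.to_be_bytes());
    toy_hash(&buf)
}

/// Build the tree bottom-up and collect the sibling at each level.
pub fn prove(leaves: &[Digest], mut index: usize) -> Option<(Digest, Vec<Digest>)> {
    if index >= leaves.len() {
        return None;
    }
    let mut level = leaves.to_vec();
    let mut path = Vec::new();
    while level.len() > 1 {
        if level.len() % 2 == 1 {
            let last = level[level.len() - 1];
            level.push(last); // assumption: duplicate the last node
        }
        path.push(level[index ^ 1]);
        level = level.chunks(2).map(|p| hash_pair(p[0], p[1])).collect();
        index /= 2;
    }
    Some((level[0], path))
}

/// Recompute the root from a leaf and its sibling path.
pub fn verify(leaf: Digest, mut index: usize, path: &[Digest], root: Digest) -> bool {
    let mut acc = leaf;
    for sibling in path {
        acc = if index % 2 == 0 {
            hash_pair(acc, *sibling)
        } else {
            hash_pair(*sibling, acc)
        };
        index /= 2;
    }
    acc == root
}
```

The proof is logarithmic in the number of actions: a session with five actions needs three sibling hashes.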


11. bye-safe — Security Layer

11.1 Origin Sandbox

Each page is created with an origin (extracted from the spawn URL). Cross-origin resource requests are logged as SecurityViolation events. The sandbox does not block requests (ByeFast is not a user-facing browser) but provides full visibility.

11.2 Capability System

pub enum Capability {
    Clipboard, Geolocation, MediaDevices, Notifications,
    PersistentStorage, PopupWindows, WebAssembly,
    WebGL, WebAudio, IndexedDb, SemanticActionMap,
    Screenshot, Custom(String),
}

Agents request capabilities via POST /pages/{id}/capabilities/request. ByeSafe evaluates the request against per-origin policies (currently allow-all by default). Denied capabilities emit CapabilityDenied events.

11.3 HumanitySimulator

Mouse path generation uses a cubic Bézier curve with a randomly placed control point offset from the straight line, and speed variation derived from a Gaussian distribution seeded by page_id. Keystroke timings use inter-key delay distributions fit to empirical human typing data (mean ~200ms, σ ~60ms at 60 WPM).


12. bye-compositor — Cross-Origin Synthesis

A VirtualPageId maps to a set of SlotConfig { slot_name, page_id, selector_filter } entries. ByeCompositor::query(vp_id) calls the semantic-action-map logic on each mounted page and merges results, prefixing element IDs with the slot name to avoid collisions.

register_live_dom(page_id, Arc<RwLock<Document>>, url) allows the compositor to read real DOM data directly without going through the HTTP API.


13. bye-evolve — Self-Healing

13.1 Polyfill Repair

observe_failure(page_id, api_name, current_shim, error) records the failure. repair_polyfill(page_id, api_name) returns a patched shim string by applying heuristic transformations to the failing shim source (currently: wrapping in try-catch, adding undefined guards).

13.2 Selector Evolution

register_anchor(page_id, selector, node_id, &doc) stores the last-known-good node for a selector. evolve_selector(page_id, selector, &doc) searches for the anchor node in the current DOM (by visible text, aria-label, role match) and returns the evolved selector string.

pub struct SelectorEvolution {
    pub original_selector: String,
    pub evolved_selector:  String,
    pub node_id:           u32,
    pub confidence:        f32,
    pub page_id:           PageId,
}

14. bye-api — HTTP Control Surface

14.1 Server

axum 0.7 with tower-http CORS and tracing middleware. Listens on 0.0.0.0:8741 by default.

14.2 AppState

pub struct AppState {
    pub kernel:     Arc<Kernel>,
    pub loader:     Arc<ByeLoader>,
    pub js:         Arc<ByeJs>,
    pub doms:       DashMap<PageId, Arc<RwLock<Document>>>,
    pub safe:       Arc<ByeSafe>,
    pub storage:    Arc<ByeStorage>,
    pub net:        Arc<ByeNet>,          // via loader.net()
    pub bus:        EventBus,
    pub vision:     Arc<VisualMatcher>,
    pub compositor: Arc<ByeCompositor>,
    pub audit:      Arc<AuditLog>,
    pub evolve:     Arc<ByeEvolve>,
    pub page_urls:  DashMap<PageId, String>,
    pub traces:     DashMap<PageId, RwLock<Vec<serde_json::Value>>>,
}

14.3 Flight Recorder

On page spawn, a background task subscribes to all EventBus events for that page and appends them as JSON to state.traces[page_id]. GET /pages/{id}/trace returns the full array — useful for debugging why DOM is empty or why a fetch failed.

14.4 semantic-action-map Response

{
  "page_id": "...",
  "url": "https://example.com",
  "title": "Example Domain",
  "elements": [
    {
      "node_id": 5,
      "tag": "button",
      "role": "button",
      "label": "Submit",
      "interactable": true,
      "attrs": { "type": "submit" }
    }
  ],
  "navigation_links": [
    { "text": "About", "href": "/about", "node_id": 12 }
  ],
  "form_state": {
    "email": { "value": "", "type": "email" }
  }
}

15. Cold-Start Budget

Target: ≤ 4 ms from process launch to first accepted TCP connection.

Step                                              Budget
tracing subscriber init                           ~0.1 ms
EventBus construction                             ~0.0 ms
Kernel::new + scheduler task                      ~0.2 ms
ByeNet::new (TLS config, rustls session cache)    ~1.5 ms
ByeStorage::open_ephemeral                        ~0.1 ms
ByeJs::new                                        ~0.3 ms
axum::serve TCP bind + first accept()             ~0.1 ms
Total                                             ~2.4 ms

Measured on a warm Linux 6.x kernel with a modern CPU. Windows adds ~0.5–1.0 ms for TCP stack initialisation.


16. Memory Budget Per Page

Target: ≤ 8 MB per idle page.

Component                            Typical
DOM arena (200-node page)            ~40 KB
Vision index (200 nodes × 64 f32)    ~50 KB
JS context (boa_engine, idle)        ~2 MB
EventBus receiver handle             ~1 KB
Trace buffer (100 events)            ~30 KB
Stack (OS thread for JS)             ~2 MB (default)
Total                                ~4.1 MB

A suspended page yields its JS thread stack back to the OS after the context is parked, reducing to ~300 KB.


17. Known Limitations (v0.1.0)

  1. boa_engine performance — 10–100× slower than V8 for compute-intensive JS. Acceptable for DOM-manipulation workloads; not acceptable for running full SPAs.
  2. No CSS execution — Stylesheets are fetched but not parsed or applied. Layout is structural only.
  3. No WebSocket — EventSource and WebSocket are stubs that emit PolyfillRequested but do not connect.
  4. navigate fetch — The background HTTP fetch may fail if the target server blocks non-browser user agents or requires cookies from a prior session. Use POST /pages/{id}/load-html for reliable DOM injection.
  5. No multi-frame isolation — iframes are parsed as regular elements, not isolated execution contexts.
  6. Single-origin JS — All eval calls share the same JS context per page. There is no per-frame sandbox.

18. Versioning

ByeFast follows semantic versioning. The Event enum and HTTP API are stable within a major version. Internal crate APIs (bye-dom, bye-js, etc.) are not stable and may change between minor versions.


19. License

MIT