Open-source macOS WebDriver for Tauri apps.
Enables automated end-to-end testing of Tauri desktop applications on macOS, where no native WKWebView WebDriver exists.
Disclosure: The code for this project was written in collaboration with Claude Code
- The Problem
- The Solution
- Who Is This For?
- Quick Start
- Local Disk Cleanup
- MCP Integration
- Supported W3C WebDriver Operations
- Architecture
- Alternatives
- Additional info
- License
Tauri apps use WKWebView on macOS. Unlike Linux (WebKitWebDriver) and Windows (Edge WebDriver), Apple does not provide a WebDriver implementation for WKWebView. This means Tauri developers cannot run automated e2e tests on macOS using standard WebDriver tools like WebDriverIO or Selenium.
This is a blocker for any Tauri app with platform-specific code (e.g., deep links, native menus, file associations) that must be tested on every platform.
tauri-webdriver provides two crates that together bridge the gap:
-
tauri-plugin-webdriver-automation-- A Tauri plugin that runs inside your app (debug builds only). It starts a local HTTP server that can interact with your app's webview: find elements, click buttons, read text, manage windows, and execute JavaScript. -
tauri-webdriver-automation(CLI binary:tauri-wd) -- A standalone CLI binary that implements the W3C WebDriver protocol. It launches your Tauri app, connects to the plugin's HTTP server, and translates standard WebDriver commands into plugin API calls. WebDriverIO, Selenium, or any W3C-compatible client can connect to it.
WebDriverIO/Selenium tauri-wd CLI Your Tauri App
(test runner) ──HTTP──> (W3C WebDriver) ──HTTP──> (plugin server)
:4444 :{dynamic port}
- Tauri app developers who need automated e2e tests on macOS
- CI/CD pipelines that run tests and need a macOS solution
- Anyone with platform-specific Tauri code that must be verified on macOS (deep links, native APIs, system integrations)
cd src-tauri
cargo add tauri-plugin-webdriver-automationRegister it in your app (debug builds only):
let mut builder = tauri::Builder::default();
#[cfg(debug_assertions)]
{
builder = builder.plugin(tauri_plugin_webdriver_automation::init());
}cargo install tauri-webdriver-automation// wdio.conf.mjs
export const config = {
port: 4444,
capabilities: [{
'tauri:options': {
binary: './src-tauri/target/debug/my-app',
}
}],
// ... your test config
};If your Tauri app uses a dev server (Vite, Next.js, etc.), it must be running before tauri-wd launches your app. The debug binary loads your frontend from devUrl (e.g., http://localhost:5173), not from embedded files.
# Keep this running in a separate terminal (adjust for your setup)
npx vite --port 5173 # Vite
# npx next dev --port 3000 # Next.js
# npx webpack serve # WebpackTo have your agent (Claude Code, Codex, etc) handle this automatically, add a note to your project's CLAUDE.md or AGENTS.md. Starting a dev server on a port that's already in use may prompt interactively and hang automated tools, so check first and only start if needed:
## WebDriver Automation
The Tauri debug binary loads the frontend from devUrl. Before launching
the app with tauri-wd, ensure these are running:
curl -s http://127.0.0.1:5173 > /dev/null 2>&1 || (cd apps/desktop && npx vite --port 5173 &)
curl -s http://127.0.0.1:4444/status > /dev/null 2>&1 || tauri-wd --port 4444 &# Terminal 1: Start the WebDriver server (supports concurrent sessions)
tauri-wd --port 4444
# Terminal 2: Run your tests
npx wdio run wdio.conf.mjsRust build artifacts can take several GB in this repo. To clean local-only files:
bash scripts/clean.shFor a deeper cleanup (also removes tests/wdio/node_modules and generated files in screenshots/):
bash scripts/clean.sh --deepAfter cleanup, the next build is slower because Rust has to recompile from scratch.
tauri-webdriver works with mcp-tauri-automation to enable AI-driven automation of Tauri apps via the Model Context Protocol. This lets AI agents (like Claude Code) launch, inspect, interact with, and screenshot your Tauri app through natural language.
1. Install the MCP server:
git clone https://github.com/danielraffel/mcp-tauri-automation.git
cd mcp-tauri-automation
npm install && npm run build2. Add to Claude Code:
claude mcp add --transport stdio tauri-automation \
--scope user \
-- node /absolute/path/to/mcp-tauri-automation/dist/index.jsOptionally set a default app path so you don't have to specify it every time:
claude mcp add --transport stdio tauri-automation \
--env TAURI_APP_PATH=/path/to/your-app/src-tauri/target/debug/your-app \
--scope user \
-- node /absolute/path/to/mcp-tauri-automation/dist/index.js3. Start tauri-wd and use with Claude:
# Keep this running in a separate terminal
tauri-wd --port 4444If your app uses a frontend dev server, make sure it's running first (see step 4 in Quick Start).
Then ask Claude: "Launch my Tauri app and take a screenshot"
Note: mcp-tauri-automation is a fork of Radek44/mcp-tauri-automation with additional tools (execute_script, get_page_title, get_page_url, multi-strategy selectors, configurable timeouts, wait_for_navigation). These additions have been submitted upstream. For cross-platform MCP support, see the original project.
All operations follow the W3C WebDriver specification. See the full technical specification for detailed request/response formats and plugin API documentation.
| W3C Endpoint | Method | Description |
|---|---|---|
/status |
GET | Server readiness status |
/session |
POST | Create a new session with tauri:options capabilities |
/session/{id} |
DELETE | Delete session and terminate the app |
/session/{id}/timeouts |
GET | Get current timeout configuration |
/session/{id}/timeouts |
POST | Set implicit, page load, and script timeouts |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/url |
POST | Navigate to URL |
/session/{id}/url |
GET | Get current page URL |
/session/{id}/title |
GET | Get page title |
/session/{id}/source |
GET | Get full page HTML source |
/session/{id}/back |
POST | Navigate back in history |
/session/{id}/forward |
POST | Navigate forward in history |
/session/{id}/refresh |
POST | Refresh the current page |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/window |
GET | Get current window handle |
/session/{id}/window |
POST | Switch to window by handle |
/session/{id}/window |
DELETE | Close current window |
/session/{id}/window/handles |
GET | Get all window handles |
/session/{id}/window/new |
POST | Create a new window |
/session/{id}/window/rect |
GET | Get window position and size |
/session/{id}/window/rect |
POST | Set window position and size |
/session/{id}/window/maximize |
POST | Maximize window |
/session/{id}/window/minimize |
POST | Minimize window |
/session/{id}/window/fullscreen |
POST | Make window fullscreen |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/element |
POST | Find element (CSS, XPath, tag name, link text, partial link text) |
/session/{id}/elements |
POST | Find all matching elements |
/session/{id}/element/active |
GET | Get the currently focused element |
/session/{id}/element/{eid}/element |
POST | Find element scoped to a parent element |
/session/{id}/element/{eid}/elements |
POST | Find all elements scoped to a parent |
/session/{id}/element/{eid}/click |
POST | Click an element |
/session/{id}/element/{eid}/clear |
POST | Clear an input element |
/session/{id}/element/{eid}/value |
POST | Send keystrokes to an element (file paths for <input type="file">) |
/session/{id}/element/{eid}/text |
GET | Get element's visible text |
/session/{id}/element/{eid}/name |
GET | Get element's tag name |
/session/{id}/element/{eid}/attribute/{name} |
GET | Get an HTML attribute value |
/session/{id}/element/{eid}/property/{name} |
GET | Get a JavaScript property value |
/session/{id}/element/{eid}/css/{name} |
GET | Get a computed CSS property value |
/session/{id}/element/{eid}/rect |
GET | Get element's bounding rectangle |
/session/{id}/element/{eid}/enabled |
GET | Check if element is enabled |
/session/{id}/element/{eid}/selected |
GET | Check if element is selected |
/session/{id}/element/{eid}/displayed |
GET | Check if element is visible |
/session/{id}/element/{eid}/computedrole |
GET | Get computed ARIA role |
/session/{id}/element/{eid}/computedlabel |
GET | Get computed ARIA label |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/element/{eid}/shadow |
GET | Get shadow root of a web component |
/session/{id}/shadow/{sid}/element |
POST | Find element inside a shadow root |
/session/{id}/shadow/{sid}/elements |
POST | Find all elements inside a shadow root |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/frame |
POST | Switch to frame by index, element reference, or null for top |
/session/{id}/frame/parent |
POST | Switch to parent frame |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/execute/sync |
POST | Execute synchronous JavaScript |
/session/{id}/execute/async |
POST | Execute asynchronous JavaScript with callback |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/screenshot |
GET | Full page screenshot (base64 PNG) |
/session/{id}/element/{eid}/screenshot |
GET | Element screenshot (base64 PNG) |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/cookie |
GET | Get all cookies |
/session/{id}/cookie/{name} |
GET | Get a cookie by name |
/session/{id}/cookie |
POST | Add a cookie |
/session/{id}/cookie/{name} |
DELETE | Delete a cookie by name |
/session/{id}/cookie |
DELETE | Delete all cookies |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/alert/dismiss |
POST | Dismiss (cancel) the current dialog |
/session/{id}/alert/accept |
POST | Accept (OK) the current dialog |
/session/{id}/alert/text |
GET | Get the dialog message text |
/session/{id}/alert/text |
POST | Send text to a prompt dialog |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/actions |
POST | Perform actions: key (keyDown/keyUp), pointer (move/down/up), wheel (scroll) |
/session/{id}/actions |
DELETE | Release all actions |
| W3C Endpoint | Method | Description |
|---|---|---|
/session/{id}/print |
POST | Print page to PDF (base64-encoded) |
Two Rust crates work together in a simple 2-hop design:
WebDriverIO/Selenium ──HTTP:4444──> tauri-wd CLI ──HTTP:{dynamic}──> tauri-plugin-webdriver-automation
(W3C WebDriver) (axum server in-app)
The plugin (tauri-plugin-webdriver-automation) runs inside your Tauri app in debug builds. On startup it binds an axum HTTP server to 127.0.0.1 on a random port and prints [webdriver] listening on port {N} to stdout. It injects a JavaScript bridge (init.js) into every webview that provides element finding, an async script callback mechanism, dialog interception, and an in-memory cookie store (needed because WKWebView doesn't support document.cookie on tauri:// URLs). All DOM interaction happens by evaluating JS in the webview and receiving results back via Tauri IPC.
The CLI (tauri-wd) is a standalone binary that implements the W3C WebDriver HTTP protocol on port 4444. When a test framework creates a session, the CLI launches your app binary, watches stdout for the port announcement, and then translates every W3C request into a plugin HTTP call. Elements are tracked as (selector, index, using) triples internally, mapped to W3C UUID strings for the session lifetime. Shadow DOM elements use a separate in-memory cache since document.querySelectorAll() can't reach into shadow roots. Frame/iframe context is managed by a stack that scopes JS evaluation to the correct contentDocument.
Session flow: Test client sends POST /session with tauri:options.binary pointing to your app. The CLI spawns the binary with TAURI_WEBVIEW_AUTOMATION=true, reads the plugin port from stdout, and returns a session ID. All subsequent W3C commands are forwarded to the plugin as JSON-over-HTTP POST requests. When the session is deleted, the app process is killed.
For the full internal API reference, see SPEC.md.
When we started building this, we were developing a macOS Tauri app and frustrated that there was no easy way to do automated e2e testing -- Apple doesn't provide a WebDriver for WKWebView. We ended up building this at roughly the same time others tackled the same problem. Probably not the greatest use of time in hindsight, but here we are.
- tauri-plugin-webdriver -- Open-source Tauri plugin that embeds a WebDriver server directly in the plugin (single-crate architecture vs our two-crate approach). Supports macOS, Linux, and Windows -- if you need cross-platform WebDriver support, this is the more mature choice.
- CrabNebula Cloud -- Commercial hosted testing service with macOS WebDriver support.
- Blog post with some additional background info.
MIT OR Apache-2.0 (dual-licensed, same as Tauri)