MCP server v2: non-blocking primitives, health probe, CLI mode

## Problem

The current MCP server (`agent_orchestrator/alas_mcp_server.py`) has four architectural flaws that block reliable LLM-driven operation:

### 1. Blocking tool calls lock the agent

`alas_login_ensure_main` calls `LoginHandler.handle_app_login()`, a `while 1:` loop that can spin for up to **300 seconds** before timing out (`LOGIN_MAX_TOTAL_SECONDS = 300`). When the tool is called via MCP over stdio, the entire JSON-RPC transport is blocked until it returns. The agent cannot take screenshots, check health, or do anything else; it is stuck waiting on a Python function that may not return for minutes, or at all if the underlying ADB call hangs.

This is the root cause of the agent getting "stuck" when calling login.

### 2. No ADB health probe

When ADB dies or the emulator isn't rendering, `adb_screenshot` either hangs or returns a black frame. There's no way to ask "is the connection alive?" before committing to a call. The agent can't distinguish between:
- Game loading screen (wait)
- ADB transport broken (reconnect)
- Emulator process dead (restart)
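The three cases above could be told apart by mapping a structured probe result onto a recovery action. The following is a minimal sketch, not existing server code: the `ProbeResult` fields mirror the status keys proposed for `adb_health` later in this issue, and the `Action` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT = "wait"            # game is loading; poll again later
    RECONNECT = "reconnect"  # ADB transport is broken
    RESTART = "restart"      # emulator process is dead
    PROCEED = "proceed"      # everything looks healthy


@dataclass
class ProbeResult:
    # Hypothetical container; field names follow the proposed adb_health keys.
    emulator_running: bool
    adb_connected: bool
    screenshot_ok: bool


def choose_action(probe: ProbeResult) -> Action:
    """Map a probe result to the recovery step that matches the failure."""
    if not probe.emulator_running:
        return Action.RESTART
    if not probe.adb_connected:
        return Action.RECONNECT
    if not probe.screenshot_ok:
        # Transport works but the frame is unusable (e.g. black): likely loading.
        return Action.WAIT
    return Action.PROCEED
```

Checking the emulator first means the agent never tries to reconnect ADB to a process that no longer exists.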

### 3. No CLI mode

`alas_mcp_server.py` only runs as `mcp.run(transport="stdio")`. There's no way to run `python alas_mcp_server.py screenshot` from a terminal. When something goes wrong, you can't poke at it without an MCP client.

### 4. Tool proliferation anti-pattern

`alas_login_ensure_main` is a dedicated MCP tool wrapping one ALAS workflow. Following this pattern would require `alas_commission_run`, `alas_dorm_collect`, etc. — dozens of tools that are all thin wrappers.

`alas_call_tool(name)` already exists and can invoke any registered tool. Dedicated workflow wrappers should not exist as separate MCP tools.
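To illustrate why one generic dispatcher can replace many thin wrappers, here is a rough sketch of name-based dispatch. The registry, the `register` decorator, and the placeholder tool body are all assumptions made for illustration; the actual internals of `alas_call_tool` are not shown in this issue.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: maps a tool name to a zero-argument callable.
TOOL_REGISTRY: Dict[str, Callable[[], Any]] = {}


def register(name: str) -> Callable[[Callable[[], Any]], Callable[[], Any]]:
    """Decorator that records a workflow under a string name."""
    def decorator(func: Callable[[], Any]) -> Callable[[], Any]:
        TOOL_REGISTRY[name] = func
        return func
    return decorator


@register("commission.run")
def _commission_run() -> str:
    return "commission done"  # placeholder body for the sketch


def call_tool(name: str) -> Dict[str, Any]:
    """Single entry point: look the workflow up by name instead of exposing it as its own MCP tool."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {name}",
                "available": sorted(TOOL_REGISTRY)}
    return {"ok": True, "result": tool()}
```

Adding a new workflow then means registering one more name, with no change to the MCP tool surface.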

## Proposed Design

### Tool surface (all non-blocking or bounded)

| Tool | What it does | Max time |
|------|-------------|----------|
| `adb_health` | **NEW** - Check ADB transport, emulator process, return structured status | < 2s |
| `adb_screenshot` | Take screenshot (existing) | < 3s |
| `adb_tap` | Tap coordinate (existing) | < 1s |
| `adb_swipe` | Swipe (existing) | < 1s |
| `alas_state` | Current page name (rename from `alas_get_current_state`) | < 3s |
| `alas_goto` | Navigate with timeout (existing, add bound) | < 30s |
| `alas_tools` | List available tools (rename from `alas_list_tools`) | < 1s |
| `alas_run` | Run a named tool with timeout (rename from `alas_call_tool`) | < 60s |

### Remove

- `alas_login_ensure_main` — login becomes the agent calling screenshot/tap/state in a loop, not a single blocking function
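The control flow that replaces the blocking login call might look roughly like the loop below. In practice the LLM agent makes each decision itself; this sketch only shows that every step is a short, bounded call and that the loop has a hard step limit. The callables, the `page_main` target name, and the defaults are hypothetical.

```python
import time
from typing import Callable, Optional, Tuple


def drive_to_main(
    get_state: Callable[[], str],
    pick_tap: Callable[[str], Optional[Tuple[int, int]]],
    tap: Callable[[int, int], None],
    max_steps: int = 20,
    pause_s: float = 1.0,
    target: str = "page_main",  # assumed page name, for illustration
) -> bool:
    """Poll state and tap until the target page is reached or steps run out."""
    for _ in range(max_steps):
        state = get_state()        # short call, e.g. the state tool
        if state == target:
            return True
        point = pick_tap(state)    # decide what to tap on this page, if anything
        if point is not None:
            tap(*point)            # short call, e.g. the tap tool
        time.sleep(pause_s)        # give the game time to react
    return False                   # bounded: the caller decides what to do next
```

Because control returns to the caller between steps, a screenshot or health check can be interleaved at any point.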

### Add: `adb_health` tool

```python
from typing import Any, Dict


# `mcp` is the server instance already defined in alas_mcp_server.py.
@mcp.tool()
def adb_health() -> Dict[str, Any]:
    """Check ADB and emulator connectivity.

    Returns structured status:
    {
        "adb_connected": bool,
        "emulator_running": bool,
        "screenshot_ok": bool,
        "serial": str,
        "error": str | None
    }
    """
```
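One possible body for this probe is sketched below, using only the `adb` command line with a per-call timeout. Several things here are assumptions rather than part of the proposal: the helper name `probe_adb`, the injectable `run` parameter (added so the logic can be exercised without a device), and the use of `getprop sys.boot_completed` as a stand-in for "emulator running". A real check of the emulator's host process would be platform-specific.

```python
import subprocess
from typing import Any, Callable, Dict, List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
Runner = Callable[[List[str], float], bytes]


def _run(cmd: List[str], timeout_s: float) -> bytes:
    """Run a command with a hard timeout and return its stdout."""
    return subprocess.run(cmd, capture_output=True, timeout=timeout_s, check=True).stdout


def probe_adb(serial: str, run: Runner = _run, timeout_s: float = 0.6) -> Dict[str, Any]:
    """Fill in the status dict step by step; stop at the first failure."""
    status: Dict[str, Any] = {
        "adb_connected": False,
        "emulator_running": False,
        "screenshot_ok": False,
        "serial": serial,
        "error": None,
    }
    adb = ["adb", "-s", serial]
    try:
        state = run(adb + ["get-state"], timeout_s)
        status["adb_connected"] = state.strip() == b"device"
        if status["adb_connected"]:
            # Assumption: a booted Android system is used as a proxy for "emulator running".
            booted = run(adb + ["shell", "getprop", "sys.boot_completed"], timeout_s)
            status["emulator_running"] = booted.strip() == b"1"
            # Only checks that a PNG came back, not that it shows anything useful.
            png = run(adb + ["exec-out", "screencap", "-p"], timeout_s)
            status["screenshot_ok"] = png.startswith(PNG_SIGNATURE)
    except (subprocess.SubprocessError, OSError) as exc:
        # Covers timeouts, non-zero exit codes, and a missing adb binary.
        status["error"] = f"{type(exc).__name__}: {exc}"
    return status
```

The timeout applies to each of the three commands separately, so it has to be chosen with the overall budget for the tool in mind.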

### Add: CLI mode

```bash
# Debug from terminal without MCP client
python alas_mcp_server.py cli --health
python alas_mcp_server.py cli --screenshot out.png
python alas_mcp_server.py cli --state
python alas_mcp_server.py cli --tap 640 360
python alas_mcp_server.py cli --run commission.run
python alas_mcp_server.py cli --list-tools

# MCP mode (existing, default)
python alas_mcp_server.py         # stdio MCP server
```
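The command shape above could be parsed with `argparse` along these lines. This is a sketch under assumptions: the function names and the `(command, payload)` return shape are invented for illustration, and the actual dispatch to ADB or ALAS is left out.

```python
import argparse
from typing import Any, List, Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="alas_mcp_server.py")
    sub = parser.add_subparsers(dest="mode")
    cli = sub.add_parser("cli", help="run one command and exit")
    # Exactly one action per invocation.
    group = cli.add_mutually_exclusive_group(required=True)
    group.add_argument("--health", action="store_true")
    group.add_argument("--screenshot", metavar="PATH")
    group.add_argument("--state", action="store_true")
    group.add_argument("--tap", nargs=2, type=int)
    group.add_argument("--run", metavar="TOOL")
    group.add_argument("--list-tools", action="store_true")
    return parser


def resolve(argv: Optional[List[str]] = None) -> Tuple[str, Any]:
    """Turn argv into a (command, payload) pair; no subcommand means MCP mode."""
    args = build_parser().parse_args(argv)
    if args.mode is None:
        return ("serve", None)  # caller would start the stdio MCP server
    if args.health:
        return ("health", None)
    if args.screenshot is not None:
        return ("screenshot", args.screenshot)
    if args.state:
        return ("state", None)
    if args.tap is not None:
        return ("tap", tuple(args.tap))
    if args.run is not None:
        return ("run", args.run)
    return ("list-tools", None)
```

Keeping parsing separate from execution lets the CLI and the MCP tools share the same underlying functions.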

### Add: Timeout wrapper on all tools

Every MCP tool call should be wrapped with a configurable timeout so that a hung ADB call doesn't block the transport indefinitely.
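A minimal sketch of such a wrapper, assuming a thread-pool approach, is shown below. The decorator name `bounded`, the pool size, and the result shape are hypothetical. Note the main limitation: Python cannot kill a running thread, so a timed-out call keeps occupying a worker until it finishes on its own.

```python
import functools
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Any, Callable, Dict

_POOL = ThreadPoolExecutor(max_workers=4)


def bounded(timeout_s: float) -> Callable[[Callable[..., Any]], Callable[..., Dict[str, Any]]]:
    """Run the wrapped function in a worker thread and stop waiting after timeout_s."""
    def decorator(func: Callable[..., Any]) -> Callable[..., Dict[str, Any]]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
            future = _POOL.submit(func, *args, **kwargs)
            try:
                return {"ok": True, "result": future.result(timeout=timeout_s)}
            except FutureTimeout:
                # The worker thread is still running; only the caller is released.
                return {"ok": False, "error": f"{func.__name__} exceeded {timeout_s}s"}
        return wrapper
    return decorator


@bounded(timeout_s=0.1)
def hung_call() -> str:
    time.sleep(0.5)  # stands in for a hung ADB call
    return "too late"
```

Where the slow operation is a subprocess, passing `timeout=` to `subprocess.run` is a stronger option, because the child process is killed when the timeout expires.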

## Acceptance Criteria

- [ ] `alas_login_ensure_main` removed from MCP tool surface
- [ ] `adb_health` tool added and returning structured connectivity status
- [ ] All MCP tools have bounded execution time (timeout wrapper)
- [ ] CLI mode added (`python alas_mcp_server.py cli --<command>`)
- [ ] Tool names simplified (`alas_get_current_state` → `alas_state`, etc.)
- [ ] Black screenshot detection: `adb_screenshot` reports if image is all-black
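
For the black-screenshot criterion, one simple check is sketched here. It assumes the screenshot has already been decoded into raw RGB or grayscale bytes (decoding the PNG is outside this sketch), and the function name and both thresholds are illustrative guesses that would need tuning.

```python
def is_black_frame(pixels: bytes, max_level: int = 8,
                   min_dark_fraction: float = 0.99) -> bool:
    """Return True when nearly every channel byte is at or below max_level."""
    if not pixels:
        return True  # an empty frame is treated as unusable
    dark = sum(1 for value in pixels if value <= max_level)
    return dark / len(pixels) >= min_dark_fraction
```

Using a fraction instead of requiring every byte to be zero tolerates a few stray bright pixels or compression noise.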

## Context

- Current MCP server: `agent_orchestrator/alas_mcp_server.py` (300 lines)
- Blocking login handler: `alas_wrapped/module/handler/login.py` (300s `while 1:` loop)
- Related: Issue #21 (Tool Node design) describes verification patterns that depend on this foundation
- CLAUDE.md mandates: "MCP-only control path" and "the LLM agent IS the scheduler"
