8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,14 @@ This project follows [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0.

## [Unreleased]

### Added

- `start --cdp-port` now fails fast with a clear, actionable error when the web port is already bound, instead of timing out after 90s. The error message names the busy port and suggests running `fsa stop` or selecting a different port via `--port` (issue #25).

### Fixed

- `start --cdp-port` now reaps the spawned Chrome process, flutter web-server, FIFO pipe, and temporary profile directory when launch fails after the port probe (issue #25). Previously, failed CDP sessions could leave orphaned processes and lingering files. Cleanup is best-effort; cleanup failures are ignored and never mask the original error.

## [0.0.6] - 2026-05-28

### Added
261 changes: 203 additions & 58 deletions lib/src/commands/start_command.dart
@@ -57,6 +57,15 @@ typedef CdpFifoMaker = Future<void> Function(String path);
/// be exercised without a real flutter process writing to the log.
typedef CdpWebServerReadyWaiter = Future<void> Function(File logFile);

/// Probes whether [port] is available for binding on loopback.
/// Returns `true` when the port is free, `false` when already in use.
typedef CdpPortProbe = Future<bool> Function(int port);

/// Sends [signal] to the process identified by [pid]. Mirrors
/// [Process.killPid]; swapped in tests to record best-effort reap calls on
/// the failure-cleanup path without touching real processes.
typedef CdpKillPid = bool Function(int pid, [ProcessSignal signal]);

/// Bridges [CdpProcessStarter]'s nullable [ProcessStartMode?] to
/// [Process.start]'s non-nullable parameter with a default.
Future<Process> _defaultProcessStart(
@@ -88,22 +97,24 @@ Future<Process> _defaultProcessStart(
/// activates `--web-experimental-hot-reload` for `-d web-server` per
/// flutter/flutter#170612).
/// 2. Resolves a Chrome binary path (macOS bundle or Linux PATH).
/// 3. Pre-launches Chrome detached with `--remote-debugging-port=N`,
/// 3. Probes the web port ([defaultPortProbe]) to confirm it is free.
/// If the port is busy the command exits 1 immediately and spawns nothing.
/// 4. Pre-launches Chrome detached with `--remote-debugging-port=N`,
/// `--remote-allow-origins=*`, and a dedicated `--user-data-dir`.
/// 4. Probes the debug port to confirm Chrome is reachable.
/// 5. Runs `flutter run -d web-server --web-port=<port> --web-experimental-hot-reload`
/// 5. Probes the debug port to confirm Chrome is reachable.
/// 6. Runs `flutter run -d web-server --web-port=<port> --web-experimental-hot-reload`
/// (silent remap from `--device=chrome` because the chrome target would
/// auto-launch its own conflicting Chrome).
/// 6. Waits for the web-server log line "is being served at" so the URL
/// 7. Waits for the web-server log line "is being served at" so the URL
/// is bound before any client attempts to connect.
/// 7. Navigates the pre-launched Chrome to the served URL via CDP
/// 8. Navigates the pre-launched Chrome to the served URL via CDP
/// Page.navigate. DWDS only emits "Debug service listening on ..."
/// AFTER a debugger client connects, so the navigate must happen
/// BEFORE the VM Service scrape; scraping first would deadlock the
/// handshake (see commit 871d0a7).
/// 8. Scrapes the VM Service URI from the flutter run log (DWDS prints
/// 9. Scrapes the VM Service URI from the flutter run log (DWDS prints
/// the "Debug service listening on ..." line once Chrome connected).
/// 9. Writes `chromePid`, `cdpPort`, and `tmpProfileDir` to the state file
/// 10. Writes `chromePid`, `cdpPort`, and `tmpProfileDir` to the state file
/// so [StopCommand] can reap Chrome on teardown.
///
/// D6 Chrome reaper (POSIX, chrome target only) defers to V1.x; V1 ships
@@ -165,6 +176,37 @@ class StartCommand extends ArtisanCommand {
@visibleForTesting
static CdpWebServerReadyWaiter? cdpWebServerReadyWaiter;

/// Test seam: web port availability probe. Returns `true` when the port
/// is free to bind; `false` when already in use. Defaults to
/// [defaultPortProbe], which attempts [ServerSocket.bind] and immediately
/// closes the socket. Swap in tests to simulate a busy port.
@visibleForTesting
static CdpPortProbe cdpPortProbe = defaultPortProbe;

/// Test seam: process-kill-by-pid (default [Process.killPid]). Used by the
/// failure-cleanup path to SIGTERM the flutter holder + child when launch
/// throws after the PIDs were captured.
@visibleForTesting
static CdpKillPid cdpKillPid = Process.killPid;

/// Default [CdpPortProbe] implementation. Binds [ServerSocket] on
/// [InternetAddress.loopbackIPv4] and immediately closes it.
/// Returns `true` when the port is free; `false` on [SocketException]
/// (port already in use).
@visibleForTesting
static Future<bool> defaultPortProbe(int port) async {
try {
final socket = await ServerSocket.bind(
InternetAddress.loopbackIPv4,
port,
);
await socket.close();
return true;
} on SocketException {
return false;
}
}

@override
void configure(ArgParser parser) {
parser
@@ -332,7 +374,19 @@ class StartCommand extends ArtisanCommand {
return 1;
}

// 4. Launch Chrome detached with debug port + dedicated user-data-dir.
// 4. Probe web port availability before spawning anything. A busy port
// means flutter will fail immediately after Chrome is up, so fail fast
// here and spawn nothing.
final webPortFree = await cdpPortProbe(webPort);
if (!webPortFree) {
ctx.output.error(
'Port $webPort is already in use. Run `fsa stop` to free it or '
'pass a different --port value.',
);
return 1;
}

// 5. Launch Chrome detached with debug port + dedicated user-data-dir.
final tmpRoot = cdpTmpProfileDirRoot ?? '/tmp';
final tmpProfileDir = '$tmpRoot/dusk-chrome-$cdpPort';
final chromeProcess = await cdpProcessStarter(
@@ -387,7 +441,7 @@ class StartCommand extends ArtisanCommand {
mode: ProcessStartMode.detached,
);

// 5. Probe Chrome to confirm it opened the debug port; on failure kill it
// 6. Probe Chrome to confirm it opened the debug port; on failure kill it
// so we never leak a runaway Chrome with no parent supervision.
try {
await cdpChromeProber(cdpPort, const Duration(seconds: 10));
@@ -399,14 +453,12 @@ class StartCommand extends ArtisanCommand {
return 1;
}

// 6. Build flutter argv: always -d web-server here (chrome target would
// auto-launch its own conflicting Chrome).
// 7. Build flutter argv: always -d web-server here (chrome target would
// auto-launch its own conflicting Chrome). Path construction below is
// pure (no I/O); the side-effecting log + FIFO creation happens inside
// the try so a failure there still reaps the already-launched Chrome.
final logFile = File('${_logDir()}/flutter-dev.log');
await logFile.parent.create(recursive: true);
await logFile.writeAsString('');

final fifoPath = '${_logDir()}/flutter-dev.fifo';
await _ensureFifo(fifoPath);

final flutterArgs = <String>[
'run',
@@ -419,56 +471,149 @@ class StartCommand extends ArtisanCommand {
'--dart-define=AI_TEST=1',
];

// 7. Spawn flutter with the existing FIFO wrapper pattern.
final process = await _spawnFlutterWrapper(
flutterArgs: flutterArgs,
fifoPath: fifoPath,
logFile: logFile,
);
// The flutter wrapper handle + captured PIDs are held nullable so the
// failure-cleanup catch can reap whatever was already spawned, regardless
// of which step (log/FIFO setup, PID capture, navigate, scrape) threw.
Process? flutterProcess;
int? holderPid;
int? childPid;
try {
// 8. Prepare the log file + FIFO. Inside the try so a failure here still
// reaps the already-launched Chrome + tmp profile dir.
await logFile.parent.create(recursive: true);
await logFile.writeAsString('');
await _ensureFifo(fifoPath);

// 9. Spawn flutter with the existing FIFO wrapper pattern.
flutterProcess = await _spawnFlutterWrapper(
flutterArgs: flutterArgs,
fifoPath: fifoPath,
logFile: logFile,
);

final pids = await _scrapeTwoPids(process);
final holderPid = pids['HOLDER'];
final childPid = pids['FLUTTER'];
if (holderPid == null || childPid == null) {
throw StateError(
'Failed to capture child PIDs from start wrapper: $pids',
final pids = await _scrapeTwoPids(flutterProcess);
holderPid = pids['HOLDER'];
childPid = pids['FLUTTER'];
if (holderPid == null || childPid == null) {
throw StateError(
'Failed to capture child PIDs from start wrapper: $pids',
);
}

// 10. Wait for the web server to be ready (look for "is being served at"
// line in the log), then navigate Chrome FIRST so the debug service
// has a client to emit the VM Service URI to. Scraping the URI before
// navigation deadlocks: -d web-server only emits "Debug service
// listening on ws://..." AFTER a debugger client connects.
await _runWebServerReadyWait(logFile);
await cdpChromeNavigator(cdpPort, 'http://localhost:$webPort/');

// 11. NOW scrape the VM Service URI emitted by DWDS once Chrome connected.
final vmServiceUri = await _runVmServiceScrape(logFile);

// 12. Write state with the new CDP fields so StopCommand can reap Chrome.
await StateFile.write(<String, dynamic>{
'pid': childPid,
'stdinPipe': fifoPath,
'stdinHolderPid': holderPid,
'vmServiceUri': vmServiceUri,
'webPort': webPort,
'vmServicePort': vmServicePort,
'startedAt': DateTime.now().toUtc().toIso8601String(),
'profile': profileStatic ? 'static' : 'debug',
'projectRoot': Directory.current.path,
'device': device,
'chromePid': chromeProcess.pid,
'tmpProfileDir': tmpProfileDir,
'cdpPort': cdpPort,
});

ctx.output.success('chrome pid=${chromeProcess.pid} (cdpPort=$cdpPort)');
ctx.output.success('flutter run pid=$childPid');
ctx.output.success('vmServiceUri=$vmServiceUri');
ctx.output.success('state=${StateFile.path}');
ctx.output.success('log=${logFile.path}');
return 0;
} catch (error) {
// 13. Best-effort reap of everything launched above so a post-Chrome
// failure leaks no Chrome, no flutter web-server, no FIFO, no tmp
// profile dir. Every action is individually guarded and swallows so
// one cleanup failure cannot abort the rest; the error surfaced to
// the operator is the ORIGINAL throw, never a cleanup error. This is
// best-effort SIGTERM only (no SIGKILL grace loop): the OS reaps
// detached children and `fsa stop` is the deliberate full reaper.
_reapAfterCdpFailure(
flutterProcess: flutterProcess,
holderPid: holderPid,
childPid: childPid,
chromeProcess: chromeProcess,
fifoPath: fifoPath,
tmpProfileDir: tmpProfileDir,
);
ctx.output.error('CDP start failed after launch: $error');
return 1;
}
}

// 8. Wait for the web server to be ready (look for "is being served at"
// line in the log), then navigate Chrome FIRST so the debug service
// has a client to emit the VM Service URI to. Scraping the URI before
// navigation deadlocks: -d web-server only emits "Debug service
// listening on ws://..." AFTER a debugger client connects.
await _runWebServerReadyWait(logFile);
await cdpChromeNavigator(cdpPort, 'http://localhost:$webPort/');
/// Best-effort reap of every child the CDP branch spawned, invoked only from
/// the failure-cleanup catch in [_handleCdpBranch]. Mirrors the kill + rm
/// cascade of `StopCommand._reapChrome` but without a SIGKILL grace loop:
/// this is the failure path, the handles are still held, and `fsa stop`
/// remains the deliberate full reaper.
///
/// Every action is wrapped in its own `try`/swallow so a single failure
/// (process already gone, missing FIFO, locked profile dir) never aborts the
/// remaining cleanup and never replaces the original error surfaced upstream.
void _reapAfterCdpFailure({
required Process? flutterProcess,
required int? holderPid,
required int? childPid,
required Process chromeProcess,
required String fifoPath,
required String tmpProfileDir,
}) {
// 1. SIGTERM the flutter child + holder. When the PIDs were captured, reap
// by pid (the detached holder + child outlive the wrapper handle);
// otherwise fall back to killing the wrapper Process handle directly.
if (childPid != null || holderPid != null) {
for (final pid in <int?>[childPid, holderPid]) {
if (pid == null) continue;
try {
cdpKillPid(pid, ProcessSignal.sigterm);
} catch (_) {
// Non-fatal: the process may already be gone.
}
}
} else if (flutterProcess != null) {
try {
flutterProcess.kill();
} catch (_) {
// Non-fatal.
}
}

// 9. NOW scrape the VM Service URI emitted by DWDS once Chrome connected.
final vmServiceUri = await _runVmServiceScrape(logFile);
// 2. SIGTERM Chrome via the held handle.
try {
chromeProcess.kill();
} catch (_) {
// Non-fatal.
}

// 10. Write state with the new CDP fields so StopCommand can reap Chrome.
await StateFile.write(<String, dynamic>{
'pid': childPid,
'stdinPipe': fifoPath,
'stdinHolderPid': holderPid,
'vmServiceUri': vmServiceUri,
'webPort': webPort,
'vmServicePort': vmServicePort,
'startedAt': DateTime.now().toUtc().toIso8601String(),
'profile': profileStatic ? 'static' : 'debug',
'projectRoot': Directory.current.path,
'device': device,
'chromePid': chromeProcess.pid,
'tmpProfileDir': tmpProfileDir,
'cdpPort': cdpPort,
});
// 3. Delete the FIFO file.
try {
final fifo = File(fifoPath);
if (fifo.existsSync()) fifo.deleteSync();
} catch (_) {
// Non-fatal: a stale FIFO is harmless.
}

ctx.output.success('chrome pid=${chromeProcess.pid} (cdpPort=$cdpPort)');
ctx.output.success('flutter run pid=$childPid');
ctx.output.success('vmServiceUri=$vmServiceUri');
ctx.output.success('state=${StateFile.path}');
ctx.output.success('log=${logFile.path}');
return 0;
// 4. Delete the tmp profile dir (mirrors _reapChrome's rm).
try {
final dir = Directory(tmpProfileDir);
if (dir.existsSync()) dir.deleteSync(recursive: true);
} catch (_) {
// Non-fatal: a stale profile directory is not worth surfacing.
}
}

/// Spawns the FIFO-wrapped flutter run process. Common to both the default
34 changes: 29 additions & 5 deletions skills/fluttersdk-artisan/references/state-and-recovery.md
@@ -278,21 +278,45 @@ Cause: `artisan_start` on Windows.

V1 limitation. Surface to the user; there is no agent-side recovery.

### `Port <web-port> is already in use` (CDP path)

Cause: `artisan_start --cdp-port=<N> --port=<web-port>` detects that the
web port (the `--port` value, not the CDP port) is already in use and fails
fast before spawning any processes. The error names the busy web port and
suggests running `fsa stop` or selecting a different `--port`.

Recovery:

```bash
lsof -ti tcp:<web-port> # find the squatter on the web port
kill <squatter pid> # or pick a different --port
./bin/fsa start --cdp-port=<N> --port=<new-web-port>
```

The port probe runs before Chrome and the flutter web-server launch, so
no orphaned processes are left. This fail-fast behavior replaces the prior
90s timeout that could leave stale sessions (issue #25).
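
For reference, the fail-fast check amounts to a bind-and-close on loopback. The following is a minimal sketch modeled on `defaultPortProbe` from this change; the name `isPortFree` and the `main` harness are illustrative only and are not part of the CLI.

```dart
import 'dart:io';

/// Hypothetical stand-in for the probe: returns true when [port] can be
/// bound on loopback IPv4, false when the bind raises a SocketException.
Future<bool> isPortFree(int port) async {
  try {
    final socket = await ServerSocket.bind(InternetAddress.loopbackIPv4, port);
    // Release the port right away so the real server can bind it next.
    await socket.close();
    return true;
  } on SocketException {
    return false;
  }
}

Future<void> main() async {
  // Simulate a squatter by holding an OS-assigned ephemeral port open.
  final squatter = await ServerSocket.bind(InternetAddress.loopbackIPv4, 0);
  final free = await isPortFree(squatter.port);
  print('port ${squatter.port} free while held: $free');
  await squatter.close();
}
```

Because the probe closes its socket before anything else launches, another process could still grab the port in the gap between the probe and the flutter web-server bind; the check narrows the failure window rather than eliminating it.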

### `Chrome failed to open debug port <port>`

Cause: `artisan_start --cdp-port=<N>` with a busy port, or Chrome
missing.
Cause: `artisan_start --cdp-port=<N>` with Chrome missing, or a Chrome
initialization failure after the port probe passed.

```bash
lsof -ti tcp:<N> # find the squatter
kill <squatter pid> # or pick a different port
./bin/fsa start --cdp-port=<other>
# Confirm Chrome is installed:
which google-chrome # Linux
ls /Applications/Google\ Chrome.app # macOS
```

If Chrome is missing: the `Chrome binary not found` message names the
expected path (macOS: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`;
Linux: `google-chrome` on PATH).

If launch fails after the port probe, `artisan_start` makes a best-effort
attempt to reap whatever was already spawned (Chrome process, flutter
web-server, FIFO pipe, temporary profile directory) before returning the
original error (issue #25). Run `artisan_start` again once Chrome is healthy.
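
The cleanup is best-effort: each action is guarded on its own, so one failing step neither aborts the rest nor replaces the original error. A minimal sketch of that pattern follows; `reapBestEffort` is a hypothetical helper written for illustration, not the `_reapAfterCdpFailure` method itself.

```dart
import 'dart:io';

/// Hypothetical helper: runs every cleanup action inside its own guard and
/// returns how many completed without throwing.
int reapBestEffort(List<void Function()> actions) {
  var completed = 0;
  for (final action in actions) {
    try {
      action();
      completed++;
    } catch (_) {
      // Swallowed on purpose: a cleanup failure must not mask the
      // original launch error or stop the remaining actions.
    }
  }
  return completed;
}

void main() {
  // A throwaway directory stands in for the temporary Chrome profile dir.
  final tmp = Directory.systemTemp.createTempSync('reap-sketch-');
  final completed = reapBestEffort([
    // Stands in for killing a process that has already exited.
    () => throw StateError('process already gone'),
    // Still runs even though the previous action threw.
    () => tmp.deleteSync(recursive: true),
  ]);
  print('completed=$completed profileDirExists=${tmp.existsSync()}');
}
```

Anything the failure path could not remove is left for `fsa stop`, which remains the deliberate full reaper.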

### `Flutter SDK <X> is older than 3.30.0` (CDP path)

Cause: `artisan_start --cdp-port` requires the WebSocket hot reload fix