5 changes: 3 additions & 2 deletions TDD_IMPLEMENTATION_PLAN.md
@@ -12,8 +12,8 @@

| Component | Location | Status | TDD Value |
|-----------|----------|--------|-----------|
| Standalone MCP server | `adb_vision/server.py` | ✅ **DONE** | 6 tools, DroidCast primary backend, action logging |
| ADB screenshot backends | `adb_vision/screenshot.py` | ✅ **DONE** | DroidCast/u2/scrcpy — 17 unit tests passing |
| Standalone MCP server | `adb_vision/server.py` | ✅ **DONE** | 6 tools, structured audit logging, fixed tool surface |
| ADB screenshot backends | `adb_vision/screenshot.py` | ✅ **DONE** | DroidCast/u2/scrcpy/screenrecord with backend audit coverage |
| ADB raw tap/swipe/keyevent | `adb_vision/server.py` | ✅ **DONE** | Via `adb_tap`, `adb_swipe`, `adb_keyevent` tools |
| ALAS state machine | `alas_wrapped/module/ui/page.py` | Reference only | 43 pages, 98 transitions — extract knowledge, not code |
| MEmu config | `docs/dev/memu_playbook.md` | Documented | Admin-at-startup solved via memuc.exe |
@@ -58,6 +58,7 @@ def test_screenshot_returns_valid_image():

**Status:** Implemented and tested. See `adb_vision/screenshot.py` (DroidCast/u2/scrcpy backends) and
`adb_vision/test_server.py` (17 unit tests passing). Live test in `adb_vision/test_live.py`.
- Operational note: `adb_vision/diagnose.py` is the first-line health gate when screenshots are black; it must treat live ADB as authoritative even when `memuc.exe` requires elevation.
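
A minimal sketch of what "live ADB is authoritative" could look like in a health gate. It is not taken from `adb_vision/diagnose.py`: the function names, the `memuc_ok` flag, and the example serial `127.0.0.1:21503` are assumptions for illustration. Only the `adb devices` text format (a header line followed by `serial<TAB>state` rows) is relied on.

```python
from typing import Dict, Optional


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each serial to its state from `adb devices` text output."""
    devices: Dict[str, str] = {}
    for raw in output.splitlines():
        line = raw.strip()
        # Skip blanks, the header line, and "* daemon ..." status lines.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def emulator_reachable(serial: str, adb_output: str, memuc_ok: Optional[bool]) -> bool:
    """Decide health from live ADB alone (hypothetical helper).

    memuc_ok is None or False when memuc.exe could not run, for example
    because it needs elevation. The caller may log it, but it never
    overrides what ADB reports.
    """
    return parse_adb_devices(adb_output).get(serial) == "device"


if __name__ == "__main__":
    # Example serial only; substitute the serial your emulator exposes.
    sample = "List of devices attached\n127.0.0.1:21503\tdevice\n"
    print(emulator_reachable("127.0.0.1:21503", sample, memuc_ok=None))
```

With this shape, a failed or unprivileged `memuc.exe` call cannot turn a healthy device into a failed check; only an `offline`, `unauthorized`, or missing ADB entry does.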

### P0-T3: Raw Input Tests
```python
...
```
5 changes: 3 additions & 2 deletions adb_vision/README.md
@@ -37,7 +37,7 @@ adb_vision/

| Tool | Description |
|------|-------------|
| `adb_screenshot(method)` | Screenshot via pluggable backend (auto/droidcast/scrcpy/u2/screencap) |
| `adb_screenshot(method)` | Screenshot via pluggable backend (auto/droidcast/scrcpy/u2/screenrecord/screencap) |
| `adb_tap(x, y)` | Tap coordinate |
| `adb_swipe(x1, y1, x2, y2, duration_ms)` | Swipe gesture |
| `adb_keyevent(keycode)` | Send key event (4=BACK, 3=HOME) |
@@ -48,11 +48,12 @@ adb_vision/

The screenshot problem: **`adb shell screencap` returns blank images on MEmu/VirtualBox** because the GPU never populates the Linux framebuffer.

Three alternative backends are being implemented (see GitHub issues #40-#42):
Alternative backends for MEmu/VirtualBox:

1. **DroidCast** (#40) — APK that streams screen over HTTP via SurfaceControl API
2. **scrcpy** (#41) — H.264 stream decoded to single frame
3. **uiautomator2 ATX** (#42) — ATX agent HTTP API screenshot endpoint
4. **screenrecord** — short MP4 capture + ffmpeg first-frame extraction; slower, but works when the live HTTP paths fail

The `method="auto"` default tries each backend in order until one returns a valid (>5KB) image.
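
The fallback behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `adb_vision/screenshot.py`: `capture_auto`, `MIN_VALID_BYTES`, and the shape of the backend list are hypothetical, and the only detail taken from the README is that an image must exceed 5KB to count as valid.

```python
from typing import Callable, Iterable, List, Tuple

# README: an image counts as valid only if it is larger than 5KB.
MIN_VALID_BYTES = 5 * 1024


def capture_auto(backends: Iterable[Tuple[str, Callable[[], bytes]]]) -> Tuple[str, bytes]:
    """Try each (name, capture) pair in order; return the first valid image."""
    errors: List[str] = []
    for name, capture in backends:
        try:
            data = capture()
        except Exception as exc:  # a failing backend must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if len(data) > MIN_VALID_BYTES:
            return name, data
        # Blank or near-empty frames are treated as failures too.
        errors.append(f"{name}: only {len(data)} bytes")
    raise RuntimeError("all screenshot backends failed: " + "; ".join(errors))
```

Collecting the per-backend failure reasons into the final error keeps the "why did auto fail" question answerable from a single log line.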

189 changes: 189 additions & 0 deletions adb_vision/desktop_capture.ps1
@@ -0,0 +1,189 @@
<#
.SYNOPSIS
    Capture a screenshot of the MEmu emulator window using .NET drawing.
.PARAMETER OutputPath
    Path to save the PNG screenshot.
.PARAMETER AutoDelete
    If set, deletes the screenshot after this many seconds. Default: 300 (5 min).
#>
param(
    [Parameter(Mandatory=$true)]
    [string]$OutputPath,

    [int]$AutoDelete = 300
)

Add-Type -AssemblyName System.Windows.Forms
Add-Type -AssemblyName System.Drawing
Add-Type -AssemblyName Microsoft.VisualBasic

# Find the actual MEmu VM window first.
# Exact title lookup for "MEmu" can resolve to a tiny hidden helper window,
# which produces useless 50x15 captures. Prefer the real process window.
Add-Type @"
using System;
using System.Runtime.InteropServices;
public class WinAPI {
    [DllImport("user32.dll")]
    public static extern IntPtr GetForegroundWindow();

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool SetForegroundWindow(IntPtr hWnd);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool GetWindowRect(IntPtr hWnd, out RECT lpRect);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool IsIconic(IntPtr hWnd);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool ShowWindow(IntPtr hWnd, int nCmdShow);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool PrintWindow(IntPtr hWnd, IntPtr hdcBlt, int nFlags);
}

public struct RECT {
    public int Left;
    public int Top;
    public int Right;
    public int Bottom;
}
"@

function Get-WindowRectObject {
    param([IntPtr]$Handle)
    $rect = New-Object RECT
    if (-not [WinAPI]::GetWindowRect($Handle, [ref]$rect)) {
        return $null
    }
    [pscustomobject]@{
        Rect = $rect
        Width = $rect.Right - $rect.Left
        Height = $rect.Bottom - $rect.Top
        Area = ($rect.Right - $rect.Left) * ($rect.Bottom - $rect.Top)
    }
}

function Get-CandidateWindow {
    param([System.Diagnostics.Process[]]$Processes)

    $best = $null
    foreach ($proc in $Processes) {
        if ($proc.MainWindowHandle -eq 0) { continue }
        $title = ($proc.MainWindowTitle | Out-String).Trim()
        if (-not $title) { continue }

        $rectInfo = Get-WindowRectObject -Handle ([IntPtr]$proc.MainWindowHandle)
        if ($null -eq $rectInfo) { continue }
        if ($rectInfo.Width -le 200 -or $rectInfo.Height -le 200) { continue }

        $candidate = [pscustomobject]@{
            ProcessId = $proc.Id
            Handle = [IntPtr]$proc.MainWindowHandle
            Title = $title
            Rect = $rectInfo.Rect
            Width = $rectInfo.Width
            Height = $rectInfo.Height
            Area = $rectInfo.Area
        }

        if ($null -eq $best -or $candidate.Area -gt $best.Area) {
            $best = $candidate
        }
    }
    return $best
}

$memuWindow = Get-CandidateWindow -Processes (
    Get-Process -Name "MEmu" -ErrorAction SilentlyContinue
)

if ($null -eq $memuWindow) {
    $fallbackProcesses = Get-Process | Where-Object {
        $_.MainWindowHandle -ne 0 -and (
            $_.ProcessName -like "*MEmu*" -or
            $_.MainWindowTitle -like "*(MEmu*" -or
            $_.MainWindowTitle -like "*MEmu*"
        )
    }
    $memuWindow = Get-CandidateWindow -Processes $fallbackProcesses
}

if ($null -eq $memuWindow) {
    Write-Error "MEmu window not found"
    exit 1
}

$memuHwnd = $memuWindow.Handle

# Save the currently active window so we can restore it
$previousWindow = [WinAPI]::GetForegroundWindow()

# If MEmu is minimized, restore it
if ([WinAPI]::IsIconic($memuHwnd)) {
    # SW_RESTORE; discard the returned bool so it does not leak into the script output
    [WinAPI]::ShowWindow($memuHwnd, 9) | Out-Null
    Start-Sleep -Milliseconds 500
}

# Bring MEmu to front
[WinAPI]::ShowWindow($memuHwnd, 5) | Out-Null # SW_SHOW
try {
    [Microsoft.VisualBasic.Interaction]::AppActivate($memuWindow.ProcessId) | Out-Null
} catch {
    [WinAPI]::SetForegroundWindow($memuHwnd) | Out-Null
}
Start-Sleep -Milliseconds 300

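# Get window rect. Re-read it here because the rect cached during candidate
# selection can be stale once the window has been restored or activated.
$rectInfo = Get-WindowRectObject -Handle $memuHwnd
if ($null -eq $rectInfo) { $rectInfo = $memuWindow }
$rect = $rectInfo.Rect
$width = $rectInfo.Width
$height = $rectInfo.Height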

if ($width -le 0 -or $height -le 0) {
    Write-Error "Invalid window dimensions: ${width}x${height}"
    # Restore previous window
    [WinAPI]::SetForegroundWindow($previousWindow) | Out-Null
    exit 1
}

# Capture the window. Try PrintWindow first (flag 2 = PW_RENDERFULLCONTENT),
# then fall back to copying the on-screen region if PrintWindow fails.
$bitmap = New-Object System.Drawing.Bitmap($width, $height)
$graphics = [System.Drawing.Graphics]::FromImage($bitmap)
$hdc = $graphics.GetHdc()
$printed = $false
try {
    $printed = [WinAPI]::PrintWindow($memuHwnd, $hdc, 2)
} finally {
    $graphics.ReleaseHdc($hdc)
}
if (-not $printed) {
    $graphics.CopyFromScreen($rect.Left, $rect.Top, 0, 0,
        (New-Object System.Drawing.Size($width, $height)))
}
$graphics.Dispose()

# Save
$dir = Split-Path -Parent $OutputPath
if ($dir -and !(Test-Path $dir)) { New-Item -ItemType Directory -Path $dir -Force | Out-Null }
$bitmap.Save($OutputPath, [System.Drawing.Imaging.ImageFormat]::Png)
$bitmap.Dispose()

Write-Output "Screenshot saved: $OutputPath"

# Restore previous window
[WinAPI]::SetForegroundWindow($previousWindow) | Out-Null

🔥 The Roast: Start-Job fires a background PowerShell job inside a script that's invoked as a one-shot subprocess from Python. When this script exits, the background job dies with it — it's not a persistent daemon, it's a fire-and-forget that gets immediately forgotten by the OS. Your 5-minute auto-delete should be called the "optimistic cleanup feature" because it will never run unless the caller holds the PS session open.

🩹 The Fix: Either use a scheduled task (Register-ScheduledJob/Register-ScheduledTask) for deferred cleanup, or just let the Python caller delete the file once it has finished consuming it. Jobs created with Start-Job live only as long as the PowerShell session that started them, so they cannot outlive a one-shot -File or -Command invocation.

📏 Severity: warning
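
A rough sketch of the "let the Python caller handle deletion" option, assuming the caller shells out to the script. `capture_and_consume`, its `consume` callback, and the injectable `run` parameter are hypothetical names, not part of `adb_vision`. `-OutputPath` and `-AutoDelete` are the script's own parameters, and passing `-AutoDelete 0` skips the `Start-Job` branch because the script only schedules cleanup when the value is greater than zero.

```python
import os
import subprocess
import tempfile
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def capture_and_consume(
    command_prefix: Sequence[str],
    consume: Callable[[bytes], T],
    run=subprocess.run,  # injectable so the flow can be exercised without PowerShell
) -> T:
    """Run the capture script, hand the PNG bytes to `consume`, then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".png")
    os.close(fd)  # the script writes the file itself; we only need the path
    try:
        run(
            [*command_prefix, "-OutputPath", path, "-AutoDelete", "0"],
            check=True,
            timeout=30,
        )
        with open(path, "rb") as fh:
            return consume(fh.read())
    finally:
        # Cleanup happens here, in the process that actually outlives the capture.
        if os.path.exists(path):
            os.remove(path)
```

A caller might pass a prefix such as `["powershell", "-NoProfile", "-ExecutionPolicy", "Bypass", "-File", "adb_vision/desktop_capture.ps1"]`; the exact invocation used by the project is not shown in this diff, so treat that as an example only.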

# Schedule auto-delete
if ($AutoDelete -gt 0) {
    Start-Job -ScriptBlock {
        param($path, $delay)
        Start-Sleep -Seconds $delay
        if (Test-Path $path) { Remove-Item $path -Force }
    } -ArgumentList $OutputPath, $AutoDelete | Out-Null
}