AutoTester: Docker-based in-game integration test framework #491
Merged
Adds a docker-based autotest framework that runs real Minecraft server + client containers, drives the UI through a file-based JSON bridge (AutoTestBridge), and validates the modpack sync flow against 22 version/loader targets. Includes mod-side changes required to expose UI state for testing: AutoTestBridge, async certificate trust helpers, EditBox inputText persistence, and infrastructure cleanup for unsupported MC versions.
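To make the file-based bridge idea concrete, here is a minimal Python sketch of the client side. It is not the PR's `BridgeClient`: the file names (`request.json`, `response-<id>.json`), the `id`/`op`/`params` message shape, and the class name `FileBridgeClient` are assumptions made for illustration. Only the per-call `timeout=` and the `random.uniform(0.03, 0.07)` poll jitter come from the commit notes.

```python
import json
import os
import random
import time
import uuid
from pathlib import Path
from typing import Optional


class FileBridgeClient:
    """Hypothetical client: one request file in, one response file out."""

    def __init__(self, directory: Path, timeout: float = 30.0):
        self.dir = Path(directory)
        self.timeout = timeout  # default wait, overridable per call

    def request(self, op: str, timeout: Optional[float] = None, **params) -> dict:
        request_id = uuid.uuid4().hex
        payload = {"id": request_id, "op": op, "params": params}

        # Write to a temp file and rename it into place, so the reader
        # never observes a half-written request.
        tmp = self.dir / f"request-{request_id}.tmp"
        tmp.write_text(json.dumps(payload), encoding="utf-8")
        os.replace(tmp, self.dir / "request.json")

        response_path = self.dir / f"response-{request_id}.json"
        wait = self.timeout if timeout is None else timeout
        deadline = time.monotonic() + wait
        while time.monotonic() < deadline:
            try:
                text = response_path.read_text(encoding="utf-8")
            except FileNotFoundError:
                # Jittered poll so concurrent clients do not hit the disk in lockstep.
                time.sleep(random.uniform(0.03, 0.07))
                continue
            response_path.unlink()
            return json.loads(text)
        raise TimeoutError(f"no bridge response to {op!r} within {wait}s")
```

A caller would use a short timeout for the first ping, for example `client.request("ping", timeout=5)`, and the default for everything else. The sketch assumes the responder also writes its response atomically (temp file plus rename); otherwise the client could read a partial file.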
…re discovered via autotester
…sions + better docs why
…heck, click_restart error handling
- Fix TOCTOU races in accept_fingerprint and wait_join: capture get_screen once instead of calling it 4x per iteration (was reading different screen states)
- Add _jitter_sleep helper with ±20% random jitter to all poll loops, reducing thundering herd file-I/O contention when multiple tests run concurrently under jobs > 1
- _container_logs now accepts tail=N to fetch only last N lines instead of full log; _wait_for_log and _read_fingerprint use tail=200
- _phase_connect: add _assert_running health check, increase inner poll from 15s to 45s (overloaded PC may need longer), compute remaining time
- _phase_click_restart: raise RuntimeError if no restart button found instead of silently passing and hanging on _wait_exited
- BridgeClient.request accepts per-call timeout= kwarg; wait_bridge uses timeout=5 for initial ping (was 30s, kept stalling when Java bridge hadn't entered its poll loop yet)
- BridgeClient internal poll uses random.uniform(0.03, 0.07) jitter
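The two polling fixes above (jittered sleeps and reading the screen once per iteration) can be sketched together. This is an illustrative approximation, not the code from the PR: `jittered`, `jitter_sleep`, and `wait_for_screen` are hypothetical names, and the screen snapshot is assumed to be a dict with a `name` key.

```python
import random
import time


def jittered(base: float, spread: float = 0.2) -> float:
    """Return base scaled by a random factor in [1 - spread, 1 + spread]."""
    return base * random.uniform(1.0 - spread, 1.0 + spread)


def jitter_sleep(base: float) -> None:
    """Sleep roughly `base` seconds (+/-20%) so concurrent pollers drift apart."""
    time.sleep(jittered(base))


def wait_for_screen(get_screen, wanted, timeout: float, interval: float = 0.5):
    """Poll until the screen name is in `wanted` and return the matching snapshot.

    get_screen is called exactly once per iteration and every check in that
    iteration uses the same snapshot, so the checks cannot disagree with
    each other (the TOCTOU problem of re-reading state between checks).
    """
    deadline = time.monotonic() + timeout
    while True:
        screen = get_screen()  # single read per iteration
        if screen.get("name") in wanted:
            return screen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"screen never became one of {sorted(wanted)}")
        jitter_sleep(interval)
```

Returning the snapshot that satisfied the check lets the caller act on exactly the state it matched, rather than reading the screen again and possibly seeing a newer one.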
… to a server before we even load all of the minecraft assets
thenApply runs its lambda on the completing thread (pool-5-thread-* from DownloadClient's executor), where Forge's ModuleClassLoader.findClass throws ClassNotFoundException (CNFE) for classes being loaded for the first time. Switch to thenApplyAsync so the lambda lands on ForkJoinPool.commonPool(), whose workers can resolve classes normally.
The .handle callback ran on the Connection executor thread (pool-5-thread-*), where ModuleClassLoader.findClass can fail for unresolved classes. Switch to handleAsync so it runs on ForkJoinPool.commonPool(), like the DataC2SPacket fix.
…, clean up

Production fixes:
- DataC2SPacket: restore original flow so the client secret is still saved when the modpack content can't be fetched but the host is reachable (the async rewrite had dropped that case via an early return).
- DownloadClient: remove the now-dead sync constructor + establishProbeConnection/recoverProbeConnection (superseded by createAsync).
- Rename ModpackUpdater.CheckAndLoadModpack -> checkAndLoadModpack.
- Revert the whole-file spaces->tabs reindentation of Preload.java; keep only the real change (load installed modpack when updateSelectedModpackOnLaunch is off).
- Normalize stray tabs -> spaces (DataC2SPacket, ModpackUtils, ScreenImpl, ClientLoginNetworkAddon).
- Drop unused AutoTestBridge imports from Fabric/Forge/NeoForge init.

Autotester:
- Remove the unused render/menu/close bridge ops and their backing code (FontRenderMixin, RenderedTextCollector, FormattedText, mixins.json entry).
- Remove unused BridgeClient helpers (buttons, text_fields, click_point).
- Clearer phase names: wait_danger -> wait_download_prompt, click_confirm -> confirm_download, click_restart -> confirm_restart.
- Quiet the dev mixins: single client-ready log in onClientReady(), drop the 100ms INFO spam and System.out.println.
- cli.py: move the run() return out of the finally block so real errors aren't swallowed.
- README: correct phase table and bridge-op list; document skip_fingerprint and verify_mods.
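The cli.py item relies on a Python rule that is easy to miss: a `return` inside `finally` discards any exception that is propagating at that point. The sketch below shows the general pattern only; `run_swallowing`, `run_fixed`, `action`, and `cleanup` are hypothetical names, not the actual cli.py code.

```python
def run_swallowing(action, cleanup) -> int:
    """Buggy shape: returning from `finally` drops the in-flight exception."""
    code = 1
    try:
        action()
        code = 0
    finally:
        cleanup()
        return code  # any exception raised by action() is silently discarded


def run_fixed(action, cleanup) -> int:
    """Fixed shape: cleanup stays in `finally`, the return moves after it."""
    try:
        action()
    finally:
        cleanup()  # still always runs
    return 0  # only reached when action() did not raise
```

With the first shape a failing `action` just yields exit code 1 and the traceback is lost; with the second, cleanup still runs and the original exception reaches the caller.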
Rebased onto main's 26.2 port and hooked 26.2 into the autotester for both Fabric and NeoForge.

Build/wiring:
- ModuleUtils: map 26.2 neoforge onto our fml11 module (main used fml10; this branch moved 26.x to fml11).
- ScreenImpl.setScreen: add the >=26.2 gui.setScreen conditional to match getScreen (kept our non-backgroundExecutor variant).
- stonecutter: active project set to 26.2-fabric.
- autotester targets: add 26.2-fabric (loader 0.19.3) and 26.2-neoforge (26.2.0.7-beta), Java 25.

Deps:
- Gradle 9.4.1 -> 9.6.0, fabric-loom(+remap) 1.16 -> 1.17-SNAPSHOT (loom 1.17 needs Gradle 9.5+), shadow 9.4.1 -> 9.4.3, kotlin 2.3.0 -> 2.3.21. moddevgradle already latest (2.0.141).

AutoTestBridge / dev mixin: read the current screen via the existing ScreenManager().getScreen() accessor and set the title screen via minecraft.setScreen, so the 26.2 gui.* moves are covered by the port's existing stonecutter replacements + ScreenImpl conditional. No new stonecutter replacements added.

Builds clean for 26.2 fabric+neoforge (and 1.21.11 sanity).

The 26.2 in-game autotest still fails: HeadlessMC 2.9.0's LWJGL stub leaves org.lwjgl.system.Configuration.SHARED_LIBRARY_EXTRACT_PATH null, which MC 26.2's new NativeLibrariesBootstrap reads -> NPE before any mod loads. Needs a HeadlessMC-side fix.
Stock HeadlessMC can't launch MC 26.2 headlessly (its LWJGL stubs don't satisfy 26.2's new render backend). Build the client image's HeadlessMC launcher from a git repo/ref instead of downloading the prebuilt native release:
- docker/client/Dockerfile: multi-stage build that clones HEADLESSMC_REPO at HEADLESSMC_REF and compiles the launcher-wrapper jar with JDK 21, installed as a `java -jar` hmc shim.
- settings.yaml: headlessmc.repo/ref select the build (defaults to the patched Skidamek/headlessmc @ mc26.2-headless); point elsewhere to use another build.
- cli.py: pass repo/ref through as Docker build args.
- README: document the HeadlessMC build source.

Verified: full sync matrix passes for all 22 targets (both loaders, MC 1.18.2 through 26.2).
…regation

Address self-review comments on the autotester branch:
- neoforge: remove the forgified-fabric-api dependency, its >=1.21 guard and the FFAPI maven repo. Nothing in the neoforge source needs it (FabricInit is fabric-gated and FabricLoginMixin targets by string + @Pseudo), and it was never bundled or declared in neoforge.mods.toml. Verified 1.21.8/26.2 neoforge still compile.
- mixins: drop the BlockPos no-op workaround in LoginQueryResponse/Request login mixins. Keep targeting the real packet class (reversed to its old name on <1.20.2 by the existing stonecutter replacement) and gate only the body. Verified the no-op path (1.18.2/1.19.2) and the injection path (1.21.8/26.2).
- ingame-tests.yml: drop `merge-multiple: true` on the report download; it collided every target's results.json into one, so aggregation only saw a single target.
- run-headlessmc-client: remove the dead commented forge/neoforge install blocks; document why only fabric needs an explicit profile install.
- settings.yaml: remove the unused run.retryMax key.
- autotester package: remove the empty __init__.py and switch packaging to namespace discovery.

Claude-Session: https://claude.ai/code/session_01AQ1GKvoVqnwharKmXpbwSz
…ad + net robustness
- Test instrumentation (AutoTestBridge + mixin/dev/*) is excluded from the source set and stripped from the mixin config in normal builds, and only bundled with -Pautomodpack.autotest. build.yml gains an `autotest` input; the in-game-tests workflow and README build with it. Verified: fabric & neoforge release jars contain none of it; autotest-flagged jars do.
- Fix "disable update-on-launch" so the installed modpack still loads: split a pure ModpackUpdater.loadModpack() (no server contact, no file reconciliation) out of checkAndLoadModpack and route Preload's updates-disabled path to it, so a binary search no longer loses or rewrites the modpack.
- DownloadClient: close every connection opened during pool hydration if any parallel connect fails (was leaking sockets + non-daemon threads); run the blocking probe / cert prompt / login continuation on a dedicated daemon executor instead of ForkJoinPool.commonPool (DownloadClient, ModpackUtils, DataC2SPacket).
- Pin the HeadlessMC fork to a commit SHA; the client Dockerfile now fetches by ref (branch/tag/SHA) for reproducible autotester builds.

Claude-Session: https://claude.ai/code/session_01AQ1GKvoVqnwharKmXpbwSz
…, macros)

Scenarios are now data, not Python phases. A flow is a list of steps, where each step is a generic verb (click / type / wait_for / assert / verify_files / ...) plus arguments, so new tests are written entirely in YAML.

Engine (automodpack_autotester/engine/):
- registry: @verb decorator + name->fn lookup
- context: per-case state, ${...} templating, bridge/log access
- selectors: declarative GUI element matching (role/text/class/enabled/index)
- conditions: boolean predicates (screen/element/file/log/all/any/not) shared by when:, wait_for.until:, assert.that:; log conditions capture regex groups into vars
- steps_ui / steps_io: the UI and filesystem verbs
- executor: macro expansion, when-gating, repeat, optional, per-step results

runner.py keeps the Docker lifecycle helpers and registers the lifecycle verbs (launch_server, connect, wait_join, ...); run_case builds a Context and runs the flow through the engine, recording per-step results into results.json.

Scenarios rewritten declaratively on a shared macro library (scenarios/_lib.yaml: boot, accept_certificate, download_modpack, restart_client, rejoin). Behavior (selectors, screen names, fingerprint regex, connect retry) matches the old phases exactly.

Tests (tests/, no Docker): 33 unit + flow tests covering parsing, selectors, conditions, templating, polling, and the executor, plus running the real shipped scenarios/macros through a fake bridge.

Verified end to end on real Docker: 1.21.1-fabric sync passes (21 steps, full boot -> sync -> restart -> rejoin). README documents the verb/selector/condition/template/macro model.
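As a rough picture of the registry, templating, and executor pieces described above, here is a compact sketch. It is an assumption-laden approximation rather than the engine itself: the `VERBS` dict, `render`, `run_flow`, the dict-based `ctx`, and the `{verb: {args}}` step shape are all hypothetical, and the real engine also handles selectors, conditions, and macros.

```python
import re
from typing import Any, Callable, Dict, List

VERBS: Dict[str, Callable[..., Any]] = {}  # name -> function lookup


def verb(name: str):
    """Register a function under a verb name so data-driven steps can call it."""
    def decorator(fn):
        if name in VERBS:
            raise ValueError(f"duplicate verb: {name}")
        VERBS[name] = fn
        return fn
    return decorator


_TEMPLATE = re.compile(r"\$\{(\w+)\}")


def render(value: Any, variables: Dict[str, Any]) -> Any:
    """Expand ${name} in strings, recursing into lists and dicts."""
    if isinstance(value, str):
        return _TEMPLATE.sub(lambda m: str(variables[m.group(1)]), value)
    if isinstance(value, list):
        return [render(v, variables) for v in value]
    if isinstance(value, dict):
        return {k: render(v, variables) for k, v in value.items()}
    return value


@verb("type")
def type_text(ctx, text: str):
    ctx["typed"].append(text)  # stand-in for typing into a text field


@verb("click")
def click(ctx, text: str):
    ctx["clicked"].append(text)  # stand-in for clicking a button by label


def run_flow(flow: List[dict], ctx: dict) -> List[dict]:
    """Run steps in order, recording a per-step result; stop at the first failure."""
    results = []
    for step in flow:
        (name, args), = step.items()  # assumed step shape: {verb: {args}}
        try:
            VERBS[name](ctx, **render(args, ctx["vars"]))
            results.append({"verb": name, "ok": True})
        except Exception as exc:
            results.append({"verb": name, "ok": False, "error": str(exc)})
            break
    return results
```

The point of the decorator-based registry is that adding a verb is one decorated function; the step data (here plain dicts, in the PR YAML) never needs to change shape.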
- executor: `when`/`repeat` now apply to `use`/`group` steps too (were silently
ignored); normalize a bare-string step once and gate/repeat uniformly.
- config: add `parse_server_files()` (shared by runner + tests, removes the
triplicated serverFiles schema + default constants); memoize `load_macros()`.
- engine: drop dead `Context.case_dir`; add `${modpack_dir}` template var to
replace the repeated `automodpack/modpacks/${modpack}` literal.
- steps_ui/steps_io: extract `_await_element` / `_await_exist` to remove the
duplicated resolve-selector and wait-for-paths boilerplate.
- remove unused `relaunch_client` verb alias and `engine.get` re-export.
- tests: lock in when/repeat-on-macros behavior; reuse parse_server_files.
34 unit/flow tests pass; 1.21.1-fabric sync e2e still green.
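The executor change above (normalize once, then gate and repeat every step kind the same way) could look roughly like the following. This is a simplified sketch under stated assumptions: the `do`/`use`/`group`/`when`/`repeat` key layout is hypothetical, `when` is reduced to a flag lookup instead of the engine's condition predicates, and the macro body is invented placeholder content; only the macro name `accept_certificate` and `load_macros()` itself come from the commit notes.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_macros():
    """Stand-in for parsing the macro library; memoized, so it runs once."""
    return {"accept_certificate": ("wait_for_prompt", "click_trust")}


def normalize(step):
    """Turn a bare string such as "boot" into the dict form {"do": "boot"}."""
    return {"do": step} if isinstance(step, str) else dict(step)


def expand(flow, flags):
    """Flatten a flow into verb names, applying when/repeat to every step kind."""
    out = []
    for raw in flow:
        step = normalize(raw)  # normalize once, before any gating
        if "when" in step and not flags.get(step["when"], False):
            continue  # gated off: applies to plain, use, and group steps alike
        if "use" in step:
            body = expand(load_macros()[step["use"]], flags)
        elif "group" in step:
            body = expand(step["group"], flags)
        else:
            body = [step["do"]]
        out.extend(body * int(step.get("repeat", 1)))
    return out
```

Because gating and repetition happen after normalization and before the step kind is inspected, there is no separate code path where a `use` or `group` step could skip them.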
Add reusable step.autotester-tests workflow that runs the new pytest engine suite, and wire it into:
- gradle.yml (Dev Builds) so every push validates the autotester engine
- ingame-tests.yml as a fast-fail gate before the build + Docker matrix

No change to release.yml (releases still never pass -Pautomodpack.autotest).
An end-to-end test harness for AutoModpack's sync flow. It spins up a real Minecraft server + headless client in Docker, drives the client's in-game UI through a small file-based JSON bridge, and verifies a full sync (connect → trust certificate → download → restart → rejoin) across every supported target (MC 1.18.2–26.2, fabric/forge/neoforge).

Building it surfaced several real bugs, so their fixes are included here too.

Writing tests (`autotester/`)

Tests are declarative YAML; no Python is needed to add one. A scenario's `flow` is a list of steps built from generic verbs.

Verbs (`click`, `type`, `wait_for`, `verify_files`, …), selectors (match GUI elements by text/role/state), conditions (`screen`, `element`, `file`, `log`, …), `${...}` templating, and reusable macros are documented in `autotester/README.md`. New behavior a verb can't express is added once in the engine, then reused from YAML.

CLI: `autotester build-images | run | clean`.

CI: the in-game matrix runs on manual dispatch (`.github/workflows/ingame-tests.yml`); the engine also has fast, Docker-free unit tests that run on every push.

Test code never ships in releases

`AutoTestBridge` and its `dev` mixins are compiled and bundled only under `-Pautomodpack.autotest` (autotester builds pass it; releases don't). In a normal build they're excluded from the source set and stripped from the mixin config, and they're additionally gated at runtime behind `-Dautomodpack.autotest=true`. Verified for both fabric and neoforge.

HeadlessMC

Stock HeadlessMC can't launch MC 26.2 headlessly, so the client image builds a patched fork. The repo/ref live in `autotester/settings.yaml`, pinned to a commit SHA for reproducibility; point `ref` at the upstream tag once the patch lands.

Bug fixes surfaced while building it

- `updateSelectedModpackOnLaunch=false`: the installed modpack wasn't loaded at all; it now loads without contacting the server, so you can binary-search mods without AutoModpack restoring or deleting them.
- … `ForkJoinPool.commonPool`.
- … (`fml10`/`fml11`).

Other

- … (`build`, not `mergeJar` directly).

How to run

Results land in `autotester/out/`. See `autotester/README.md` for the full reference.