feat(forge): mutation testing#13091

Draft
gakonst wants to merge 152 commits into master from mutation-testing-fast

@gakonst
Member

@gakonst gakonst commented Jan 15, 2026

Summary

Implements parallel mutation testing for Foundry, building on PR #11996 from @emo-eth.

Key Features

  • Parallel execution via rayon thread pool with configurable --mutation-jobs
  • Isolated TempDir workspaces per mutant for safe concurrent execution
  • Adaptive span skipping shared across workers (skip mutations in spans where a mutation already survived).
    Note: the number of skipped/invalid mutants may vary with worker count since more parallelism means more mutants start testing before any complete and mark spans as survived.
  • Fail-fast per mutant (stops on first test failure)
  • Symlinks for lib directories to avoid expensive copies
  • Preserves project layout (custom src/test/libs paths supported)
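The adaptive span skipping described above can be pictured as a set of survived spans behind a lock that every worker consults before testing a mutant. The following is a minimal sketch under that assumption; `SurvivedSpans`, `Span`, and both method names are hypothetical and are not taken from the PR.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Byte range of a mutated source span (hypothetical representation).
pub type Span = (usize, usize);

/// Spans in which a mutant has already survived, shared by all workers.
#[derive(Clone, Default)]
pub struct SurvivedSpans(Arc<Mutex<HashSet<Span>>>);

impl SurvivedSpans {
    /// Checked by a worker before it starts testing a mutant.
    pub fn should_skip(&self, span: Span) -> bool {
        self.0.lock().unwrap().contains(&span)
    }

    /// Called by a worker once a mutant in `span` survives the test suite.
    pub fn mark_survived(&self, span: Span) {
        self.0.lock().unwrap().insert(span);
    }
}
```

Because a worker only sees spans marked before it calls `should_skip`, a larger pool lets more mutants pass the check before any result is recorded, which is consistent with the note that skipped counts vary with worker count.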

Safety Improvements

  • Path traversal protection (rejects ../ components)
  • catch_unwind prevents single panic from aborting entire run
  • Symlinked directories skipped in copy to prevent traversal attacks
  • 16MB stack size for rayon threads to avoid overflow
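Two of the safety measures above can be sketched with the standard library alone. This is an illustration, not the PR's code: `has_parent_component` and `run_isolated` are hypothetical names, and a plain `std::thread` stands in for the rayon pool the PR uses.

```rust
use std::panic;
use std::path::{Component, Path};
use std::thread;

/// Returns true if the path contains a `..` component.
pub fn has_parent_component(path: &Path) -> bool {
    path.components().any(|c| matches!(c, Component::ParentDir))
}

/// Runs `job` on a thread with a 16 MiB stack and converts a panic into an error,
/// so one failing mutant does not abort the whole run.
pub fn run_isolated<F, T>(job: F) -> Result<T, String>
where
    F: FnOnce() -> T + Send + panic::UnwindSafe + 'static,
    T: Send + 'static,
{
    let handle = thread::Builder::new()
        .stack_size(16 * 1024 * 1024)
        .spawn(move || panic::catch_unwind(job))
        .map_err(|e| e.to_string())?;
    match handle.join() {
        Ok(Ok(value)) => Ok(value),
        _ => Err("mutant job panicked".to_string()),
    }
}
```

For example, `has_parent_component(Path::new("src/../../etc/passwd"))` is true, and `run_isolated(|| 1 + 1)` yields `Ok(2)` while a panicking closure yields an `Err`.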

Performance

~2.5x speedup observed with 4 workers on 150 mutants (Vault.sol test contract):

  • 4 workers: 15.3s
  • 1 worker: 37.9s

Usage

# Run mutation testing with 4 parallel workers
forge test --mutate src/Contract.sol --mutation-jobs 4

# Auto-detect parallelism (defaults to available CPUs)
forge test --mutate src/Contract.sol --mutation-jobs 0
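
One way the `0` value could be resolved to a worker count is shown below. This is a sketch only: `resolve_jobs` is a hypothetical helper, and the PR may obtain the CPU count differently.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Maps the `--mutation-jobs` value to a worker count; 0 means "use available CPUs".
pub fn resolve_jobs(requested: usize) -> usize {
    if requested == 0 {
        // Fall back to a single worker if the platform cannot report parallelism.
        thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1)
    } else {
        requested
    }
}
```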

Changes

  • New file: crates/forge/src/mutation/runner.rs (526 lines) - parallel runner
  • Modified: crates/forge/src/cmd/test/mod.rs - CLI integration
  • Tests: 31 unit tests + 2 CLI integration tests

Closes #478
Closes OSS-1


Built on the excellent foundation from #11996

Regular run: (screenshot)

Run with progress: (screenshot)

Final report: (screenshot)

@zerosnacks zerosnacks marked this pull request as draft April 13, 2026 15:15
Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
Comment thread crates/forge/src/mutation/runner.rs Outdated
Comment on lines +437 to +440
// Construct EVM env using the EthEvmNetwork type (mutation testing always uses Eth)
let (evm_env, tx_env, fork_block) = evm_opts
.env::<SpecFor<EthEvmNetwork>, BlockEnvFor<EthEvmNetwork>, TxEnvFor<EthEvmNetwork>>()
.await?;
Contributor

This is an incorrect assumption, should support all networks

Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
Comment thread crates/config/src/lib.rs Outdated
Comment thread crates/forge/src/workspace.rs Outdated
decofe and others added 6 commits April 13, 2026 15:33
Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
- Fix typo: 'Pathe' -> 'Path' in mutation_dir doc comment
- Add depth limit (max 10) to symlink_nested_libs to prevent infinite recursion
- Support all network types (Eth/OP/Tempo) in mutation testing, not just EthEvmNetwork
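
The depth limit mentioned in the second item can be illustrated with a small recursive walk. This is a sketch on an in-memory tree, not the PR's `symlink_nested_libs`: `LibDir` and `collect_libs` are hypothetical, and whether the real limit stops at exactly ten levels is an assumption.

```rust
/// Maximum nesting depth visited (the commit message states a limit of 10).
const MAX_DEPTH: usize = 10;

/// Hypothetical stand-in for a library directory and its nested libraries.
pub struct LibDir {
    pub name: String,
    pub nested: Vec<LibDir>,
}

/// Collects library names depth-first, refusing to descend past MAX_DEPTH levels.
pub fn collect_libs(dir: &LibDir, depth: usize, out: &mut Vec<String>) {
    if depth >= MAX_DEPTH {
        return;
    }
    out.push(dir.name.clone());
    for child in &dir.nested {
        collect_libs(child, depth + 1, out);
    }
}
```

With a chain of fifteen nested libraries, only the first ten are visited, so a cycle of symlinked directories on disk could not recurse forever under the same guard.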

Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
Co-authored-by: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d873c-bf66-7018-96d5-0d0e7d34e540
@zerosnacks zerosnacks changed the title from "feat(forge): parallel mutation testing with isolated workspaces" to "feat(forge): mutation testing" May 7, 2026
Amp-Thread-ID: https://ampcode.com/threads/T-019e0314-c110-735c-8858-123078e60e37
Co-authored-by: Amp <amp@ampcode.com>

# Conflicts:
#	Cargo.lock
#	crates/forge/src/cmd/test/mod.rs
Comment thread crates/forge/Cargo.toml Outdated
zerosnacks and others added 2 commits May 7, 2026 18:07
Each parameterized case becomes a standalone #[test] (parallel execution,
individual failure reporting, IDE run buttons) without pulling in rstest.

Also fixes a pre-existing clippy::question-mark warning in
assignment_mutator.rs surfaced by make lint.

Amp-Thread-ID: https://ampcode.com/threads/T-019e0314-c110-735c-8858-123078e60e37
Co-authored-by: Amp <amp@ampcode.com>
Collaborator

@figtracer figtracer left a comment

nits

source_files_iter(&config.src, MultiCompilerLanguage::FILE_EXTENSIONS)
.filter(|entry| entry.is_sol() && !entry.is_sol_test() && pattern.is_match(entry))
.collect()
} else if let Some(contract_pattern) = &mutation_config.mutate_contract_pattern {
Collaborator

--mutate-contract only filters files, then mutates every contract in the selected file; this should probably have per contract filtering

Comment thread crates/forge/src/mutation/mod.rs Outdated
self.config
.root
.join(&self.config.mutation_dir)
.join(format!("{hash}_{stem}_{path_hash:x}.{ext}"))
Collaborator

cache filename only includes build id + path hash.

i reproduced stale results after changing [profile.default.mutation].exclude_operators; the command reused cached binary-op mutants even though binary-op was disabled
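
A sketch of how the cache file name could fold in the enabled operators, so that changing `exclude_operators` produces a different file: `cache_file_name` and its parameters are hypothetical, `unary-op` is an invented operator name used only for illustration, and `DefaultHasher` stands in for whatever hash the PR uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Builds a cache file name that depends on the build id, the source path,
/// and the set of enabled mutation operators.
pub fn cache_file_name(
    build_id: &str,
    stem: &str,
    path: &str,
    enabled_operators: &[&str],
    ext: &str,
) -> String {
    let mut path_hasher = DefaultHasher::new();
    path.hash(&mut path_hasher);

    // Sort first so the key does not depend on the order operators are listed in.
    let mut ops: Vec<&str> = enabled_operators.to_vec();
    ops.sort_unstable();
    let mut ops_hasher = DefaultHasher::new();
    ops.hash(&mut ops_hasher);

    format!(
        "{build_id}_{stem}_{:x}_{:x}.{ext}",
        path_hasher.finish(),
        ops_hasher.finish()
    )
}
```

Note that `DefaultHasher` output is not guaranteed to be stable across Rust releases, so a persistent cache would likely want a fixed hash function instead.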

}

// Load survived spans for adaptive mutation testing
handler.retrieve_survived_spans(&build_id);
Collaborator

survived-span cache is loaded even when mutant/results cache misses or fails to deserialize

"Shl" => Ok(BinOpKind::Shl),
"Shr" => Ok(BinOpKind::Shr),
"Sar" => Ok(BinOpKind::Sar),
other => Err(serde::de::Error::custom(format!("Unknown BinOpKind: {other}"))),
Collaborator

missing Pow and Rem
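
A round-trip check is one way to catch a missing arm like this. The sketch below uses `FromStr` instead of the serde deserializer in the PR, lists only five variants, and introduces a hypothetical `name` method and `ALL` constant for the test.

```rust
use std::str::FromStr;

/// Reduced stand-in for the PR's BinOpKind (only five variants shown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BinOpKind {
    Shl,
    Shr,
    Sar,
    Pow,
    Rem,
}

impl BinOpKind {
    /// Every variant, so a test can round-trip all of them.
    pub const ALL: [BinOpKind; 5] = [Self::Shl, Self::Shr, Self::Sar, Self::Pow, Self::Rem];

    /// Serialized name of the variant.
    pub fn name(self) -> &'static str {
        match self {
            Self::Shl => "Shl",
            Self::Shr => "Shr",
            Self::Sar => "Sar",
            Self::Pow => "Pow",
            Self::Rem => "Rem",
        }
    }
}

impl FromStr for BinOpKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Shl" => Ok(Self::Shl),
            "Shr" => Ok(Self::Shr),
            "Sar" => Ok(Self::Sar),
            "Pow" => Ok(Self::Pow),
            "Rem" => Ok(Self::Rem),
            other => Err(format!("Unknown BinOpKind: {other}")),
        }
    }
}
```

Looping over `BinOpKind::ALL` and asserting that `k.name().parse()` returns `Ok(k)` fails as soon as a variant is left out of the parser.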

Self::BinaryOp(kind) => write!(f, "{}", kind.to_str()),
Self::BinaryOpExpr { mutated_expr, .. } => write!(f, "{mutated_expr}"),
Self::DeleteExpression => write!(f, "assert(true)"),
Self::ElimDelegate => write!(f, "call"),
Collaborator

ElimDelegate displays as bare call, but the mutator uses the full target.delegatecall(data) span


// Build the rest of the call (message argument if present)
let rest_args = if args_exprs.len() > 1 {
let first_comma = original.find(',').unwrap_or(original.len());
Collaborator

this might break when we have commas on the condition
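
The failure mode can be reproduced on a plain string. In this sketch both helpers are hypothetical; the second tracks bracket depth as a simple stand-in, whereas the follow-up commit in this PR reports using AST argument spans instead.

```rust
/// Naive split at the first comma, as in the snippet under review.
pub fn split_first_comma(args: &str) -> (&str, &str) {
    let idx = args.find(',').unwrap_or(args.len());
    (&args[..idx], &args[idx..])
}

/// Splits at the first comma that is not nested inside parentheses or brackets.
pub fn split_top_level_comma(args: &str) -> (&str, &str) {
    let mut depth = 0usize;
    for (i, ch) in args.char_indices() {
        match ch {
            '(' | '[' => depth += 1,
            ')' | ']' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => return (&args[..i], &args[i..]),
            _ => {}
        }
    }
    (args, "")
}
```

For the argument text `foo(a, b), "msg"`, the naive version cuts the condition to `foo(a`, while the depth-aware version returns `foo(a, b)`. The depth-aware version is still fooled by commas or brackets inside string literals, which is one reason AST spans are the more robust choice.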

ExprKind::Member(expr, ident) => {
match expr.kind {
ExprKind::Ident(inner) => {
format!("{}{}", ident.as_str(), inner.to_string())
Collaborator

boxValue.value++ generates ++valueboxValue because of this branch here

might be related to the @todo below
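
The swapped output follows directly from the argument order in the `format!` call. A minimal reproduction, with hypothetical helper names and plain strings in place of AST nodes:

```rust
/// Mirrors the branch under review: member first, then base, with no separator.
pub fn prefix_member_buggy(op: &str, base: &str, member: &str) -> String {
    format!("{op}{member}{base}")
}

/// Emits the operand in source order with the `.` restored.
pub fn prefix_member_fixed(op: &str, base: &str, member: &str) -> String {
    format!("{op}{base}.{member}")
}
```

With `op = "++"`, `base = "boxValue"`, and `member = "value"`, the first helper produces `++valueboxValue` and the second produces `++boxValue.value`.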

Collaborator

@mablr mablr left a comment

It might be worth adding a mutation score threshold flag that returns a failure above a certain level; it would be useful for CI.
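
One reading of this suggestion is a gate that fails when the score falls below a configured minimum. The sketch below is hypothetical throughout: no such flag exists in this PR, `Tally` and both functions are invented, and the score formula (killed over killed plus survived, ignoring invalid and timed-out mutants) is an assumption.

```rust
/// Hypothetical result counts for one mutation run.
#[derive(Debug, Default, Clone, Copy)]
pub struct Tally {
    pub killed: u32,
    pub survived: u32,
    pub invalid: u32,
    pub timed_out: u32,
}

/// Percentage of scored mutants that were killed; None if nothing was scored.
pub fn mutation_score(t: &Tally) -> Option<f64> {
    let scored = t.killed + t.survived;
    if scored == 0 {
        None
    } else {
        Some(f64::from(t.killed) / f64::from(scored) * 100.0)
    }
}

/// True if the run should pass a CI gate requiring at least `min_score` percent.
pub fn meets_threshold(t: &Tally, min_score: f64) -> bool {
    mutation_score(t).map_or(true, |score| score >= min_score)
}
```

A CLI could map a `false` result to a non-zero exit code; treating an empty run as passing is a design choice that a real implementation might reverse.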

Comment thread crates/forge/src/cmd/test/mod.rs Outdated
pub mutate: Option<Vec<PathBuf>>,

/// Specify which files to mutate with glob pattern matching.
#[arg(long, value_name = "PATTERN", requires = "mutate")]
Collaborator

Suggested change
- #[arg(long, value_name = "PATTERN", requires = "mutate")]
+ #[arg(long, value_name = "PATTERN", conflicts_with = "mutate")]

--mutate and --mutate-path seem to silently conflict.

zerosnacks and others added 9 commits May 19, 2026 13:42
- mutant.rs: add missing Pow/Rem variants to BinOpKind deserializer
- elim_delegate_mutator: narrow mutation span to the 'delegatecall'
  identifier so replacement does not clobber the surrounding call
- unary_op_mutator: fix Member operand swap (e.g. 'a.b++' was producing
  '++ba')
- require_mutator: use AST argument spans instead of string-splitting
  on the first comma, so 'require(foo(a, b))' is handled correctly
- mutation/mod.rs: include hash of enabled operators in the cache file
  name so changes to include/exclude_operators invalidate the cache
- orchestrator: only load survived-spans cache after mutants are
  successfully obtained
- visitor: add per-contract name filter; visit_item_contract toggles
  an 'in_allowed_contract' flag that gates expr/var/yul mutation
  collection, so --mutate-contract filters per contract rather than
  per file
- cmd/test: add clap conflicts_with between --mutate-path and
  --mutate-contract; add runtime error when --mutate has explicit
  paths combined with --mutate-path
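
The visitor item above (an `in_allowed_contract` flag toggled on contract entry) can be sketched on a toy AST. Everything here is a simplified assumption: `Node` and `Collector` are hypothetical, and an exact name match stands in for the regex that `--mutate-contract` takes.

```rust
/// Toy AST: a contract with a body, or an expression that could be mutated.
pub enum Node {
    Contract { name: String, body: Vec<Node> },
    Expr(String),
}

/// Collects mutation candidates, gated by the contract currently being visited.
pub struct Collector {
    allowed_name: Option<String>,
    in_allowed_contract: bool,
    pub collected: Vec<String>,
}

impl Collector {
    /// With no filter, everything is allowed from the start.
    pub fn new(allowed_name: Option<&str>) -> Self {
        Self {
            allowed_name: allowed_name.map(str::to_owned),
            in_allowed_contract: allowed_name.is_none(),
            collected: Vec::new(),
        }
    }

    pub fn visit(&mut self, node: &Node) {
        match node {
            Node::Contract { name, body } => {
                // Set the flag for this contract, then restore it on the way out.
                let previous = self.in_allowed_contract;
                self.in_allowed_contract = self
                    .allowed_name
                    .as_deref()
                    .map_or(true, |allowed| allowed == name.as_str());
                for child in body {
                    self.visit(child);
                }
                self.in_allowed_contract = previous;
            }
            Node::Expr(source) => {
                if self.in_allowed_contract {
                    self.collected.push(source.clone());
                }
            }
        }
    }
}
```

Visiting a file that holds both `Vault` and `Helper` with the filter set to `Vault` collects only the expressions inside `Vault`, which is the per-contract behavior the review asked for.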

Amp-Thread-ID: https://ampcode.com/threads/T-019e4020-a7ef-70a8-84e0-a44a82f56780
Co-authored-by: Amp <amp@ampcode.com>
Mirrors invariant.timeout for mutation campaigns:
- MutationConfig.timeout / --mutation-timeout SECS CLI flag
- MutationResult::TimedOut variant (separate from Invalid/Alive)
- Per-mutant enforcement via std::thread + mpsc::recv_timeout in
  runner::run_compile_and_test_with_timeout; worker slot freed
  immediately on timeout (background work unwinds via gas limit)
- Adaptive survived-span cache only marked for genuine Alive results
- Cache key folds in timeout so changed budget invalidates results
- Reporter row + legend + JSON 'timed_out' field
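
The `std::thread` plus `mpsc::recv_timeout` pattern named above looks roughly like this. It is a sketch, not `run_compile_and_test_with_timeout` itself: `Outcome` and `run_with_timeout` are hypothetical, and the worker thread is simply detached.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Result of running one job under a deadline.
#[derive(Debug, PartialEq)]
pub enum Outcome<T> {
    Finished(T),
    TimedOut,
    Crashed,
}

/// Runs `job` on its own thread and waits at most `timeout` for its result.
pub fn run_with_timeout<T, F>(job: F, timeout: Duration) -> Outcome<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    // The thread is not joined; if it finishes after the deadline, its send fails
    // because the receiver is gone, and the error is ignored.
    thread::spawn(move || {
        let _ = tx.send(job());
    });
    match rx.recv_timeout(timeout) {
        Ok(value) => Outcome::Finished(value),
        Err(mpsc::RecvTimeoutError::Timeout) => Outcome::TimedOut,
        // The sender was dropped without sending, for example because the job panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => Outcome::Crashed,
    }
}
```

The caller gets its slot back at the deadline, but the detached thread keeps running until the job returns, so the job itself still needs a way to stop.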

Progress UX (needs further iteration on formatting):
- Live tally on overall bar (k:/s:/i:/t:/sk:) + elapsed_precise
- Keyed active-mutant tracking so parallel completions remove the
  correct row instead of FIFO
- Per-result completion lines printed above bars via multi.suspend

Tested end-to-end: 70-mutant project with --mutation-timeout 2 sees
4 mutants TimedOut at exactly 2.0s, score correctly excludes them.

Unit tests + CLI mutation snapshot tests pass.

Amp-Thread-ID: https://ampcode.com/threads/T-019e4020-a7ef-70a8-84e0-a44a82f56780
Co-authored-by: Amp <amp@ampcode.com>
…clean up progress UX

- runner: propagate config.mutation.timeout into temp_config.fuzz.timeout and
  temp_config.invariant.timeout so the inner FuzzTestTimer bails out at the
  deadline. Previously the outer recv_timeout returned TimedOut but the leaked
  worker thread kept running expensive fuzz/invariant runs and starved the
  pool — a 200k-runs fuzz test would not respect --mutation-timeout at all.
  Never raises a user-configured fuzz/invariant timeout, only lowers it.

- progress: drop the fake static '[0.0s]' prefix from in-flight spinner
  messages. Replace the emoji-prefixed completion lines (✗/⚠/⏱) with a plain
  color-coded label (KILLED green, SURVIVED red, TIMED OUT yellow) padded to
  9 chars so columns align after ANSI escapes.

- reporter: strip prescriptive prose. Drops the 'These mutations were NOT
  caught…' preamble, the entire Security Implications section, and the
  Suggestions to improve test coverage block. Shortens the legend to factual
  definitions. Removes the emoji (⚠/✓/ℹ/⏱) and parenthetical opinions from
  the survived/killed/invalid/timed-out section headers.

- Disambiguate MutationProgress::clear via UFCS to silence yansi Paint::clear
  trait shadow warning.

- Snapshot tests regenerated for the new reporter shape.
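
The runner item in this commit message says the mutation timeout only ever lowers a user-configured fuzz or invariant timeout. That rule amounts to taking the minimum when both values are set; a sketch with a hypothetical `capped_timeout` helper, where `None` means no timeout is configured:

```rust
/// Combines a user-configured timeout with the mutation timeout (both in seconds).
/// The result is never larger than the user's own setting.
pub fn capped_timeout(user: Option<u32>, mutation: Option<u32>) -> Option<u32> {
    match (user, mutation) {
        (Some(u), Some(m)) => Some(u.min(m)),
        (None, Some(m)) => Some(m),
        (u, None) => u,
    }
}
```

For instance, a user timeout of 30 with a mutation timeout of 2 gives 2, while a user timeout of 1 with a mutation timeout of 2 stays at 1.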

Amp-Thread-ID: https://ampcode.com/threads/T-019e4061-d738-70bd-8f43-3fc319258ed2
Co-authored-by: Amp <amp@ampcode.com>
- runner: gate OpEvmNetwork import + dispatch behind #[cfg(feature = "optimism")]
  so cargo check --no-default-features builds. Mirrors the pattern already used
  in cmd/test/mod.rs.

- tests/cli/config.rs: update test_default_config snapshot for the new
  mutation.timeout field.

- Re-run cargo +nightly fmt across the mutation module touched by previous
  commits.

Amp-Thread-ID: https://ampcode.com/threads/T-019e4061-d738-70bd-8f43-3fc319258ed2
Co-authored-by: Amp <amp@ampcode.com>
Blockers:
- runner: move TempDir ownership into worker thread so timed-out
  workers can no longer race against a deleted workspace; park
  JoinHandles in SharedMutationState.pending_workers and join them
  at the end of the parallel run to actually reclaim cleanup
- workspace: add ensure_within_root containment check (canonicalize
  + starts_with) for src/test/lib/node_modules/dependencies so a
  symlinked root cannot escape the project; validate nested lib
  dirs from untrusted dependency foundry.toml (reject .., absolute
  paths, symlinked nested roots); use entry.file_type() to skip
  symlinked entries instead of following them
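
The containment check described in the workspace item (canonicalize, then `starts_with`) can be sketched as follows. The function name is taken from the commit message, but the signature, error type, and message are assumptions.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolves symlinks and `..` in both paths, then requires `candidate` to sit under `root`.
pub fn ensure_within_root(root: &Path, candidate: &Path) -> io::Result<PathBuf> {
    let root = root.canonicalize()?;
    let resolved = candidate.canonicalize()?;
    // Path::starts_with compares whole components, so `/a/bc` is not inside `/a/b`.
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} escapes {}", resolved.display(), root.display()),
        ))
    }
}
```

Because `canonicalize` requires the path to exist, this check only applies to directories that are already on disk; a path that does not exist yields an I/O error rather than a containment verdict.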

Highs:
- orchestrator: do not persist cached results on a cancelled or
  short run; would otherwise be reloaded as the authoritative
  answer for the file
- mutation/mod: include --mutate-contract regex in cache key so
  cached results from a different filter aren't silently reused
- cmd/test: bail when mutation is requested with inline per-test
  network overrides (single-pass runner can't honor them) or with
  ffi/write fs_permissions (shared symlinked dep trees aren't safe
  to mutate from tests)

Adds 4 workspace tests for the new symlink-escape protections.

Amp-Thread-ID: https://ampcode.com/threads/T-019e4567-e7ca-717e-bcc0-bb67a3667c4d
Co-authored-by: Amp <amp@ampcode.com>
* fix(mutation): key cached results by execution inputs

* fix(mutation): include execution inputs in cache key

* fix: clean-up

Development

Successfully merging this pull request may close these issues.

tracking: add mutation testing support

8 participants