Feature: Dry-run / simulation mode #34

@deepjoy

Summary

Add a simulation mode where tasks can be submitted, scheduled, and planned without actually executing, returning the projected execution plan for preview.

Motivation

Many CLI tools support a --dry-run flag that shows what would happen without doing it. When the task scheduler owns the execution plan (ordering, grouping, dependencies, concurrency), the consumer has to maintain a separate code path to simulate the plan without TaskMill — effectively duplicating the scheduling logic.

In an S3 sync engine, mantle sync --dry-run needs to show:

  • Which files would be transferred, deleted, or conflict-resolved
  • What order they'd execute in (respecting depends_on chains)
  • How they'd be grouped across endpoints
  • Estimated transfer time based on IO budgets and concurrency limits
  • Which tasks would be deduplicated or superseded

Today this requires building the SyncPlan, converting it to TaskSubmission values, and then formatting them manually instead of submitting. All the scheduling intelligence (dependency resolution, group balancing, priority ordering) is lost. If TaskMill had a dry-run mode, we'd submit normally and get back a rich execution preview.

Proposed Behavior

Submission

let plan = scheduler.dry_run(
    tasks.into_iter().map(|t| {
        TaskSubmission::new("file-transfer")
            .group(&t.endpoint)
            .depends_on(t.deps)
            .priority(t.priority)
            .expected_net_io(0, t.size as i64)
            .payload_json(&t.transfer)
    }).collect::<Result<Vec<_>, _>>()?
).await?;

Returned Plan

struct DryRunPlan {
    /// Tasks in projected execution order
    phases: Vec<DryRunPhase>,
    /// Summary statistics
    summary: DryRunSummary,
}

struct DryRunPhase {
    /// Tasks that would execute concurrently in this phase
    tasks: Vec<DryRunTask>,
    /// Why this phase boundary exists
    reason: PhaseReason,  // ConcurrencyLimit, DependencyWait, GroupLimit
}

struct DryRunTask {
    task_type: String,
    dedup_key: Option<String>,
    group: Option<String>,
    priority: Priority,
    tags: HashMap<String, String>,
    expected_net_bytes: i64,
    depends_on: Vec<TaskId>,
    /// What would happen to this task
    outcome: DryRunOutcome,
}

enum DryRunOutcome {
    WouldExecute,
    WouldSupersede { existing_task_id: TaskId },
    WouldDeduplicate { existing_task_id: TaskId },
    WouldExpire { reason: String },  // TTL would expire before execution
    WouldBlock { blocked_by: Vec<TaskId> },
}

struct DryRunSummary {
    total_tasks: usize,
    would_execute: usize,
    would_deduplicate: usize,
    would_supersede: usize,
    total_net_bytes: i64,
    estimated_phases: usize,
    estimated_duration: Option<Duration>,  // based on IO budgets + concurrency
    groups: HashMap<String, usize>,        // tasks per group
}
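To make the relationship between the per-task outcomes and the summary concrete, here is a minimal sketch of how the outcome counts could be folded from the tasks. The types are simplified stand-ins for the structs proposed above (fields trimmed, task IDs as plain integers), not actual TaskMill API:

```rust
use std::collections::HashMap;

// Simplified stand-ins for the proposed types (hypothetical, trimmed fields).
enum DryRunOutcome {
    WouldExecute,
    WouldSupersede { existing_task_id: u64 },
    WouldDeduplicate { existing_task_id: u64 },
}

struct DryRunTask {
    group: Option<String>,
    expected_net_bytes: i64,
    outcome: DryRunOutcome,
}

#[derive(Default)]
struct DryRunSummary {
    total_tasks: usize,
    would_execute: usize,
    would_deduplicate: usize,
    would_supersede: usize,
    total_net_bytes: i64,
    groups: HashMap<String, usize>, // tasks per group
}

/// Fold per-task outcomes into the aggregate counts.
fn summarize(tasks: &[DryRunTask]) -> DryRunSummary {
    let mut s = DryRunSummary { total_tasks: tasks.len(), ..Default::default() };
    for t in tasks {
        s.total_net_bytes += t.expected_net_bytes;
        if let Some(g) = &t.group {
            *s.groups.entry(g.clone()).or_insert(0) += 1;
        }
        match t.outcome {
            DryRunOutcome::WouldExecute => s.would_execute += 1,
            DryRunOutcome::WouldSupersede { .. } => s.would_supersede += 1,
            DryRunOutcome::WouldDeduplicate { .. } => s.would_deduplicate += 1,
        }
    }
    s
}

fn main() {
    let tasks = vec![
        DryRunTask {
            group: Some("b2-us-west".into()),
            expected_net_bytes: 1024,
            outcome: DryRunOutcome::WouldExecute,
        },
        DryRunTask {
            group: Some("b2-us-west".into()),
            expected_net_bytes: 2048,
            outcome: DryRunOutcome::WouldDeduplicate { existing_task_id: 7 },
        },
    ];
    let s = summarize(&tasks);
    println!("{} total, {} execute, {} dedup, {} bytes",
        s.total_tasks, s.would_execute, s.would_deduplicate, s.total_net_bytes);
}
```

One design point this makes visible: the summary carries no information the phases don't already contain, so it can be computed lazily on the consumer side or precomputed by the scheduler, whichever is cheaper.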

CLI Integration

$ mantle sync disaster-recovery --dry-run

Dry Run — disaster-recovery (left → right)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Upload:  1,234 files (45.2 GB)
  Delete:     12 files (orphans)
  Skip:      892 files (unchanged)
  Supersede:   3 files (in-flight transfers replaced)

  Endpoints:  s3://us-east-1 → s3://b2-us-west
  Concurrency: 4 transfers, 6/group limit
  Est. phases: 312 (at 4 concurrent)
  Est. time:   ~18 min (at 50 MB/s limit)

  Dependency chains:
    upload data/report.csv → delete data/report-old.csv

  Would expire (TTL):
    (none)

  Use --force to execute.
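Rendering figures like "45.2 GB" and "~18 min" from the raw summary fields is straightforward; a sketch of hypothetical helpers (not TaskMill API), assuming decimal units as in the mock output above:

```rust
/// Format a byte count with decimal units (1 GB = 10^9 bytes), hypothetical helper.
fn human_bytes(b: i64) -> String {
    const UNITS: [&str; 5] = ["B", "KB", "MB", "GB", "TB"];
    let mut v = b as f64;
    let mut i = 0;
    while v >= 1000.0 && i < UNITS.len() - 1 {
        v /= 1000.0;
        i += 1;
    }
    if i == 0 {
        format!("{} {}", b, UNITS[0])
    } else {
        format!("{:.1} {}", v, UNITS[i])
    }
}

/// Format a duration estimate, rounding to the nearest minute past 60 s.
fn human_duration(secs: u64) -> String {
    if secs < 60 {
        format!("~{} s", secs)
    } else {
        format!("~{} min", (secs + 30) / 60)
    }
}

fn main() {
    println!("{}", human_bytes(45_200_000_000)); // 45.2 GB
    println!("{}", human_duration(1080));        // ~18 min
}
```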

Design Considerations

  • Dry-run should evaluate existing scheduler state — if tasks are already queued, the dry-run should show dedup/supersede outcomes against them
  • The execution order projection doesn't need to be perfectly accurate (scheduling is dynamic), but should reflect priority, dependency, and group constraint ordering
  • Duration estimates should factor in bandwidth_limit, max_concurrent_tasks, group concurrency limits, and expected_net_io per task
  • Dry-run should be fast — no SQLite writes, no executor instantiation, no network calls
  • Consider offering a single-task preview variant alongside the batch dry_run shown above
  • The plan should be serializable to JSON for machine consumption (--log-format json)
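The duration and phase estimates above could start from very simple bounds. A rough sketch under the simplifying assumption that transfers are purely bandwidth-bound and tasks are limited only by global concurrency (function names hypothetical; a real estimate would also model group limits and dependency stalls):

```rust
use std::time::Duration;

/// Lower-bound duration: total projected bytes over the configured bandwidth
/// limit. Returns None when no limit is configured, since nothing bounds the
/// estimate. Hypothetical helper, not TaskMill API.
fn estimate_duration(total_net_bytes: i64, bandwidth_limit_bps: u64) -> Option<Duration> {
    if bandwidth_limit_bps == 0 {
        return None;
    }
    Some(Duration::from_secs_f64(
        total_net_bytes as f64 / bandwidth_limit_bps as f64,
    ))
}

/// Phase count if tasks were bounded only by the global concurrency limit
/// (ceiling division).
fn estimate_phases(executable_tasks: usize, max_concurrent: usize) -> usize {
    (executable_tasks + max_concurrent - 1) / max_concurrent
}

fn main() {
    // 45.2 GB at a 50 MB/s limit is a 904 s floor; queueing and dependency
    // stalls push the real figure higher.
    let d = estimate_duration(45_200_000_000, 50_000_000).unwrap();
    println!("{} s floor", d.as_secs());
    // 1,246 executable tasks at 4 concurrent -> 312 phases, matching the mock
    // output above.
    println!("{} phases", estimate_phases(1246, 4));
}
```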
