## Summary
Add a simulation mode where tasks can be submitted, scheduled, and planned without actually executing, returning the projected execution plan for preview.
## Motivation
Many CLI tools support a `--dry-run` flag that shows what would happen without doing it. When the task scheduler owns the execution plan (ordering, grouping, dependencies, concurrency), the consumer has to maintain a separate code path to simulate the plan without TaskMill — effectively duplicating the scheduling logic.
In an S3 sync engine, `mantle sync --dry-run` needs to show:
- Which files would be transferred, deleted, or conflict-resolved
- What order they'd execute in (respecting `depends_on` chains)
- How they'd be grouped across endpoints
- Estimated transfer time based on IO budgets and concurrency limits
- Which tasks would be deduplicated or superseded
Today this requires building the `SyncPlan`, converting to `TaskSubmission` values, and then not submitting them — instead formatting them manually. All the scheduling intelligence (dependency resolution, group balancing, priority ordering) is lost. If TaskMill had a dry-run mode, we'd submit normally and get back a rich execution preview.
## Proposed Behavior
### Submission
```rust
let plan = scheduler.dry_run(
    tasks.into_iter()
        .map(|t| {
            TaskSubmission::new("file-transfer")
                .group(&endpoint)
                .depends_on(deps)
                .priority(priority)
                .expected_net_io(0, size as i64)
                .payload_json(&transfer)
        })
        .collect::<Result<Vec<_>, _>>()?
).await?;
```
### Returned Plan
```rust
struct DryRunPlan {
    /// Tasks in projected execution order
    phases: Vec<DryRunPhase>,
    /// Summary statistics
    summary: DryRunSummary,
}

struct DryRunPhase {
    /// Tasks that would execute concurrently in this phase
    tasks: Vec<DryRunTask>,
    /// Why this phase boundary exists
    reason: PhaseReason, // ConcurrencyLimit, DependencyWait, GroupLimit
}

struct DryRunTask {
    task_type: String,
    dedup_key: Option<String>,
    group: Option<String>,
    priority: Priority,
    tags: HashMap<String, String>,
    expected_net_bytes: i64,
    depends_on: Vec<TaskId>,
    /// What would happen to this task
    outcome: DryRunOutcome,
}

enum DryRunOutcome {
    WouldExecute,
    WouldSupersede { existing_task_id: TaskId },
    WouldDeduplicate { existing_task_id: TaskId },
    WouldExpire { reason: String }, // TTL would expire before execution
    WouldBlock { blocked_by: Vec<TaskId> },
}

struct DryRunSummary {
    total_tasks: usize,
    would_execute: usize,
    would_deduplicate: usize,
    would_supersede: usize,
    total_net_bytes: i64,
    estimated_phases: usize,
    estimated_duration: Option<Duration>, // based on IO budgets + concurrency
    groups: HashMap<String, usize>,       // tasks per group
}
```
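To show how a consumer would walk the returned plan, here is a minimal sketch that folds phases into the counts reported in `DryRunSummary`. The types below are simplified stand-ins for the proposed structures (`TaskId` narrowed to `u64`, most fields dropped), not TaskMill's actual API:

```rust
use std::collections::HashMap;

// Simplified stand-ins for the proposed plan types (illustrative only).
enum DryRunOutcome {
    WouldExecute,
    WouldSupersede { existing_task_id: u64 },
    WouldDeduplicate { existing_task_id: u64 },
}

struct DryRunTask {
    group: Option<String>,
    expected_net_bytes: i64,
    outcome: DryRunOutcome,
}

struct DryRunPhase {
    tasks: Vec<DryRunTask>,
}

/// Fold phases into (would_execute, would_deduplicate, would_supersede,
/// total_net_bytes, tasks-per-group) — the core of `DryRunSummary`.
fn summarize(phases: &[DryRunPhase]) -> (usize, usize, usize, i64, HashMap<String, usize>) {
    let (mut execute, mut dedup, mut supersede) = (0usize, 0usize, 0usize);
    let mut bytes = 0i64;
    let mut groups: HashMap<String, usize> = HashMap::new();
    for phase in phases {
        for task in &phase.tasks {
            match task.outcome {
                // Only tasks that would actually run contribute net IO.
                DryRunOutcome::WouldExecute => {
                    execute += 1;
                    bytes += task.expected_net_bytes;
                }
                DryRunOutcome::WouldDeduplicate { .. } => dedup += 1,
                DryRunOutcome::WouldSupersede { .. } => supersede += 1,
            }
            if let Some(g) = &task.group {
                *groups.entry(g.clone()).or_insert(0) += 1;
            }
        }
    }
    (execute, dedup, supersede, bytes, groups)
}
```

Deduplicated and superseded tasks are counted but excluded from the byte total, matching the intent that `total_net_bytes` reflects work that would actually happen.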
### CLI Integration
```
$ mantle sync disaster-recovery --dry-run
Dry Run — disaster-recovery (left → right)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Upload:      1,234 files (45.2 GB)
  Delete:         12 files (orphans)
  Skip:          892 files (unchanged)
  Supersede:       3 files (in-flight transfers replaced)

  Endpoints:   s3://us-east-1 → s3://b2-us-west
  Concurrency: 4 transfers, 6/group limit
  Est. phases: 312 (at 4 concurrent)
  Est. time:   ~18 min (at 50 MB/s limit)

Dependency chains:
  upload data/report.csv → delete data/report-old.csv

Would expire (TTL):
  (none)

Use --force to execute.
```
## Design Considerations
- Dry-run should evaluate existing scheduler state — if tasks are already queued, the dry-run should show dedup/supersede outcomes against them
- The execution order projection doesn't need to be perfectly accurate (scheduling is dynamic), but should reflect priority, dependency, and group constraint ordering
- Duration estimates should factor in `bandwidth_limit`, `max_concurrent_tasks`, group concurrency limits, and `expected_net_io` per task
- Dry-run should be fast — no SQLite writes, no executor instantiation, no network calls
- Consider supporting `dry_run_batch` alongside `dry_run` for single-task preview
- The plan should be serializable to JSON for machine consumption (`--log-format json`)
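As a rough illustration of how a duration estimate could combine these inputs, here is a sketch that takes the larger of two lower bounds: the time to push the total bytes through the bandwidth limit, and the time implied by the phase count at an assumed per-phase overhead. The function name and parameters are hypothetical, not TaskMill's actual estimator:

```rust
use std::time::Duration;

/// Rough lower-bound duration estimate (illustrative only):
/// max(total bytes / bandwidth limit, phases * per-phase overhead).
/// A real estimator would also account for group concurrency limits.
fn estimate_duration(
    total_net_bytes: i64,
    bandwidth_limit_bytes_per_sec: u64,
    estimated_phases: usize,
    per_phase_overhead: Duration,
) -> Duration {
    // Serial bandwidth bound: the transfer can't finish faster than
    // the byte total divided by the global bandwidth cap.
    let transfer = Duration::from_secs_f64(
        total_net_bytes.max(0) as f64 / bandwidth_limit_bytes_per_sec as f64,
    );
    // Phase-count bound: each phase carries some scheduling overhead.
    let phase_floor = per_phase_overhead * estimated_phases as u32;
    transfer.max(phase_floor)
}
```

For the numbers in the CLI example above (45.2 GB at a 50 MB/s limit), the bandwidth bound alone is roughly 15 minutes; phase overhead and concurrency constraints account for the rest of the ~18 min estimate.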