
feat(ckpt): add cron-based scheduled checkpoint snapshots #819

Open
Ziqi002 wants to merge 1 commit into alibaba:main from Ziqi002:feat/ckpt/cron-ckpt


@Ziqi002 Ziqi002 commented Jun 10, 2026

Collaborator

Description

  • Add cron-expression-based scheduled snapshots to the hermes and openclaw plugins
  • Add a cronSchedules config key that supports add/remove/set sub-actions and validates 5-field cron expressions
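
As an illustration of what 5-field validation could involve, here is a minimal Python sketch. It is not the code in cron.py: the function name `is_valid_cron`, the accepted syntax (numbers, `*`, ranges, lists, and steps after `*` or a range), and the rejection of names such as `mon` or macros such as `@daily` are all assumptions made for this example.

```python
import re

# Allowed value ranges for the five standard fields, in order:
# minute, hour, day of month, month, day of week (0 and 7 both mean Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]

# One list item is either "*" or "a-b" with an optional "/step", or a single number.
_ITEM = re.compile(r"(?:(\*|\d+-\d+)(?:/(\d+))?|(\d+))")


def is_valid_cron(expr: str) -> bool:
    """Return True if expr looks like a valid 5-field cron expression (hypothetical helper)."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    for field, (lo, hi) in zip(fields, FIELD_RANGES):
        for item in field.split(","):
            m = _ITEM.fullmatch(item)
            if m is None:
                return False
            span, step, single = m.groups()
            if step is not None and int(step) == 0:
                return False  # a step of zero would never advance
            if single is not None:
                bounds = [int(single)]
            elif span == "*":
                bounds = []
            else:
                bounds = [int(n) for n in span.split("-")]
            if any(n < lo or n > hi for n in bounds):
                return False
            if len(bounds) == 2 and bounds[0] > bounds[1]:
                return False  # reversed range such as "5-1"
    return True
```

With this sketch, `is_valid_cron("*/15 * * * *")` is accepted, while `"60 * * * *"` (minute out of range) and `"* * * *"` (only four fields) are rejected.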

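hermes plugin

  • Add a cron.py module that implements CrontabManager to manage scheduled snapshots; register the cron-related tools in tools.py; extend config.py to support the cronSchedules config key.
  • Move the singleton and workspace helper functions from __init__.py to checkpoint_manager.py to resolve a circular import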

openclaw plugin

  • Add a cron.ts module that implements a TypeScript version of CrontabManager; add cron-related commands and event handling to commands.ts and handlers.ts; register the new tool in openclaw.plugin.json.

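Why system crontab instead of an in-process timer or scheduling inside the daemon

An in-process timer is not viable. hermes is only active during tool calls, openclaw loses its state whenever the Gateway restarts, and any timer dies when its process exits.

Scheduling inside the daemon is viable but costs too much. The daemon is long-running, but its job is to manage the btrfs subvolume lifecycle. Adding scheduling would mean a new configuration channel (the daemon would have to learn about cron expressions and schedule additions, removals, and changes) and a new failure surface (restoring timers after a restart, precision that depends on the event loop). For a requirement that amounts to "invoke the CLI at a fixed interval", that complexity is unnecessary.

System crontab:

  • Its lifecycle is independent of every ws-ckpt process, so plugin or daemon restarts do not affect it
  • Users can audit it with crontab -l, with no process needing to be running
  • No trigger times need to be persisted; cron is naturally idempotent
  • It calls the ws-ckpt checkpoint CLI directly, reusing the existing error handling and concurrency control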

Design decisions

Per-workspace schedule list
The config is stored as {workspace_path: [cron_expr]}. When the workspace changes, CrontabManager.migrate() automatically removes the old entries and installs the new ones.
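
The clean-up-then-install step can be pictured as a pure function over the crontab text. The Python sketch below is an assumption-heavy illustration, not the body of CrontabManager.migrate(): the `MARKER` tag, the function name `migrate_entries`, and the idea of tagging each line with its workspace path are hypothetical, and the command is passed in by the caller because the PR description does not show the exact ws-ckpt checkpoint arguments.

```python
# Hypothetical tag appended to every managed line so it can be found again later.
# cron hands the whole command to the shell, which treats the trailing "# ..." as a comment.
MARKER = "# ws-ckpt-cron"


def migrate_entries(crontab_text, old_workspace, new_workspace, schedules, command):
    """Return crontab text with the old workspace's entries removed and the new ones installed.

    schedules has the shape {workspace_path: [cron_expr]} described above.
    command is whatever the caller wants cron to run (assumed, not taken from the PR).
    """
    old_tag = f"{MARKER} {old_workspace}"
    new_tag = f"{MARKER} {new_workspace}"
    # Keep every line that is not managed for either workspace; dropping the
    # new workspace's lines as well makes a repeated call produce no duplicates.
    kept = [
        line
        for line in crontab_text.splitlines()
        if not line.endswith(old_tag) and not line.endswith(new_tag)
    ]
    for expr in schedules.get(new_workspace, []):
        kept.append(f"{expr} {command} {new_tag}")
    return "\n".join(kept) + "\n" if kept else ""
```

Unrelated crontab lines pass through untouched, and calling the function a second time with the same arguments returns the same text.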

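Incremental updates via sub-actions
Schedules are modified through the add / remove / set sub-actions, so the LLM does not have to read the full list and write it back in its entirety.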
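
A minimal Python sketch of how the three sub-actions could behave on the {workspace_path: [cron_expr]} mapping. The function name `apply_sub_action`, the de-duplication on add and set, and the removal of a workspace key once its list is empty are assumptions for illustration, not behavior confirmed by this PR.

```python
def apply_sub_action(schedules, workspace, action, exprs):
    """Return a new schedules mapping after applying one sub-action (hypothetical helper).

    schedules: {workspace_path: [cron_expr]}
    action:    "add", "remove", or "set"
    exprs:     list of cron expressions the action operates on
    """
    current = list(schedules.get(workspace, []))
    if action == "add":
        # Append only expressions that are not already scheduled.
        for expr in exprs:
            if expr not in current:
                current.append(expr)
    elif action == "remove":
        current = [expr for expr in current if expr not in exprs]
    elif action == "set":
        # Replace the whole list, dropping duplicates while keeping order.
        current = list(dict.fromkeys(exprs))
    else:
        raise ValueError(f"unknown sub-action: {action!r}")

    updated = dict(schedules)  # leave the caller's mapping untouched
    if current:
        updated[workspace] = current
    else:
        updated.pop(workspace, None)
    return updated
```

The caller only supplies the delta: `apply_sub_action(cfg, "/ws", "add", ["0 3 * * *"])` appends one schedule without the caller ever seeing the rest of the list.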

crontab concurrency protection
hermes uses fcntl.flock and openclaw uses a mkdirSync directory lock; both share /tmp/ws-ckpt-cron.lock to prevent TOCTOU races.
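
On the hermes side, guarding the read-modify-write of the crontab with fcntl.flock could look roughly like the Python sketch below. The context-manager name `crontab_lock` and the file mode are assumptions; only the use of fcntl.flock and the lock path come from the description above. fcntl is available on Unix only.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def crontab_lock(path="/tmp/ws-ckpt-cron.lock"):
    """Hold an exclusive advisory lock on path for the duration of the with-block."""
    # Create the lock file if it does not exist yet; its content is never used.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until any other holder releases the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

Reading the current crontab, editing it, and writing it back would all happen inside `with crontab_lock():`, so that two writers cannot interleave between the read and the write.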

Circular import refactor
_get_manager() / cwd_inside_workspace() are moved from __init__.py to checkpoint_manager.py, eliminating the reverse lazy import in tools.py.

Related Issue

no-issue: new feature, no corresponding tracking issue

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional change)
  • Performance improvement
  • CI/CD or build changes

Scope

  • cosh (copilot-shell)
  • sec-core (agent-sec-core)
  • skill (os-skills)
  • sight (agentsight)
  • tokenless (tokenless)
  • memory (agent-memory)
  • ckpt (ws-ckpt)
  • Multiple / Project-wide

Checklist

  • I have read the Contributing Guide
  • My code follows the project's code style
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the documentation accordingly
  • For cosh: Lint passes, type check passes, and tests pass
  • For sec-core (Rust): cargo clippy -- -D warnings and cargo fmt --check pass
  • For sec-core (Python): Ruff format and pytest pass
  • For skill: Skill directory structure is valid and shell scripts pass syntax check
  • For sight: cargo clippy -- -D warnings and cargo fmt --check pass
  • For tokenless: cargo clippy -- -D warnings and cargo fmt --check pass
  • For memory (Linux only): cargo clippy --all-targets -- -D warnings, cargo fmt --check, and cargo test pass
  • Lock files are up to date (package-lock.json / Cargo.lock)

Testing

Additional Notes

- add CrontabManager for hermes and openclaw plugin
- add cronSchedules config key with add/remove/set sub-actions and
  5-field cron expression validation
- move singleton and workspace helpers from __init__.py to
  checkpoint_manager.py to resolve circular imports

Signed-off-by: Ziqi Huang <ziqi02@alibaba-inc.com>
