
feat: automated cron jobs for BM25 reindex and vector embeddings#2

Merged
agentic-organization merged 1 commit into main from feat/cron-and-index-automation
May 12, 2026
@agentic-organization
Owner

What's added

  • .github/workflows/wiki-index.yml — CI validation of BM25 build on wiki/** changes. Uploads the index as an artifact.
  • tools/cron/daily-ingest-and-index.sh — daily pipeline: optional ingest + BM25 + optional vector embed. Self-contained, works with crontab/systemd/Hermes.
  • tools/cron/wiki-reindex.sh — fast BM25 rebuild after wiki edits. Node-only, safe on any machine.
  • tools/cron/README.md — generic cron setup docs with RAM guidance and model size table.
  • .agents/skills/cron-setup/SKILL.md — Hermes-specific cronjob instructions (daily + wiki-watch patterns).
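
The stage ordering of the daily pipeline (optional ingest, then BM25, then optional embed) could be sketched roughly as follows. This is not the content of daily-ingest-and-index.sh: the `RUN_INGEST`, `RUN_EMBED`, `INGEST_CMD`, `BM25_CMD`, and `EMBED_CMD` variables and both function names are hypothetical placeholders. Only the per-stage exit codes (2, 3, 4) come from this PR.

```shell
#!/bin/sh
# Sketch only: variable and function names are assumptions for illustration.

# run_stage CODE CMD...: run CMD; if it fails, return CODE instead of CMD's status.
run_stage() {
  code="$1"
  shift
  "$@" || return "$code"
}

# run_pipeline: optional ingest (2), mandatory BM25 build (3), optional embed (4).
# Each *_CMD defaults to `true` so the sketch runs without the real tools.
run_pipeline() {
  if [ "${RUN_INGEST:-0}" = "1" ]; then
    run_stage 2 ${INGEST_CMD:-true} || return $?
  fi
  run_stage 3 ${BM25_CMD:-true} || return $?
  if [ "${RUN_EMBED:-0}" = "1" ]; then
    run_stage 4 ${EMBED_CMD:-true} || return $?
  fi
  return 0
}
```

Stopping at the first failing stage keeps the exit code unambiguous: a scheduler that sees 3 knows ingestion (if enabled) succeeded and embedding was never attempted.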

Design decisions

  • BM25 reindex is cheap and runs everywhere; embeddings are opt-in.
  • Default model stays all-MiniLM-L6-v2 (~90 MB) for 2 GB RAM safety. Larger models and Gemma-based variants are documented as needing 4+ GB.
  • Scripts discover the repo root from their own path — no hardcoded paths.
  • GitHub Action validates index health on PRs but does not commit disposable artifacts.
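
One common way to discover the repo root from a script's own path is shown below, as a minimal sketch. The helper name `repo_root_from` is hypothetical, and the assumption that the script sits exactly two directories below the root (as tools/cron/ does) is ours; the actual scripts may resolve the root differently.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the PR's scripts.
# Assumes the script lives two levels below the repo root (e.g. tools/cron/).
# The body runs in a subshell, so the caller's working directory is untouched.
repo_root_from() (
  script_dir="$(cd "$(dirname "$1")" && pwd -P)" || exit 1
  cd "$script_dir/../.." && pwd -P
)

# Typical use near the top of a script:
#   REPO_ROOT="$(repo_root_from "$0")"
```

Because the path is derived from the script's location rather than the caller's working directory, the same file works unchanged whether it is launched from crontab, a systemd unit, or an interactive shell.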

Exit codes

| Code | Meaning              |
|------|----------------------|
| 0    | Success (or dry-run) |
| 1    | Invalid argument     |
| 2    | Ingestion failed     |
| 3    | BM25 build failed    |
| 4    | Embedding failed     |
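
A caller such as a cron wrapper or alerting hook might translate these codes into log messages. The sketch below assumes a hypothetical helper named `describe_exit_code`; the code-to-meaning mapping is taken from the list above, and the fallback branch for unlisted codes is our addition.

```shell
#!/bin/sh
# Hypothetical helper: map a pipeline exit code to the meaning documented in this PR.
describe_exit_code() {
  case "$1" in
    0) echo "Success (or dry-run)" ;;
    1) echo "Invalid argument" ;;
    2) echo "Ingestion failed" ;;
    3) echo "BM25 build failed" ;;
    4) echo "Embedding failed" ;;
    *) echo "Unknown exit code: $1" ;;  # not defined by the PR; assumption
  esac
}

# Possible use after running the daily script:
#   tools/cron/daily-ingest-and-index.sh; describe_exit_code "$?"
```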

Adds:
- .github/workflows/wiki-index.yml — CI validation of BM25 build on wiki/** changes
- tools/cron/daily-ingest-and-index.sh — daily pipeline: ingest + BM25 + optional embed
- tools/cron/wiki-reindex.sh — fast BM25 rebuild after wiki edits
- tools/cron/README.md — generic cron setup docs with RAM guidance
- .agents/skills/cron-setup/SKILL.md — Hermes-specific cronjob instructions

Design decisions:
- BM25 reindex is cheap and runs on any machine; embeddings are opt-in.
- Default model stays all-MiniLM-L6-v2 (~90 MB) for 2 GB RAM safety.
- Scripts are self-contained and work with crontab, systemd, or Hermes.
- GitHub Action validates index health on PRs but does not commit artifacts.
@agentic-organization merged commit c3124fd into main on May 12, 2026
1 check passed
@agentic-organization deleted the feat/cron-and-index-automation branch on May 12, 2026 04:17