Skip to content

fix: multi-workspace correctness (desktop/API scoping, purge, fail-closed auth)#67

Open
zm2231 wants to merge 7 commits into
openclaw:mainfrom
zm2231:fix/desktop-workspace-filter
Open

fix: multi-workspace correctness (desktop/API scoping, purge, fail-closed auth)#67
zm2231 wants to merge 7 commits into
openclaw:mainfrom
zm2231:fix/desktop-workspace-filter

Conversation

@zm2231

@zm2231 zm2231 commented Jun 23, 2026

Copy link
Copy Markdown

What this fixes

A set of multi-workspace bugs in slacrawl. The common cause: the CLI's default or derived workspace_id quietly acts as a filter when it shouldn't. I hit this mirroring three signed-in Slack workspaces from the desktop source.

Concrete repro. Config has workspace_id set, three workspaces signed into Slack Desktop. sync --source desktop pulled in only the configured workspace and dropped the other two. storage/root-state.json listed all three under workspaces. The DB had one. Nothing printed.

Root cause

runSync passes coalesce(*workspaceID, cfg.WorkspaceID) into the sync options, and cfg.WorkspaceID gets auto-filled from DefaultWorkspaceID() (internal/config/config.go:295-298, :350-361). That default then flows into the desktop ingest filter and the API target resolver, so a workspace you never asked to scope to ends up narrowing the sync. The same path let a global fallback token write one workspace's messages under a different workspace's ID during fan-out.

The six fixes

  1. Desktop scoping. sync --source desktop, sync --source all, and watch no longer inherit the config default. Only an explicit --workspace scopes desktop ingest. New desktopOptionsForSourceAll in internal/syncer/syncer.go, with WorkspaceSet threaded from the CLI.
    Repro: config workspace_id=A, B and C also signed in. Before, only A ingests. After, A, B, C all ingest.

  2. API fan-out. Without --workspace, API sync now runs every configured [[workspaces]] entry, which is what docs/configuration.md already promised. A set or derived workspace_id used to collapse that down to one.

  3. Fail-closed auth. This is the dangerous one. internal/slackapi/api.go overwrote the authenticated auth.TeamID with the requested workspace and never checked the two matched. Pair that with the global token fallback in resolveWorkspaceToken, and a fan-out over a workspace missing its token env authenticates as the global team and stores those messages under the requested workspace's ID. Sync now returns an error when the requested workspace and the authenticated team differ, on both the bot and tail paths. Fan-out names the workspace that failed and leaves the rest alone.
    Repro: [[workspaces]] A and B, unset B's token env. Before, A's data lands labeled as B. After, sync workspace B: ... auth mismatch error and zero mislabeled rows.

  4. Purge keys. Scoped (--workspace) selection and deletion now carry workspace_id end to end (internal/store/purge.go) instead of leaning on (channel_id, ts) alone. The purge default is unchanged: without --workspace, purge stays archive-wide. (An earlier version of this PR also changed that default to the configured workspace; that change was reverted after review since the upgrade policy for a destructive command is a maintainer decision.)

  5. Desktop name-excludes. A channel whose name didn't resolve from Redux used to get dropped whenever any name-based exclude was set. Those stay in now. Explicit channel-ID matches still exclude.

Tests

One regression test per fix, all hermetic (no network, no writes to ~/.slacrawl):

  • TestDesktopSyncDoesNotInheritDefaultWorkspaceFilter, TestDesktopOptionsForSourceAllClearsInheritedWorkspace
  • TestRunSyncTargetsUsesWorkspaceTokensAndRejectsMismatchedAuth injects a fake Slack API and proves distinct tokens map to distinct workspaces and a mismatched token gets rejected with nothing mislabeled
  • TestPurgeMessagesWorkspaceScopeKeepsOtherWorkspaceRows
  • TestIngestDesktopStateAllowsUnknownChannelsForNameExcludes, TestIngestReduxStatesAllowsUnknownChannelsForNameExcludesWithWorkspaceFilter
go test ./...        # Go 1.26.4: 13 packages ok, 0 failures
gofmt -l internal/   # empty
go vet ./...         # clean

One thing I left alone

messages is keyed by (channel_id, ts) and channels/users by id, which assumes Slack IDs are globally unique across workspaces. That holds everywhere except Slack Connect channels shared between two of your own workspaces. I documented it in docs/retention.md and kept the schema as is, since a key migration is a bigger call than this PR should make.

zm2231 added 6 commits June 22, 2026 21:57
Desktop sync and watch silently scoped to the config default workspace_id, so a
multi-workspace user only ingested one signed-in workspace. Desktop now ignores
the config default unless --workspace is explicit; API and read paths unchanged.
Adds regression coverage in app_test.go.
sync --source all routed the config/default workspace into desktop ingest, and
API sync collapsed multi-workspace fan-out when workspace_id was set or derived.
Now only an explicit --workspace scopes desktop ingest or narrows API sync;
otherwise API fans out across all configured workspaces and desktop ingests all.
…spaces for global

Without --workspace, purge deleted across all workspaces and wrote a global
retention floor. Purge now defaults to the configured workspace_id; global purge
is explicit via --all-workspaces, which cannot be combined with --workspace.
…udes

A name-based exclude caused channels whose names weren't resolved from Redux to
be dropped. Unknown-name channels are now kept unless the channel ID itself
matches an exclude selector.
API sync overwrote the authenticated team id with the requested workspace without
verifying they match, so a global-fallback token could sync one workspace and
label its rows as another during multi-workspace fan-out. Sync now errors when the
requested workspace and authenticated team differ (bot and tail paths); fan-out
reports the offending workspace and never mislabels others. Adds a hermetic
fan-out integration test with an injected Slack API.
Workspace-scoped purge selected and deleted by (channel_id, ts) only, relying on
that pair being globally unique across workspaces. Selection and deletion now
carry workspace_id end to end; global purge and retention floors are unchanged.
Documents the message-identity invariant in docs/retention.md.
@clawsweeper

clawsweeper Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codex review: needs changes before merge. Reviewed June 28, 2026, 7:07 PM ET / 23:07 UTC.

Summary
The branch changes CLI and syncer workspace scoping, Slack API authenticated TeamID validation, workspace-scoped purge key handling, desktop name-exclude behavior, retention docs, and regression tests.

Reproducibility: yes. at source level: current main coalesces cfg.WorkspaceID into sync/watch paths and lets opts.WorkspaceID overwrite Slack auth TeamID. The contributor also posted redacted live before/after output for desktop ingest, auth mismatch rejection, and purge scope.

Review metrics: 3 noteworthy metrics.

  • Changed surface: 11 files, +412/-58. The diff spans CLI behavior, Slack API auth, desktop ingest, purge storage/docs, and regression tests.
  • Compatibility-sensitive behaviors: 3 paths changed. Desktop default scoping, watch scoping, and auth mismatch handling can change existing operator workflows after upgrade.
  • Real proof paths: 3 copied live-output scenarios. The contributor supplied live output for desktop ingest, auth mismatch rejection, and purge behavior, but not for the remaining watch scoping gap.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦞 diamond lobster
Patch quality: 🦪 silver shellfish
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Add watch --workspace with focused test, docs, and completion coverage.
  • [P1] Make maintainer acceptance of desktop scoping and fail-closed auth upgrade behavior visible before merge.

Mantis proof suggestion
An independent Slack Desktop smoke would materially help verify the multi-workspace desktop ingest path in a real profile after the watch scoping repair. A maintainer can ask Mantis to capture proof by posting this exact PR comment:

@openclaw-mantis slack desktop smoke: verify multi-workspace desktop sync ingests all signed-in workspaces without --workspace and honors explicit --workspace filtering.

Risk before merge

  • [P1] Existing users who relied on workspace_id to keep watch scoped will start ingesting every signed-in Slack Desktop workspace, with no watch --workspace replacement in the PR.
  • [P1] One-shot desktop sync and sync --source all also change default scoping semantics, so maintainers should explicitly accept that only --workspace narrows desktop ingest.
  • [P1] Slack API sync/tail now fail closed on requested/authenticated workspace mismatches, so deployments relying on a global fallback token need per-workspace token configuration or explicit maintainer acceptance.
  • [P1] The PR intentionally leaves the existing (channel_id, ts) message identity invariant in place, so Slack Connect collisions across a user's own workspaces remain outside this fix.

Maintainer options:

  1. Add a watch scoping escape hatch (recommended)
    Add watch --workspace, completion/docs coverage, and a focused regression test so users can keep scoped desktop watch behavior without restoring implicit config-default filtering.
  2. Accept the upgrade behavior explicitly
    Maintainers can intentionally accept broader desktop ingestion and fail-closed auth as new defaults, but that decision should be visible before merge.
  3. Split policy-sensitive runtime changes
    If default desktop scoping or token fallback policy needs broader discussion, split lower-risk purge key qualification from the runtime behavior changes.
Copy recommended automerge instruction
@clawsweeper automerge

Special instructions:
Add an explicit --workspace flag to the watch command, thread it into syncer.Options with explicit WorkspaceSet-style semantics, update shell completions/docs, and add a focused regression test proving watch can be scoped without restoring config-default scoping.

Next step before merge

  • [P2] A focused repair can add explicit watch workspace scoping; maintainer acceptance is still needed for the intentional desktop default and auth behavior before final merge.

Security
Cleared: The diff tightens Slack TeamID validation and does not change dependencies, workflows, secret permissions, or external code execution.

Review findings

  • [P1] Add an explicit workspace flag for watch scoping — internal/cli/app.go:1384-1387
Review details

Best possible solution:

Add explicit watch --workspace support, then land the correctness fixes after maintainers accept and document the desktop scoping and fail-closed auth upgrade behavior.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main coalesces cfg.WorkspaceID into sync/watch paths and lets opts.WorkspaceID overwrite Slack auth TeamID. The contributor also posted redacted live before/after output for desktop ingest, auth mismatch rejection, and purge scope.

Is this the best way to solve the issue?

Not quite yet: the main fixes are direct, but removing inherited watch scoping without adding watch --workspace leaves a compatibility break. After that focused repair, maintainer acceptance of the fail-closed auth and desktop default behavior is still required.

Full review comments:

  • [P1] Add an explicit workspace flag for watch scoping — internal/cli/app.go:1384-1387
    This block now runs desktop watch without any WorkspaceID, but watch still only parses --desktop-every and completions/docs expose no --workspace. Existing configs that used workspace_id to keep watch scoped will start importing every signed-in Desktop workspace with no command-line way to retain the old scoped behavior; add watch --workspace and thread it into the desktop sync options, or preserve another compatible scoped path.
    Confidence: 0.91

Overall correctness: patch is incorrect
Overall confidence: 0.84

AGENTS.md: not found in the target repository.

Codex review notes: model internal, reasoning high; reviewed against 905ac75f8337.

Label changes

Label justifications:

  • P1: The PR targets silent multi-workspace omission and possible workspace mislabeling in core Slack archive sync paths, with one remaining compatibility blocker.
  • merge-risk: 🚨 compatibility: Merging changes desktop scoping defaults and currently removes scoped watch behavior without an explicit replacement flag.
  • merge-risk: 🚨 auth-provider: Merging changes Slack token fallback behavior by rejecting requested/authenticated workspace mismatches instead of accepting a global fallback token.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦞 diamond lobster and patch quality is 🦪 silver shellfish.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (terminal): The contributor supplied redacted copied live output for after-fix desktop ingest, auth mismatch rejection, scoped purge, and preserved archive-wide purge behavior.
  • proof: sufficient: Contributor real behavior proof is sufficient. The contributor supplied redacted copied live output for after-fix desktop ingest, auth mismatch rejection, scoped purge, and preserved archive-wide purge behavior.
Evidence reviewed

Acceptance criteria:

  • [P1] GOWORK=off go test ./internal/cli ./internal/syncer.
  • [P1] GOWORK=off go test ./...

What I checked:

  • Repository policy check: No target AGENTS.md was present in the slacrawl checkout, and no maintainer notes directory was found.
  • Current main inherits workspace defaults into sync/watch: Current main passes coalesce(*workspaceID, cfg.WorkspaceID) into sync options and passes cfg.WorkspaceID into watch desktop sync, matching the reported default-derived desktop filter path. (internal/cli/app.go:543, 905ac75f8337)
  • Current main permits auth TeamID overwrite: Current main starts with auth.TeamID and then replaces it with opts.WorkspaceID, which supports the reported path where a fallback token can label rows as a different requested workspace. (internal/slackapi/api.go:174, 905ac75f8337)
  • PR removes watch workspace scoping without a replacement flag: At the PR head, runWatch still only parses --desktop-every and now calls desktop sync without WorkspaceID, so users cannot explicitly keep watch scoped. (internal/cli/app.go:1384, a319d4aa0555)
  • PR tracks explicit sync workspace selection: The PR adds flagWasSet/resolveSyncWorkspaceID so one-shot sync only scopes desktop/API behavior when --workspace is explicitly present. (internal/cli/app.go:542, a319d4aa0555)
  • PR fails closed on Slack auth mismatch: The PR routes Sync and Tail through authenticatedWorkspaceID, returning an error when the requested workspace differs from the authenticated Slack TeamID. (internal/slackapi/api.go:174, a319d4aa0555)

Likely related people:

  • vincentkoc: GitHub commit metadata shows this author added ergonomic multi-workspace sync/tail support and documented the fan-out behavior touched by this PR. (role: multi-workspace feature introducer; confidence: high; commits: 360ad4c23b39, 588584e4ea53; files: internal/cli/app.go, internal/config/config.go, internal/slackapi/api.go)
  • steipete: GitHub commit metadata ties this author to the desktop import filter work and the retention purge command in the files changed by this PR. (role: recent desktop, purge, and release-area contributor; confidence: high; commits: 37337884a9f6, 3bea752593c9, f5f549466731; files: internal/slackdesktop/desktop.go, internal/slackdesktop/desktop_test.go, internal/store/purge.go)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 Urgent regression or broken agent/channel workflow affecting real users now. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. merge-risk: 🚨 auth-provider 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials. labels Jun 23, 2026
Reverts the purge default-scope change. Without --workspace, purge stays
archive-wide as before; --workspace scopes to one workspace. The default-scope
policy change is left for a separate maintainer-sponsored discussion. The
store-level workspace_id key qualification is unchanged.
@zm2231

zm2231 commented Jun 23, 2026

Copy link
Copy Markdown
Author

Thanks for the review. Addressed both blockers.

Purge default scope (compatibility). You are right, that was a silent behavior change to a destructive command, and the upgrade policy is a maintainer call, not mine to bundle in here. I reverted it. Without --workspace, purge stays archive-wide exactly as before. --workspace scopes to one workspace. The --all-workspaces flag and the config-default behavior are gone (commit a319d4a). The only purge change left in this PR is internal: scoped selection and deletion now carry workspace_id through the temp key set, which is a pure correctness fix with no surface change. If the default-scope idea is worth pursuing, I will open it as a separate PR where the upgrade path can be decided.

Fail-closed auth. This one is intentional and I would like to keep it. The prior code overwrote the authenticated team ID with the requested workspace without checking they matched, so a global fallback token could write one workspace's messages under a different workspace's ID. A hard error is the right failure mode for an identity mismatch. During fan-out it fails only the offending target and names it, so the other workspaces still sync. Operator action when it fires: set the per-workspace token env (SLACK_<TEAM>_BOT_TOKEN, or bot_token_env under the matching [[workspaces]]). A shared global token can still be used, but only for the workspace it actually authenticates as.

Real behavior proof (redacted)

Run against disposable DB copies, workspace IDs redacted as T_A/T_B/T_C. Config has workspace_id = "T_A" (this is what triggered the bug).

1. Desktop ingest, before vs after. Same config, same desktop cache.

# pre-fix (upstream 905ac75)
$ slacrawl --config <cfg> sync --source desktop
  workspaces=1 | channels=85 | messages=1164      # only T_A ingested
  distinct workspaces in DB: 1

# this branch
$ slacrawl --config <cfg> sync --source desktop
  workspaces=3 | channels=126 | messages=1965     # T_A, T_B, T_C
  distinct workspaces in DB: 3

2. Fail-closed auth. Bot token authenticates to T_A; request T_FAKE.

$ slacrawl --config <cfg> sync --source bot --workspace T_FAKE
error: sync workspace T_FAKE: authenticated workspace T_A does not match requested workspace T_FAKE

rows labeled T_FAKE after the rejected run: 0

3. Purge scope. Copy of the three-workspace DB.

before:  T_A 1164 | T_B 526 | T_C 275

$ slacrawl purge --workspace T_B --older-than 1h --force     # scoped
after:   T_A 1164 | T_C 275                                  # only T_B removed

$ slacrawl purge --older-than 1h --force                     # no --workspace, archive-wide (preserved default)
after:   0

The hermetic tests stay in as regression cover, including TestRunSyncTargetsUsesWorkspaceTokensAndRejectsMismatchedAuth, which injects a fake Slack API and proves the mismatch rejection and that no rows get mislabeled.

go test ./... passes on Go 1.26.4.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 23, 2026

Copy link
Copy Markdown

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 23, 2026
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix merge-risk: 🚨 auth-provider 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. other P1 Urgent regression or broken agent/channel workflow affecting real users now. proof: sufficient Contributor real behavior proof is sufficient. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants