feat(#144): add ContentDedupGate to prevent duplicate outbound posts #181

Open
yclaw-agent-orchestrator[bot] wants to merge 19 commits into main from feat/issue-144

@yclaw-agent-orchestrator
Contributor

Summary

Fixes the duplicate posting bug (#144) where Ember posted near-identical content simultaneously on X/Twitter. Adds a dedup gate that runs before any outbound action to block exact and near-duplicate content.

  • NEW packages/core/src/review/content-dedup.ts: ContentDedupGate class with SHA-256 exact-match and Jaccard character-bigram fuzzy similarity (>85% threshold), backed by an in-memory Map with a 12-hour TTL and scoped per platform, so the same content can legitimately appear on different platforms (e.g. Twitter + Telegram)
  • NEW packages/core/tests/content-dedup.test.ts — 7 tests covering all acceptance criteria
  • MODIFY packages/core/src/review/outbound-safety.ts — integrates ContentDedupGate into OutboundSafetyGate.check(), adds recordOutboundContent() public method, extends blocked_by union with 'dedup'
  • MODIFY packages/core/src/agent/executor.ts — calls outboundSafety.recordOutboundContent() after every successful outbound action to keep the dedup window current
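
For reviewers who want a concrete picture of the check described above, here is a minimal sketch of a SHA-256 exact match combined with character-bigram Jaccard similarity, a per-platform in-memory Map, and a 12-hour TTL. It is not the code in content-dedup.ts: the class name `DedupSketch`, the `normalize` step, the record shape, and the injectable clock are all assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real fields in content-dedup.ts may differ.
interface DedupRecord {
  hash: string;
  bigrams: Set<string>;
  expiresAt: number;
}

const TTL_MS = 12 * 60 * 60 * 1000; // 12-hour window from the PR description
const SIMILARITY_THRESHOLD = 0.85; // block when similarity is strictly above 85%

// Assumed normalization: lowercase and collapse runs of whitespace.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

function sha256(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Set of all adjacent two-character substrings.
function charBigrams(text: string): Set<string> {
  const grams = new Set<string>();
  for (let i = 0; i < text.length - 1; i++) {
    grams.add(text.slice(i, i + 2));
  }
  return grams;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  for (const g of a) if (b.has(g)) intersection++;
  return intersection / (a.size + b.size - intersection);
}

class DedupSketch {
  private readonly byPlatform = new Map<string, DedupRecord[]>();
  private readonly now: () => number;

  // The clock is injectable so TTL expiry can be tested without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  isDuplicate(platform: string, content: string): boolean {
    const text = normalize(content);
    const hash = sha256(text);
    const grams = charBigrams(text);
    return this.liveRecords(platform).some(
      (r) => r.hash === hash || jaccard(r.bigrams, grams) > SIMILARITY_THRESHOLD,
    );
  }

  record(platform: string, content: string): void {
    const text = normalize(content);
    const live = this.liveRecords(platform);
    live.push({
      hash: sha256(text),
      bigrams: charBigrams(text),
      expiresAt: this.now() + TTL_MS,
    });
    this.byPlatform.set(platform, live);
  }

  // Drops expired records for one platform and returns the survivors.
  private liveRecords(platform: string): DedupRecord[] {
    const t = this.now();
    const live = (this.byPlatform.get(platform) ?? []).filter((r) => r.expiresAt > t);
    this.byPlatform.set(platform, live);
    return live;
  }
}
```

Keying the Map by platform is what lets identical text pass on a second platform while still being blocked on the first.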

Acceptance criteria

  • Exact duplicate content is blocked (SHA-256 hash match)
  • Near-duplicate content (>85% Jaccard bigram similarity) is blocked
  • Records auto-expire after 12 hours
  • Fail-open: gate errors log a warning and do NOT block posts
  • All 7 tests pass
  • Per-platform scope: same content on different platforms is allowed
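
The fail-open criterion and the check ordering can be sketched as follows. This is an illustration under stated assumptions, not the actual outbound-safety.ts: `SafetyGateSketch`, `DedupChecker`, the `"deterministic"` member of the `blocked_by` union, and the injected `warn` callback are hypothetical names. Only `check()`, `recordOutboundContent()`, and the `'dedup'` union member come from the PR description.

```typescript
// Hypothetical union; the PR only confirms that 'dedup' was added to it.
type BlockedBy = "deterministic" | "dedup";

interface CheckResult {
  allowed: boolean;
  blocked_by?: BlockedBy;
}

// Minimal interface a dedup gate would need to expose (assumed).
interface DedupChecker {
  isDuplicate(platform: string, content: string): boolean;
  record(platform: string, content: string): void;
}

class SafetyGateSketch {
  private readonly dedup: DedupChecker;
  private readonly deterministicCheck: (content: string) => boolean;
  private readonly warn: (msg: string) => void;

  constructor(
    dedup: DedupChecker,
    deterministicCheck: (content: string) => boolean,
    warn: (msg: string) => void = console.warn,
  ) {
    this.dedup = dedup;
    this.deterministicCheck = deterministicCheck;
    this.warn = warn;
  }

  check(platform: string, content: string): CheckResult {
    // Deterministic checks run first and are never bypassed.
    if (!this.deterministicCheck(content)) {
      return { allowed: false, blocked_by: "deterministic" };
    }
    try {
      if (this.dedup.isDuplicate(platform, content)) {
        return { allowed: false, blocked_by: "dedup" };
      }
    } catch (err) {
      // Fail-open: a broken dedup gate logs a warning and lets the post through.
      this.warn(`dedup check failed, allowing post: ${String(err)}`);
    }
    return { allowed: true };
  }

  // Intended to be called by the executor after a successful outbound action.
  recordOutboundContent(platform: string, content: string): void {
    try {
      this.dedup.record(platform, content);
    } catch (err) {
      this.warn(`dedup record failed: ${String(err)}`);
    }
  }
}
```

Wrapping only the dedup call in the `try` block keeps the fail-open behaviour from leaking into the deterministic checks, which still block on failure.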

Test plan

  • Run vitest — all 7 tests in content-dedup.test.ts should pass
  • Confirm no regressions in existing outbound-safety tests

Closes #144

🤖 Generated with Claude Code

YClaw Agent Orchestrator and others added 12 commits April 19, 2026 04:09
Automated changes for issue #179.

Closes #179

Co-authored-by: YClaw Agent Orchestrator <agents@example.com>
Adds a dedup gate to the outbound publish pipeline to block Ember (and
other agents) from posting identical or near-identical content to the
same platform in quick succession — the root cause of the duplicate
Twitter posts reported in #144.

Changes:
- NEW packages/core/src/review/content-dedup.ts — ContentDedupGate
  with SHA-256 exact-match check and Jaccard character-bigram fuzzy
  matching (>85% threshold), 12h TTL in-memory Map, per-platform scope,
  and fail-open error handling.
- NEW packages/core/tests/content-dedup.test.ts — 7 test cases covering
  all acceptance criteria (exact block, fuzzy block, TTL expiry,
  cross-platform allow, fail-open, allow dissimilar content).
- MODIFY packages/core/src/review/outbound-safety.ts — integrate
  ContentDedupGate into OutboundSafetyGate.check(); add
  recordOutboundContent(); extend blocked_by union with 'dedup'.
- MODIFY packages/core/src/agent/executor.ts — call
  outboundSafety.recordOutboundContent() after every successful
  outbound action so the dedup window stays current.

Closes #144

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@DannyDesert DannyDesert enabled auto-merge (squash) April 22, 2026 18:40
@yclaw-agent-orchestrator yclaw-agent-orchestrator Bot added the human-review-required label (PR touches sensitive paths; human must review) Apr 22, 2026
@DannyDesert DannyDesert enabled auto-merge (squash) April 22, 2026 19:00
@yclaw-agent-orchestrator
Contributor Author

Architect Audit — PR #181: ContentDedupGate

Status: Advisory audit (VELOCITY MODE — PR auto-merges on CI pass)

Summary

This PR implements the dedup gate for issue #144. The core logic is correct and the fail-open design is appropriate. However, I've identified a structural concern that warrants a follow-up issue.


✅ What's Good

  • ContentDedupGate in content-dedup.ts: clean SHA-256 exact + Jaccard bigram fuzzy check, per-platform scoping, 12h TTL, fail-open on errors. Matches the spec exactly.
  • OutboundSafetyGate integration: the dedup check is correctly placed after the deterministic checks, and recordOutboundContent is called post-success in the executor.
  • Test coverage: content-dedup.test.ts (7 cases) + twitter-dedup.test.ts (comprehensive). Good edge case coverage.
  • All 5 acceptance criteria from the issue spec appear to be met.

⚠️ P2 — Dual Dedup Implementations (follow-up issue)

This PR introduces two separate dedup systems:

  1. packages/core/src/review/content-dedup.ts — in-memory, bigram-based, integrated into OutboundSafetyGate (applies to all outbound platforms)
  2. packages/core/src/actions/twitter-dedup.ts — Redis-backed, word-level Jaccard, Twitter-specific

The twitter-dedup.ts module appears to be a holdover from the earlier Twitter-specific approach (PR #152 direction). It's now redundant given the platform-agnostic ContentDedupGate. Having two dedup systems creates:

  • Inconsistent thresholds (85% bigram vs 90% word-level Jaccard)
  • Dual maintenance burden
  • Potential confusion about which gate is authoritative

Recommendation: After this PR merges, create a follow-up issue to remove twitter-dedup.ts and consolidate on ContentDedupGate. Not blocking — the in-memory gate is the one actually wired into the safety pipeline.
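
To make the threshold inconsistency concrete: the two measures generally score the same pair of posts differently, so 85% on character bigrams and 90% on words are not directly comparable. The sketch below computes both for one example pair. The tokenization (lowercase, split on whitespace) is an assumption; twitter-dedup.ts may tokenize differently, and the function names are illustrative.

```typescript
// Jaccard similarity over two sets of string tokens: |A ∩ B| / |A ∪ B|.
function jaccardSets(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  for (const token of a) if (b.has(token)) intersection++;
  return intersection / (a.size + b.size - intersection);
}

// Character bigrams: every adjacent two-character substring.
function bigramSet(text: string): Set<string> {
  const grams = new Set<string>();
  const s = text.toLowerCase();
  for (let i = 0; i < s.length - 1; i++) grams.add(s.slice(i, i + 2));
  return grams;
}

// Word tokens: assumed to be whitespace-separated, lowercased.
function wordSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\s+/).filter((w) => w.length > 0));
}

const a = "gm everyone, the new dashboard is live";
const b = "gm everyone, the new dashboards are live";

const bigramSimilarity = jaccardSets(bigramSet(a), bigramSet(b));
const wordSimilarity = jaccardSets(wordSet(a), wordSet(b));

// A small edit changes few bigrams but replaces whole words,
// so the bigram score comes out higher than the word score for this pair.
console.log({ bigramSimilarity, wordSimilarity });
```

Because a single changed word removes only a handful of bigrams but an entire word token, the two gates can disagree about the same pair, which supports consolidating on one measure and one threshold.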


P3 — Nit: twitter-dedup.ts is registered but may not be called

packages/core/src/bootstrap/actions.ts was modified (+15/-12). Worth verifying in the follow-up that twitter-dedup action registration doesn't create a dead code path now that ContentDedupGate handles dedup at the safety layer.


Merge Blocker Investigation

The PR is in mergeable_state: blocked and carries the human-review-required label. The label was likely applied by the agent-safety.yml workflow when it detected the modification to outbound-safety.ts (an IMMUTABLE file).

This is a legitimate safety gate. The outbound-safety.ts file is marked IMMUTABLE — changes require human review. A human maintainer needs to:

  1. Review the outbound-safety.ts diff (adds ContentDedupGate integration — ~56 lines, no removal of existing checks)
  2. Remove the human-review-required label to unblock auto-merge

The changes to outbound-safety.ts are additive and correct — the dedup check is placed after deterministic checks and is fail-open. No existing safety logic was removed or weakened.


Follow-up issues to create: twitter-dedup.ts consolidation (P2)

Labels

human-review-required PR touches sensitive paths — human must review

Development

Successfully merging this pull request may close these issues.

bug: Ember duplicate posting — near-identical content posted simultaneously
