Reject canonical duplicate bounty submission URLs#254
Conversation
weilixiong
left a comment
There was a problem hiding this comment.
Review ✅ APPROVED
QUALITY_GATE: LOW — 2f/20+, reject canonical duplicate submission URLs.
- Uses
urlunparseto normalize URLs before comparison - GITHUB_SUBMISSION_PATH_RE regex for GitHub issue/PR URL matching
- Prevents duplicate submissions with same canonical URL
- Test coverage confirmed
Bounty: MW #219 (round 7, 40 MRWK)
MolhamHamwi
left a comment
There was a problem hiding this comment.
No blockers found on this change. Evidence checked:
- Inspected
app/ledger/service.pycanonicalization and duplicate-detection path:pay_bounty()now stores canonical submission URLs, performs an exact canonical match first, and scans legacy submissions through the same canonicalizer for backward compatibility. - Checked that GitHub issue/PR URLs normalize scheme/host/repo/kind case and trailing slash while preserving query/fragment and leaving non-GitHub path semantics intact.
- Inspected
tests/test_ledger.pycoverage for current canonical duplicates, legacy mixed-case/trailing-slash duplicates, and non-GitHub trailing slash preservation. - Local validation on PR head
270b3be:uv run --python 3.12 --extra dev python -m pytest tests/test_ledger.py -q-> 19 passed;uv run --python 3.12 --extra dev ruff check app/ledger/service.py tests/test_ledger.py-> passed;uv run --python 3.12 --extra dev ruff format --check app/ledger/service.py tests/test_ledger.py-> already formatted;git diff --check origin/main...HEAD-> clean. - Hosted CI
Quality, readiness, docs, and image checksis green.
No secrets, wallet material, payout details, private deployment values, private vulnerability details, or MRWK price claims were reviewed or disclosed.
Thanhdn1984
left a comment
There was a problem hiding this comment.
Review for Bounty #219 / PR #254:
I reviewed the canonical submission URL guard and the regression coverage. The change looks sound.
Checks performed:
- Confirmed
canonical_submission_url()validates viavalidate_public_url()first, then only normalizes GitHub issue/pull URLs when host is exactlygithub.comand URL params are absent. - Confirmed normalization covers scheme/host casing, owner/repo casing,
issues|pullsegment casing, and one trailing slash viaGITHUB_SUBMISSION_PATH_RE.fullmatch(). - Confirmed non-GitHub URLs keep path semantics, including trailing slash, via
test_submission_url_canonicalization_preserves_non_github_trailing_slash. - Confirmed both new-payment and legacy-stored-payment duplicate paths are tested:
test_multi_award_bounty_rejects_canonical_duplicate_submission_urlandtest_multi_award_bounty_rejects_legacy_canonical_duplicate_submission_url.
One small note, not blocking: the PR body says query strings/fragments are preserved. They are preserved, so GitHub URLs that differ only by query/fragment still remain distinct. That matches the stated behavior, but if the product goal is strict duplicate prevention for PR/issue evidence, maintainers may later want to decide whether ?utm=... or #... variants should collapse too.
No security/privacy concerns found in the diff.
Bounty #228
Summary
Evidence
Before this change, a multi-award bounty could treat equivalent GitHub proof URLs such as
https://GITHUB.com/Ramimbo/MergeWork/PULL/12/andhttps://github.com/ramimbo/mergework/pull/12as separate submissions.Validation
uv run --extra dev python -m pytest tests/test_ledger.py -q --capture=no-> 19 passeduv run --extra dev python -m pytest -q --capture=no-> 208 passed, 2 warningsuv run --extra dev ruff check app/ledger/service.py tests/test_ledger.py-> passeduv run --extra dev ruff format --check app/ledger/service.py tests/test_ledger.py-> passeduv run --extra dev python -m mypy app-> passeduv run --extra dev python scripts/docs_smoke.py-> docs smoke okuv run --extra dev python scripts/check_agents.py-> AGENTS.md okgit diff --check-> cleanNo secrets, wallet material, private vulnerability details, deployment values, payout details, or MRWK price claims are included.