Skip to content

Strip italic, strikethrough and inline code in strip_markdown#5001

Open
Shubb07 wants to merge 1 commit into
OWASP:mainfrom
Shubb07:fix/strip-markdown-formatting
Open

Strip italic, strikethrough and inline code in strip_markdown#5001
Shubb07 wants to merge 1 commit into
OWASP:mainfrom
Shubb07:fix/strip-markdown-formatting

Conversation

@Shubb07

@Shubb07 Shubb07 commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Proposed change

Resolves #4973

strip_markdown() only stripped bold (*) and rewrote links, so _italic_, ~strike~ and inline `code` markers leaked into the plain-text fallback. This extends it to strip those markers too.

Stripping is limited to paired markers wrapping non-space content, so underscores and similar characters that are part of words or URLs (e.g. snake_case) are left untouched. Added test cases for italic, strikethrough, inline code, a combined case, and a snake_case + URL regression check.

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

@coderabbitai

coderabbitai Bot commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: db7b8bd9-c451-4bf8-8661-674e05999c9a

📥 Commits

Reviewing files that changed from the base of the PR and between 0fc307a and 0382e22.

📒 Files selected for processing (2)
  • backend/apps/slack/common/text.py
  • backend/tests/unit/apps/slack/utils_test.py

Summary by CodeRabbit

  • Bug Fixes
    • Improved text cleanup for Slack-style inline formatting by stripping italics (_..._), strikethrough (~...~), and inline code (`...`), including mixed-format strings.
    • Link labels are now converted more reliably into text (url) form, while preserving underscores in labels.
    • Prevented accidental removal of formatting characters when they’re part of words or URLs.
  • Tests
    • Expanded coverage for additional markdown-stripping cases, including the new link-label scenario.

Walkthrough

strip_markdown() now removes Slack link markup plus bold, italic, strikethrough, and inline code markers. Unit tests add coverage for each marker type, mixed formatting, and a link label containing snake_case.

Changes

strip_markdown formatting fix

Layer / File(s) Summary
strip_markdown implementation and tests
backend/apps/slack/common/text.py, backend/tests/unit/apps/slack/utils_test.py
strip_markdown() now substitutes Slack links back into text and strips *, _, ~, and ` markers with boundary-aware regex replacements; tests add cases for the new markers, mixed formatting, and snake_case link labels.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title matches the main change: extending strip_markdown to remove italic, strikethrough, and inline code markers.
Description check ✅ Passed The description accurately describes the markdown-stripping fix and added regression tests.
Linked Issues check ✅ Passed The PR satisfies #4973 by stripping italic, strikethrough, and inline code markers and adding tests for those cases.
Out of Scope Changes check ✅ Passed The added snake_case and mixed-format test cases are directly related to the markdown-stripping fix.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 21, 2026
cubic-dev-ai[bot]
cubic-dev-ai Bot previously approved these changes Jun 21, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

Re-trigger cubic

@Shubb07 Shubb07 marked this pull request as ready for review June 21, 2026 09:52
@Shubb07 Shubb07 requested review from arkid15r and kasya as code owners June 21, 2026 09:52
@Shubb07 Shubb07 force-pushed the fix/strip-markdown-formatting branch from 0fdef1a to 0fc307a Compare June 26, 2026 06:09

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/apps/slack/common/text.py`:
- Around line 54-57: The marker stripping in the Slack text cleanup is too broad
and can remove balanced underscores inside expanded URLs, which breaks the
plain-text fallback produced by get_text. Update the marker handling in the
text-processing flow around the marker regex/substitution so it only strips
markers outside URLs, or tighten the underscore matcher in this pass to require
whitespace/start/end delimiters instead of a generic non-word boundary. Keep the
fix localized to the logic that expands links before the marker pass and the
regex substitution used for markers.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 289c37ac-e35c-43fd-b315-e971669d3986

📥 Commits

Reviewing files that changed from the base of the PR and between 0fdef1a and 0fc307a.

📒 Files selected for processing (2)
  • backend/apps/slack/common/text.py
  • backend/tests/unit/apps/slack/utils_test.py

Comment thread backend/src/apps/slack/common/text.py
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 26, 2026
cubic-dev-ai[bot]
cubic-dev-ai Bot previously approved these changes Jun 26, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 2 files (changes from recent commits).

Re-trigger cubic

Signed-off-by: shuban <shubanmutagi55@gmail.com>
@Shubb07 Shubb07 force-pushed the fix/strip-markdown-formatting branch from 0382e22 to 3470dea Compare June 29, 2026 06:15
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

strip_markdown() only removes bold, leaving italic/strikethrough/code markers in the plain text output

1 participant