fix: resolve prompt injection detector homoglyph bypass (#102) by MayurKharat0390 · Pull Request #119 · sreerevanth/AgentWatch

MayurKharat0390 · 2026-06-02T04:22:34Z

Problem

The scan_text function in agentwatch/core/injection.py relies on unicodedata.normalize("NFKC", text) under the assumption that it translates visual homoglyphs (such as Cyrillic or Greek lookalikes) to their canonical ASCII equivalents before the regex patterns are evaluated.

However, Unicode NFKC normalization never translates visual confusables across different scripts (e.g. mapping Cyrillic small letter о (U+043E) to Latin small letter o (U+006F)).

Since the injection detector's patterns are Latin ASCII-only, an attacker can bypass all prompt injection signatures by replacing Latin letters with visually identical Greek or Cyrillic homoglyphs.

Solution

Introduced a translation map _HOMOGLYPH_MAP inside agentwatch/core/injection.py that maps visually identical/confusable Greek, Cyrillic, and Latin Extended characters to standard Latin ASCII.
Updated _normalize(text) to map these lookalike characters to their Latin counterparts after NFKC normalization.

Testing & Verification

Added a dedicated unit test test_injection_detector_homoglyph_bypass_prevention in tests/test_safety.py to assert that visual lookalike payloads spelling out "ignore previous instructions", "reveal your prompt", and "new instructions:" are successfully blocked.
All formatting check rules passed via ruff and all 265 unit/integration tests pass perfectly.

Summary by CodeRabbit

Bug Fixes
- Improved prompt-injection detection to catch attempts that use Unicode lookalike characters and visual homoglyphs from multiple scripts, reducing bypass risk.
Tests
- Added automated tests validating detection of injection attempts built with homoglyph and Unicode lookalike substitution patterns to ensure continued robustness.

coderabbitai · 2026-06-02T04:22:44Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 558823c5-22f3-4369-afda-24aa1aa93c57

📥 Commits

Reviewing files that changed from the base of the PR and between 0ec813f and 1c84594.

📒 Files selected for processing (2)

agentwatch/core/injection.py
tests/test_safety.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_safety.py

📝 Walkthrough

Walkthrough

This PR strengthens the prompt-injection detector by adding homoglyph-to-ASCII translation to its text normalization pipeline. A static character mapping translates visual lookalikes (Cyrillic/Greek/Latin variants) to ASCII equivalents, and the _normalize() function applies this after NFKC normalization. A new test validates detection of multi-homoglyph injection payloads.

Changes

Homoglyph normalization for injection detection

Layer / File(s)	Summary
Homoglyph mapping and normalization logic `agentwatch/core/injection.py`	`_HOMOGLYPH_MAP` (lines 30–92) maps Cyrillic, Greek, and Latin visual lookalikes to ASCII; `_normalize()` (lines 96–99) applies NFKC normalization then character replacement to neutralize lookalike substitutions.
Homoglyph bypass detection test `tests/test_safety.py`	`test_injection_detector_homoglyph_bypass_prevention()` (lines 484–507) verifies `scan_text` detects injection payloads using multiple homoglyph patterns (Cyrillic а, Greek α, dotless-i) and allows benign instructions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

sreerevanth/AgentWatch#102: Directly addresses homoglyph-based injection bypass by implementing explicit visual-lookalike detection and mapping in the injection detector.

Possibly related PRs

sreerevanth/AgentWatch#66: Both PRs modify the injection detector's text normalization pipeline; PR #66 introduced NFKC normalization and bidi-control handling, and this PR adds homoglyph mapping as a downstream normalization step.

Suggested labels

security, bug, level: advanced, level3

Poem

🐰 I hop through text both near and far,
spotting twins that look like "a" or "α",
I map their masks back to plain ASCII,
chase sly homoglyphs out of the sky,
and nibble bad inputs—safe as pie.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and accurately summarizes the main change: fixing a prompt injection detector bypass vulnerability caused by homoglyph substitution attacks.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/core/injection.py`:
- Around line 30-92: _HOMOGLYPH_MAP is missing the Cyrillic small letter "м"
(U+043C), allowing strings like "proмpt" to bypass normalization; add the
mapping "\u043c": "m" to the _HOMOGLYPH_MAP dictionary so lowercase Cyrillic м
is normalized to Latin 'm' (update the existing _HOMOGLYPH_MAP definition in
agentwatch/core/injection.py).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 613daebc-03ec-435f-b1be-7b76b4a6f779

📥 Commits

Reviewing files that changed from the base of the PR and between 46cd2fa and 0ec813f.

📒 Files selected for processing (2)

agentwatch/core/injection.py
tests/test_safety.py

sreerevanth · 2026-06-02T06:40:11Z

@MayurKharat0390 Thanks for the contribution — this addresses a real security gap and the added regression test is appreciated.

Before merge, could you please take a look at the remaining CodeRabbit finding regarding Cyrillic lowercase м (U+043C)? Since this PR is specifically focused on homoglyph bypass prevention, I'd like to make sure we aren't leaving an obvious bypass path uncovered.

Once that's addressed, this should be ready for merge. 🚀

)

MayurKharat0390 · 2026-06-02T13:53:38Z

Hey @sreerevanth!

I've addressed the CodeRabbit review finding by adding the mapping for Cyrillic lowercase м (\u043c -> m) to _HOMOGLYPH_MAP inside agentwatch/core/injection.py.

I also added a corresponding regression test case (reveal your proмpt) under test_injection_detector_homoglyph_bypass_prevention to verify that visual homoglyph bypasses using small м are correctly normalized and detected.

All formatting rules and unit tests pass with complete success. It's ready for another look! 🚀

MayurKharat0390 mentioned this pull request Jun 2, 2026

[BUG/SECURITY] : Security Bypass of Prompt Injection Detector via Homoglyphs #102

Open

coderabbitai Bot reviewed Jun 2, 2026

View reviewed changes

Comment thread agentwatch/core/injection.py

fix: resolve prompt injection detector homoglyph bypass (sreerevanth#102

1c84594

)

MayurKharat0390 force-pushed the fix/homoglyph-bypass branch from 0ec813f to 1c84594 Compare June 2, 2026 13:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: resolve prompt injection detector homoglyph bypass (#102)#119

fix: resolve prompt injection detector homoglyph bypass (#102)#119
MayurKharat0390 wants to merge 1 commit into
sreerevanth:mainfrom
MayurKharat0390:fix/homoglyph-bypass

MayurKharat0390 commented Jun 2, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 2, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

sreerevanth commented Jun 2, 2026

Uh oh!

MayurKharat0390 commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

MayurKharat0390 commented Jun 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Solution

Testing & Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sreerevanth commented Jun 2, 2026

Uh oh!

MayurKharat0390 commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

MayurKharat0390 commented Jun 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 2, 2026 •

edited

Loading