🛡️ Sentinel: [HIGH] Fix RTLO Spoofing in Folder Names #163

Closed
abhimehro wants to merge 1 commit into main from sentinel-rtlo-fix-15718768312849110796

@abhimehro
Owner

🚨 Severity: HIGH
💡 Vulnerability: Folder names allowed Unicode Bidi control characters (e.g., RTLO), enabling homograph/spoofing attacks in which the displayed name differs from the stored one: "safe\u202eexe.pdf" renders as "safefdp.exe", because RLO reverses the display order of the characters that follow it. Path separators / and \ were also allowed, which could cause confusion.
🔧 Fix: Enhanced is_valid_folder_name to explicitly block Bidi control characters and path separators.
✅ Verification: Added new test cases in tests/test_folder_validation.py which pass.
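
To make the spoof concrete, the sketch below lists the invisible Bidi controls hidden in a name. The helper name `find_bidi_controls` and the constant `BIDI_CONTROLS` are hypothetical and not part of this PR; the code relies only on the standard `unicodedata` module.

```python
import unicodedata

# The nine code points this PR blocks (U+202A-U+202E and U+2066-U+2069).
BIDI_CONTROLS = frozenset("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")


def find_bidi_controls(name: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for every Bidi control found in name."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(name)
        if ch in BIDI_CONTROLS
    ]


spoofed = "safe\u202eexe.pdf"
print(find_bidi_controls(spoofed))  # [(4, 'RIGHT-TO-LEFT OVERRIDE')]
print(ascii(spoofed))               # 'safe\u202eexe.pdf'
```

Printing with `ascii()` is a cheap way to surface the control character in logs, where the raw string would look harmless.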


PR created automatically by Jules for task 15718768312849110796 started by @abhimehro

- Block Unicode Bidi control characters (RTLO, etc.) in folder names to prevent spoofing.
- Block path separators (/ and \) to prevent potential path confusion.
- Update tests/test_folder_validation.py with new test cases.

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 5, 2026 11:02
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@trunk-io

trunk-io bot commented Feb 5, 2026

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

@github-actions github-actions bot added the python label Feb 5, 2026
@gemini-code-assist

Summary of Changes

Hello @abhimehro, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a high-severity security vulnerability by enhancing folder name validation. It prevents homograph and spoofing attacks that leverage Unicode Bidi control characters and eliminates confusion caused by path separators in folder names. These changes significantly improve the security and integrity of folder naming conventions within the system.

Highlights

  • RTLO Spoofing Prevention: Implemented explicit blocking of Unicode Bidi control characters (U+202A-U+202E, U+2066-U+2069) in the is_valid_folder_name function to prevent homograph and spoofing attacks.
  • Path Separator Blocking: Added validation to block common path separators (/ and \) in folder names to avoid confusion and potential misuse.
  • Enhanced Test Coverage: Introduced new test cases in tests/test_folder_validation.py to verify the correct blocking of both Bidi control characters and path separators.
  • Vulnerability Documentation: A new markdown file, .jules/sentinel.md, was added to document the RTLO/Bidi Spoofing vulnerability, outlining its nature, learning points, and prevention strategy.
Changelog
  • .jules/sentinel.md
    • Added a new file documenting the RTLO/Bidi Spoofing vulnerability, including its nature, learning points, and prevention strategy.
  • main.py
    • Modified the is_valid_folder_name function to include new validation rules.
    • Updated the dangerous_chars set to explicitly block path separators (/, \).
    • Introduced a new bidi_chars set to block specific Unicode Bidi control characters (U+202A-U+202E, U+2066-U+2069).
    • Updated the validation logic to check for both dangerous_chars and bidi_chars.
  • tests/test_folder_validation.py
    • Added new test cases to test_folder_name_security to cover the updated validation logic.
    • Included tests for folder names containing path separators (/, \).
    • Added tests for folder names containing Bidi control characters (\u202e, \u202a).
Activity
  • This pull request was automatically generated by Jules for a task initiated by @abhimehro.

# Case 6: Path Separators
slash_data = {"group": {"group": "Folder/Name"}}
assert main.validate_folder_data(slash_data, "http://slash.com") is False

backslash_data = {"group": {"group": "Folder\\Name"}}
assert main.validate_folder_data(backslash_data, "http://backslash.com") is False

# Case 7: Bidi Control Characters (RTLO)
# \u202e is Right-To-Left Override
rtlo_data = {"group": {"group": "SafeName\u202eexe.pdf"}}
assert main.validate_folder_data(rtlo_data, "http://rtlo.com") is False

# Case 8: Other Bidi Char
lre_data = {"group": {"group": "Name\u202a"}}
assert main.validate_folder_data(lre_data, "http://lre.com") is False

Check notice (raised on each of the four assert lines above)

Code scanning / Bandit, and Code scanning / Bandit (reported by Codacy)

Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. Note test
@@ -0,0 +1,4 @@
## 2025-02-18 - [RTLO/Bidi Spoofing in Folder Names]

Check notice

Code scanning / Remark-lint (reported by Codacy)

Warn when references to undefined definitions are found. Note

[no-undefined-references] Found reference to undefined definition

Check notice

Code scanning / Remark-lint (reported by Codacy)

Warn when shortcut reference links are used. Note

[no-shortcut-reference-link] Use the trailing [] on reference links

Copilot AI left a comment

Pull request overview

This PR fixes a HIGH severity security vulnerability: folder names could contain Unicode Bidi control characters (particularly RTLO, Right-to-Left Override) and path separators. A malicious name like "safe\u202eexe.pdf" renders as "safefdp.exe", because RLO reverses the display order of the characters that follow it, which can mislead users about file types.

Changes:

  • Enhanced is_valid_folder_name() validation to explicitly block 9 Unicode Bidi control characters (U+202A through U+202E and U+2066 through U+2069) and path separators (/ and \)
  • Added comprehensive test cases covering path separators and multiple Bidi control characters
  • Documented the vulnerability, learning, and prevention strategy in sentinel.md

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| main.py | Added blocking of path separators and Bidi control characters in folder name validation logic |
| tests/test_folder_validation.py | Added 4 new assertions (Cases 6-8) validating rejection of path separators and Bidi characters |
| .jules/sentinel.md | Created documentation entry describing the RTLO/Bidi spoofing vulnerability and mitigation |

Comment on lines +454 to +464
bidi_chars = {
    "\u202a",
    "\u202b",
    "\u202c",
    "\u202d",
    "\u202e",
    "\u2066",
    "\u2067",
    "\u2068",
    "\u2069",
}

Copilot AI Feb 5, 2026

The bidi_chars set is recreated on every function call, which is inefficient. Consider moving it to module level as a constant (e.g., BIDI_CONTROL_CHARS). dangerous_chars is also built inline and could be hoisted the same way. This avoids rebuilding the sets on each call, which matters if this validation runs frequently.

Comment on lines +451 to +464
# Block Bidi control characters to prevent RTLO spoofing
# \u202a (LRE), \u202b (RLE), \u202c (PDF), \u202d (LRO), \u202e (RLO)
# \u2066 (LRI), \u2067 (RLI), \u2068 (FSI), \u2069 (PDI)
bidi_chars = {
    "\u202a",
    "\u202b",
    "\u202c",
    "\u202d",
    "\u202e",
    "\u2066",
    "\u2067",
    "\u2068",
    "\u2069",
}

Copilot AI Feb 5, 2026

Consider also blocking U+200E (Left-to-Right Mark, LRM) and U+200F (Right-to-Left Mark, RLM) for completeness. They can also be used to manipulate text direction, though they are less severe than the override and embedding characters. Blocking them would provide defense-in-depth for folder names, where directional marks serve no legitimate purpose.
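
One way to express this broader check, sketched under assumptions: the nine controls already blocked can be recognized by their Unicode Bidi class via the standard `unicodedata.bidirectional`, while LRM and RLM must be listed by code point because their classes are the ordinary strong types "L" and "R". The names `has_direction_control`, `_BIDI_CONTROL_CLASSES`, and `_DIRECTIONAL_MARKS` are hypothetical and do not appear in main.py.

```python
import unicodedata

# Bidi classes of the embedding, override, and isolate controls
# (one code point per class: U+202A-U+202E and U+2066-U+2069).
_BIDI_CONTROL_CLASSES = frozenset(
    {"LRE", "RLE", "PDF", "LRO", "RLO", "LRI", "RLI", "FSI", "PDI"}
)

# LRM (U+200E) and RLM (U+200F) share the classes "L" and "R" with ordinary
# letters, so a class-based check cannot single them out.
_DIRECTIONAL_MARKS = frozenset("\u200e\u200f")


def has_direction_control(name: str) -> bool:
    """Return True if name contains a Bidi control or a directional mark."""
    return any(
        ch in _DIRECTIONAL_MARKS
        or unicodedata.bidirectional(ch) in _BIDI_CONTROL_CLASSES
        for ch in name
    )
```

Because only the control classes are matched, legitimate right-to-left text such as Hebrew or Arabic folder names is not rejected by this sketch.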

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a high-severity spoofing vulnerability (RTLO) in folder names by blocking Unicode Bidi control characters and path separators. The changes in main.py correctly implement the validation logic, and the accompanying tests in tests/test_folder_validation.py verify the fix. My review includes suggestions to improve performance in main.py by defining constant sets at the module level, and to enhance the test suite in tests/test_folder_validation.py for better coverage and maintainability. Specifically, I've pointed out that the test coverage for the newly blocked Bidi characters is incomplete, which is critical for a security fix. I've also suggested a small refactoring to reduce code duplication in the tests.

Comment on lines +49 to +56
# Case 7: Bidi Control Characters (RTLO)
# \u202e is Right-To-Left Override
rtlo_data = {"group": {"group": "SafeName\u202eexe.pdf"}}
assert main.validate_folder_data(rtlo_data, "http://rtlo.com") is False

# Case 8: Other Bidi Char
lre_data = {"group": {"group": "Name\u202a"}}
assert main.validate_folder_data(lre_data, "http://lre.com") is False

high

The test coverage for the Bidi control characters is incomplete. The fix in main.py blocks 9 different characters, but the tests only cover 2 of them (\u202e and \u202a). For a security-related change, it's crucial to ensure all blocked characters are explicitly tested to prevent regressions.

Suggested change
-# Case 7: Bidi Control Characters (RTLO)
-# \u202e is Right-To-Left Override
-rtlo_data = {"group": {"group": "SafeName\u202eexe.pdf"}}
-assert main.validate_folder_data(rtlo_data, "http://rtlo.com") is False
-# Case 8: Other Bidi Char
-lre_data = {"group": {"group": "Name\u202a"}}
-assert main.validate_folder_data(lre_data, "http://lre.com") is False
+# Case 7: Bidi Control Characters
+bidi_chars_to_test = {
+    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
+    "\u2066", "\u2067", "\u2068", "\u2069",
+}
+for char in bidi_chars_to_test:
+    data = {"group": {"group": f"SafeName{char}exploit.exe"}}
+    assert main.validate_folder_data(data, "http://bidi.com") is False
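
The same idea can be pushed one step further by placing each character at the start, middle, and end of a name. The sketch below is self-contained for illustration: `is_valid_folder_name` here is a simplified stand-in, not the function in main.py, and `bidi_position_cases` is a hypothetical helper.

```python
# Stand-in blocked set and validator so the sketch runs on its own;
# the real check lives in main.py and does more than this.
BLOCKED = frozenset("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")


def is_valid_folder_name(name: str) -> bool:
    return bool(name) and not any(ch in BLOCKED for ch in name)


def bidi_position_cases(chars):
    """Yield each character at the start, middle, and end of a folder name."""
    for char in sorted(chars):
        yield f"{char}Name"
        yield f"Safe{char}Name"
        yield f"Name{char}"


def test_bidi_chars_rejected_in_every_position():
    for candidate in bidi_position_cases(BLOCKED):
        # ascii() makes the invisible character visible in a failure message.
        assert not is_valid_folder_name(candidate), ascii(candidate)
```

Passing `ascii(candidate)` as the assertion message matters here, since a failing name would otherwise print with the offending character hidden.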

Comment on lines 446 to +464
dangerous_chars = set("<>\"'`")
if any(c in dangerous_chars for c in name):

# Block path separators to prevent confusion
dangerous_chars.update(["/", "\\"])

# Block Bidi control characters to prevent RTLO spoofing
# \u202a (LRE), \u202b (RLE), \u202c (PDF), \u202d (LRO), \u202e (RLO)
# \u2066 (LRI), \u2067 (RLI), \u2068 (FSI), \u2069 (PDI)
bidi_chars = {
    "\u202a",
    "\u202b",
    "\u202c",
    "\u202d",
    "\u202e",
    "\u2066",
    "\u2067",
    "\u2068",
    "\u2069",
}

medium

For performance, the dangerous_chars and bidi_chars sets are being recreated on every call to is_valid_folder_name. Since this function can be called frequently, it's more efficient to define these as module-level constants to avoid the overhead of set creation on each invocation.

For example, you could define them at the top of the file and reference them in the function:

# At module level
_DANGEROUS_FOLDER_CHARS = set("<>\"'`/\\")
_BIDI_CHARS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}

def is_valid_folder_name(name: str) -> bool:
    # ... (initial checks)
    if any(c in _DANGEROUS_FOLDER_CHARS or c in _BIDI_CHARS for c in name):
        return False
    return True

Comment on lines +43 to +47
slash_data = {"group": {"group": "Folder/Name"}}
assert main.validate_folder_data(slash_data, "http://slash.com") is False

backslash_data = {"group": {"group": "Folder\\Name"}}
assert main.validate_folder_data(backslash_data, "http://backslash.com") is False

medium

These two test cases for path separators are very similar. To reduce code duplication and improve maintainability, you can combine them into a single loop.

Suggested change
-slash_data = {"group": {"group": "Folder/Name"}}
-assert main.validate_folder_data(slash_data, "http://slash.com") is False
-backslash_data = {"group": {"group": "Folder\\Name"}}
-assert main.validate_folder_data(backslash_data, "http://backslash.com") is False
+# Case 6: Path Separators
+for separator in ("/", "\\"):
+    data = {"group": {"group": f"Folder{separator}Name"}}
+    assert main.validate_folder_data(data, "http://separator.com") is False

Copilot AI added a commit that referenced this pull request Feb 9, 2026
Security: Block Unicode Bidi control characters to prevent homograph attacks
- Block 11 Bidi control characters (LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI, LRM, RLM)
- Block path separators (/ and \) to prevent confusion
- Move dangerous character sets to module-level constants for performance
- Add comprehensive test coverage for all blocked characters
- Test characters in different positions (start, middle, end)
- Add security documentation explaining the vulnerability

Addresses feedback from PR #163

Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
abhimehro added a commit that referenced this pull request Feb 9, 2026, with the same commit message as above
@abhimehro abhimehro closed this Feb 9, 2026
@abhimehro abhimehro deleted the sentinel-rtlo-fix-15718768312849110796 branch February 9, 2026 02:18