🛡️ Sentinel: [Security Enhancement] Harden input validation for IDs and Folder Names#151
🛡️ Sentinel: [Security Enhancement] Harden input validation for IDs and Folder Names#151
Conversation
- Add validate_resource_id for strict ID checking - Switch is_valid_folder_name to whitelist - Integrate validation in list, create, delete workflows - Add tests/test_id_validation.py Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Merging to
|
Summary of ChangesHello @abhimehro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the security posture of the Sentinel application by implementing robust input validation across various critical identifiers and user-defined names. It introduces a unified approach for validating resource IDs, strengthens folder name validation with a strict whitelist, and incorporates defense-in-depth mechanisms to verify IDs received from API responses. These changes collectively mitigate risks associated with injection attacks and ensure data integrity throughout the system. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Bandit found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
There was a problem hiding this comment.
Bandit (reported by Codacy) found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
There was a problem hiding this comment.
Pylint (reported by Codacy) found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
There was a problem hiding this comment.
Pylintpython3 (reported by Codacy) found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
|
|
||
| if len(resource_id) > 64: | ||
| if log_errors: | ||
| log.error(f"Invalid {type_name} length (max 64 chars)") |
Check warning
Code scanning / Prospector (reported by Codacy)
Use lazy % formatting in logging functions (logging-fstring-interpolation) Warning
| log.error("Invalid profile ID format (contains unsafe characters)") | ||
| elif len(profile_id) > 64: | ||
| log.error("Invalid profile ID length (max 64 chars)") | ||
| log.error(f"Invalid {type_name} format (contains unsafe characters)") |
Check warning
Code scanning / Prospector (reported by Codacy)
Use lazy % formatting in logging functions (logging-fstring-interpolation) Warning
| return str(pk) | ||
| if validate_resource_id(str(pk), "New Folder PK"): | ||
| log.info( | ||
| "Created folder %s (ID %s) [Direct]", sanitize_for_log(name), pk |
Check warning
Code scanning / Prospector (reported by Codacy)
Use lazy % formatting in logging functions (logging-fstring-interpolation) Warning
| @@ -819,12 +854,19 @@ def create_folder( | |||
|
|
|||
| for grp in groups: | |||
Check warning
Code scanning / Prospector (reported by Codacy)
Unnecessary "else" after "return" (no-else-return) Warning
| ) | ||
| return None | ||
| except Exception as e: | ||
| log.warning(f"Error fetching groups on attempt {attempt}: {e}") |
Check warning
Code scanning / Prospector (reported by Codacy)
Unnecessary "else" after "return" (no-else-return) Warning
main.py
Outdated
Check warning
Code scanning / Prospector (reported by Codacy)
Unnecessary "else" after "return" (no-else-return) Warning
| @@ -0,0 +1,44 @@ | |||
| from unittest.mock import MagicMock | |||
Check warning
Code scanning / Prospector (reported by Codacy)
Unused MagicMock imported from unittest.mock (unused-import) Warning test
There was a problem hiding this comment.
Code Review
This pull request significantly improves the security of the application by hardening input validation for resource IDs and folder names, adopting a whitelist approach, and adding defense-in-depth checks on API responses. While these are excellent practices, a subtle bypass exists in the resource ID validation logic due to the behavior of the $ anchor in Python's regex engine, which allows trailing newlines. This vulnerability requires remediation to ensure the defense-in-depth measures are robust. Additionally, I've provided a few suggestions for further code improvement, including a type hint correction, reducing code duplication, and cleaning up an unused import.
|
|
||
| def validate_profile_id(profile_id: str, log_errors: bool = True) -> bool: | ||
| if not is_valid_profile_id_format(profile_id): | ||
| if not re.match(r"^[a-zA-Z0-9_-]+$", resource_id): |
There was a problem hiding this comment.
The regular expression used for validation uses the $ anchor, which in Python's re module matches the end of the string or the position just before a newline at the end of the string. This allows an ID with a trailing newline (e.g., "abc\n") to pass validation, even though the newline character is not in the allowed character set [a-zA-Z0-9_-]. Since these IDs are used to construct API URLs (e.g., in delete_folder and list_existing_folders), an ID with a newline could lead to CRLF injection or malformed HTTP requests. It is recommended to use re.fullmatch() to ensure the entire string matches the allowed character set.
| if not re.match(r"^[a-zA-Z0-9_-]+$", resource_id): | |
| if not re.fullmatch(r"[a-zA-Z0-9_-]+", resource_id): |
| def validate_resource_id( | ||
| resource_id: str, type_name: str = "ID", log_errors: bool = True | ||
| ) -> bool: |
There was a problem hiding this comment.
The type hint for resource_id is str, but the function's logic correctly handles None and the new tests pass None as a valid input. To align the function signature with its implementation and tests, and to satisfy static type checkers, the type hint should be changed to Optional[str]. The Optional type is already imported in this file.
| def validate_resource_id( | |
| resource_id: str, type_name: str = "ID", log_errors: bool = True | |
| ) -> bool: | |
| def validate_resource_id( | |
| resource_id: Optional[str], type_name: str = "ID", log_errors: bool = True | |
| ) -> bool: |
| if isinstance(body, dict) and "group" in body and "PK" in body["group"]: | ||
| pk = body["group"]["PK"] | ||
| log.info( | ||
| "Created folder %s (ID %s) [Direct]", sanitize_for_log(name), pk | ||
| ) | ||
| return str(pk) | ||
| if validate_resource_id(str(pk), "New Folder PK"): | ||
| log.info( | ||
| "Created folder %s (ID %s) [Direct]", sanitize_for_log(name), pk | ||
| ) | ||
| return str(pk) | ||
| else: | ||
| log.error( | ||
| f"API returned unsafe PK for folder {sanitize_for_log(name)}: {sanitize_for_log(pk)}" | ||
| ) | ||
| return None |
There was a problem hiding this comment.
This block of code for validating the returned PK is substantially repeated twice more within this function (for the list-based response and the polling fallback). This duplication harms maintainability. Consider extracting this logic into a nested helper function to reduce repetition.
For example:
def _validate_and_log_pk(pk: Any, source: str) -> Optional[str]:
pk_str = str(pk)
if validate_resource_id(pk_str, f"{source} Folder PK"):
log.info(
"Created folder %s (ID %s) [%s]", sanitize_for_log(name), pk, source
)
return pk_str
else:
log.error(
f"API returned unsafe PK for folder {sanitize_for_log(name)}: {sanitize_for_log(pk)}"
)
return NoneThis helper could then be called from all three locations, simplifying the main function body.
| @@ -0,0 +1,44 @@ | |||
| from unittest.mock import MagicMock | |||
There was a problem hiding this comment.
Pull request overview
This PR hardens input validation around profile and folder identifiers and folder names, and wires those checks into folder-related API operations for better security and robustness.
Changes:
- Introduces a shared
validate_resource_idhelper used byvalidate_profile_idand folder operations to enforce a strict character whitelist and max length on IDs/PKs. - Tightens
is_valid_folder_namefrom a blacklist to a whitelist with a 64-character limit to reduce XSS/injection risk. - Integrates ID validation into
list_existing_folders,create_folder, anddelete_folderso that unsafe folder IDs from inputs or API responses are rejected and not acted upon, and adds tests validating the new ID and folder-name rules.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| main.py | Adds validate_resource_id, updates validate_profile_id, strengthens is_valid_folder_name, and validates folder PKs/IDs in list_existing_folders, create_folder, and delete_folder before using them. |
| tests/test_id_validation.py | Adds unit tests for validate_resource_id (valid/invalid formats, length, empty/None) and is_valid_folder_name (whitelist behavior and length limit). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -0,0 +1,44 @@ | |||
| from unittest.mock import MagicMock | |||
There was a problem hiding this comment.
MagicMock is imported but never used in this test module, which adds noise and may confuse readers about intended mocking. Consider removing the unused import to keep the test file minimal and focused.
| from unittest.mock import MagicMock |
| if len(resource_id) > 64: | ||
| if log_errors: | ||
| log.error(f"Invalid {type_name} length (max 64 chars)") | ||
| return False | ||
|
|
||
| def validate_profile_id(profile_id: str, log_errors: bool = True) -> bool: | ||
| if not is_valid_profile_id_format(profile_id): | ||
| if not re.match(r"^[a-zA-Z0-9_-]+$", resource_id): |
There was a problem hiding this comment.
validate_resource_id assumes resource_id is a string, but it is called with raw PK values from JSON in list_existing_folders, so if the API ever returns a non-string (e.g., numeric PK), this len()/re.match() path will raise a TypeError instead of cleanly rejecting the ID. To make this defensive validator robust against unexpected types, consider coercing resource_id to str (or explicitly checking isinstance(resource_id, str) and returning False for non-strings) before the length and regex checks.
| result = {} | ||
| for f in folders: | ||
| name = f.get("group") | ||
| pk = f.get("PK") | ||
| if name and pk: | ||
| if validate_resource_id(pk, "Folder PK", log_errors=False): | ||
| result[name.strip()] = pk | ||
| else: | ||
| log.warning( | ||
| f"Skipping folder '{sanitize_for_log(name)}' with unsafe PK: {sanitize_for_log(pk)}" | ||
| ) |
There was a problem hiding this comment.
The new defense-in-depth logic in list_existing_folders that filters out folders with unsafe PKs and logs a warning is not covered by tests; current tests only exercise validate_resource_id in isolation. Given the security focus of this change and existing coverage for other main.py behaviors, consider adding tests that simulate folder lists containing valid and invalid PKs to verify that valid entries are kept, invalid ones are skipped, and the appropriate log warnings are emitted.
| if not validate_resource_id(folder_id, "Folder ID to delete"): | ||
| log.error(f"Aborting deletion of {sanitize_for_log(name)}: Unsafe Folder ID") | ||
| return False |
There was a problem hiding this comment.
The new pre-validation in delete_folder that aborts when folder_id fails validate_resource_id is not covered by tests, so regressions here could go unnoticed (e.g., if validation is accidentally relaxed or removed). It would be helpful to add tests that call delete_folder with both valid and invalid folder IDs, asserting that invalid IDs trigger the early False return and log the expected error without invoking _api_delete.
| if validate_resource_id(str(pk), "New Folder PK"): | ||
| log.info( | ||
| "Created folder %s (ID %s) [Direct]", sanitize_for_log(name), pk | ||
| ) | ||
| return str(pk) |
There was a problem hiding this comment.
The new checks in create_folder that validate the returned PK and either return it or log an "unsafe PK" error (here and in the similar branches below for list-based and polled responses) are not covered by tests. To guard this critical security behavior, consider adding tests that exercise responses with valid PKs, responses with malformed PKs, and the polling fallback, asserting both the returned ID/None and the corresponding log messages.
…itization, add dry-run plan details Incorporates the best changes from 36 Jules PRs, addressing review feedback: Bolt (Performance) - from PR #173: - Pre-compile PROFILE_ID_PATTERN and RULE_PATTERN at module level - Use compiled patterns in is_valid_profile_id_format, validate_profile_id, and is_valid_rule - Supersedes PRs: #140, #143, #152, #155, #158, #161, #167, #170, #173 Sentinel (Security) - from PR #172 with review feedback: - Enhance sanitize_for_log to redact Basic Auth credentials in URLs - Redact sensitive query parameters (token, key, secret, password, etc.) - Handle fragment separators (#) per Gemini Code Assist review - Use [^&#\s]* pattern per Copilot reviewer suggestion - Update docstring per reviewer suggestion - Supersedes PRs: #142, #145, #148, #151, #154, #157, #160, #169, #172 Palette (UX) - from PR #174 with lint fixes: - Add print_plan_details function for dry-run visibility - Fix duplicate render_progress_bar definition bug - Supersedes PRs: #139, #141, #144, #147, #150, #153, #156, #159, #162, #165, #168, #171, #174 Also: #146, #149, #164 (parallel folder deletion) and #166 (auto-fix .env perms) are independent features not consolidated here. Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
…itization, add dry-run plan details Incorporates the best changes from 36 Jules PRs, addressing review feedback: Bolt (Performance) - from PR #173: - Pre-compile PROFILE_ID_PATTERN and RULE_PATTERN at module level - Use compiled patterns in is_valid_profile_id_format, validate_profile_id, and is_valid_rule - Supersedes PRs: #140, #143, #152, #155, #158, #161, #167, #170, #173 Sentinel (Security) - from PR #172 with review feedback: - Enhance sanitize_for_log to redact Basic Auth credentials in URLs - Redact sensitive query parameters (token, key, secret, password, etc.) - Handle fragment separators (#) per Gemini Code Assist review - Use [^&#\s]* pattern per Copilot reviewer suggestion - Update docstring per reviewer suggestion - Supersedes PRs: #142, #145, #148, #151, #154, #157, #160, #169, #172 Palette (UX) - from PR #174 with lint fixes: - Add print_plan_details function for dry-run visibility - Fix duplicate render_progress_bar definition bug - Supersedes PRs: #139, #141, #144, #147, #150, #153, #156, #159, #162, #165, #168, #171, #174 Also: #146, #149, #164 (parallel folder deletion) and #166 (auto-fix .env perms) are independent features not consolidated here. Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
🛡️ Sentinel: [Security Enhancement] Harden input validation for IDs and Folder Names
Enhancements:
validate_resource_idto enforce a strict whitelist (^[a-zA-Z0-9_-]+$) and length limit (64 chars) for Profile IDs and Folder IDs (PKs).list_existing_foldersandcreate_folderto ensure the application does not process or trust malformed IDs returned by the API (Defense in Depth).is_valid_folder_namefrom a blacklist (blocking<>) to a strict whitelist (^[a-zA-Z0-9 \-_.()\[\]{}]+$) and added a length limit. This prevents XSS and other injection attacks.folder_idindelete_folderbefore using it in API calls.Verification:
tests/test_id_validation.pycovering new logic.uv run python -m pytest), all tests passed.PR created automatically by Jules for task 6082117876812247905 started by @abhimehro