Closed
11 changes: 11 additions & 0 deletions .jules/sentinel.md
@@ -0,0 +1,11 @@
## Sentinel's Journal

This journal documents CRITICAL security learnings found during codebase audits.
Only add entries for:
- Vulnerability patterns specific to this codebase
- Security fixes with unexpected side effects
- Rejected security changes with important constraints
- Surprising security gaps in architecture
- Reusable security patterns for this project

---
13 changes: 13 additions & 0 deletions main.py
@@ -151,6 +151,19 @@
    s = str(text)
    if TOKEN and TOKEN in s:
        s = s.replace(TOKEN, "[REDACTED]")

    # Redact Basic Auth in URLs (e.g. https://user:pass@host)
    s = re.sub(r"://[^/@]+@", "://[REDACTED]@", s)

Check warning: Code scanning / Pylint and Pylintpython3 (reported by Codacy): Variable name "s" doesn't conform to snake_case naming style

    # Redact sensitive query parameters
    sensitive_keys = r"token|key|secret|password|auth|access_token|api_key"
    s = re.sub(
        r"([?&])(" + sensitive_keys + r")=[^&\s]+",
        r"\1\2=[REDACTED]",
        s,
        flags=re.IGNORECASE,
    )
Comment on lines +160 to +165


security-medium medium

The regex for redacting sensitive query parameters should also include the fragment separator #. Sensitive data like access_token is frequently passed in the URL fragment (after the # symbol) and will not be redacted by the current implementation. Additionally, the current regex [^&\s]+ doesn't handle cases where the value is empty (e.g., ?token=&...), which could lead to incomplete redaction. Consider changing + to * to correctly redact such cases.

Suggested change
    s = re.sub(
        r"([?&])(" + sensitive_keys + r")=[^&\s]+",
        r"\1\2=[REDACTED]",
        s,
        flags=re.IGNORECASE,
    )
    s = re.sub(
        r"([?&#])(" + sensitive_keys + r")=[^&\s]*",
        r"\1\2=[REDACTED]",
        s,
        flags=re.IGNORECASE,
    )
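
The two points raised in this comment (treating `#` as a separator and accepting empty values) can be sketched together as follows. This is only an illustration: the helper name `redact_query_params` and the module-level constant are assumptions and do not appear in the PR; the key list mirrors `sensitive_keys` from the diff.

```python
import re

# Mirrors the `sensitive_keys` alternation from the diff.
SENSITIVE_KEYS = r"token|key|secret|password|auth|access_token|api_key"

# `#` joins `?` and `&` as a separator, and `*` (instead of `+`)
# lets the value be empty.
_PARAM_RE = re.compile(
    r"([?&#])(" + SENSITIVE_KEYS + r")=[^&\s]*",
    flags=re.IGNORECASE,
)


def redact_query_params(text: str) -> str:
    """Hypothetical helper: redact sensitive key=value pairs in URLs."""
    return _PARAM_RE.sub(r"\1\2=[REDACTED]", text)


print(redact_query_params("https://example.com/cb#access_token=abc&state=xyz"))
# https://example.com/cb#access_token=[REDACTED]&state=xyz
print(redact_query_params("https://example.com/api?token=&id=123"))
# https://example.com/api?token=[REDACTED]&id=123
```

Note that a value in the query string still runs up to the next `&` or whitespace, so a fragment that follows it (for example `?token=abc#section`) would be swallowed by the redaction as well.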

Comment on lines +159 to +165

Copilot AI Feb 8, 2026


The query-parameter redaction pattern ...=[^&\s]+ will also consume trailing punctuation when a URL is followed by characters like ), ., ,, etc. in the surrounding log message (because those chars are not & or whitespace). That can unintentionally remove non-secret context from logs. Consider parsing URLs with urllib.parse and rewriting query values, or tightening the regex so it stops at common URL terminators (e.g., # and trailing punctuation) while preserving surrounding text.
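
A rough sketch of the `urllib.parse` route suggested here, under a few assumptions: the names `redact_urls_in_text`, `_redact_url`, `SENSITIVE`, and the set of trailing characters are all hypothetical, and only `http`/`https` URLs are located. Trailing punctuation is peeled off before parsing and appended again afterwards, so the surrounding log text is preserved.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed key set, taken from `sensitive_keys` in the diff.
SENSITIVE = frozenset(
    {"token", "key", "secret", "password", "auth", "access_token", "api_key"}
)
_URL_RE = re.compile(r"https?://\S+")
# Characters treated as log punctuation when they end a URL (an assumption).
_TRAILING = ".,;:!?)]}'\""


def _redact_url(url: str) -> str:
    try:
        parts = urlsplit(url)
    except ValueError:
        # Unparseable (e.g. malformed IPv6 host): hide the whole URL.
        return "[REDACTED URL]"
    if not parts.query:
        return url
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    redacted = [
        (k, "[REDACTED]" if k.lower() in SENSITIVE else v) for k, v in pairs
    ]
    # safe="[]" keeps the placeholder's brackets from being percent-encoded.
    return urlunsplit(parts._replace(query=urlencode(redacted, safe="[]")))


def redact_urls_in_text(text: str) -> str:
    def _sub(match: re.Match) -> str:
        raw = match.group(0)
        url = raw.rstrip(_TRAILING)
        # Re-attach whatever punctuation was stripped from the end.
        return _redact_url(url) + raw[len(url):]

    return _URL_RE.sub(_sub, text)


print(redact_urls_in_text("Fetch failed (https://example.com/a?token=abc)."))
# Fetch failed (https://example.com/a?token=[REDACTED]).
```

One trade-off of this approach is that `urlencode` re-encodes every parameter, so non-sensitive values may come back in a normalized form (for instance, a bare `flag` becomes `flag=`).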

Comment on lines +155 to +165

Copilot AI Feb 8, 2026


PR description mentions verification with a repro_log_leak.py reproduction script, but that file isn’t included in this PR (and doesn’t appear to exist in the repository root). Either add the script to the PR (if it’s intended to be versioned) or update the PR description to avoid referring to a non-existent artifact.


    # repr() safely escapes control characters (e.g., \n -> \\n, \x1b -> \\x1b)
    # This prevents log injection and terminal hijacking.
    safe = repr(s)
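
To illustrate the `repr()` step in isolation, here is a small sketch. The function name `escape_for_log` and the sample payload are made up for this example; only the use of `repr()` comes from the diff.

```python
def escape_for_log(text: object) -> str:
    """Hypothetical stand-in for the final repr() step shown in the diff."""
    return repr(str(text))


# A payload that tries to forge a second log line and emit an ANSI color code.
malicious = "ok\nERROR forged entry \x1b[31m"
safe = escape_for_log(malicious)

# The raw newline and escape byte are gone; only their escaped spellings remain.
assert "\n" not in safe
assert "\x1b" not in safe
print(safe)
# 'ok\nERROR forged entry \x1b[31m'
```

The printed value includes the surrounding quotes that `repr()` adds, which also makes leading or trailing whitespace visible in the log.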
20 changes: 20 additions & 0 deletions tests/test_log_sanitization.py
@@ -67,5 +67,25 @@
        self.assertTrue(found_sanitized, "Should find sanitized name in logs")
        self.assertFalse(found_raw, "Should not find raw name in logs")

    def test_sanitize_for_log_redacts_credentials(self):
        """Test that sanitize_for_log redacts Basic Auth and sensitive query params."""
        # Test Basic Auth
        url_with_auth = "https://user:password123@example.com/folder.json"
        sanitized = main.sanitize_for_log(url_with_auth)
        self.assertNotIn("password123", sanitized)
        self.assertIn("[REDACTED]", sanitized)

        # Test Query Params
        url_with_param = "https://example.com/folder.json?secret=mysecretkey"
        sanitized_param = main.sanitize_for_log(url_with_param)
        self.assertNotIn("mysecretkey", sanitized_param)
        self.assertIn("[REDACTED]", sanitized_param)

        # Test Case Insensitivity
        url_with_token = "https://example.com/folder.json?TOKEN=mytoken"

Check notice: Code scanning / Bandit (also reported by Codacy): Possible hardcoded password: 'https://example.com/folder.json?TOKEN=mytoken'

        sanitized_token = main.sanitize_for_log(url_with_token)
        self.assertNotIn("mytoken", sanitized_token)
        self.assertIn("[REDACTED]", sanitized_token)


medium

The current tests are a great start! To make them more robust, I'd suggest a few things:

  1. Make existing assertions more specific. For example, instead of self.assertIn("[REDACTED]"), you could check for the full redacted key-value pair like self.assertIn("secret=[REDACTED]").
  2. Add test cases for more scenarios, such as empty sensitive values and multiple sensitive parameters in one URL.

Here's a suggestion to add a couple more test cases:

Suggested change
        self.assertIn("[REDACTED]", sanitized_token)
        self.assertIn("[REDACTED]", sanitized_token)
        # Test multiple sensitive params and mixed params
        url_multi = "https://example.com/api?id=123&token=abc&name=user&api_key=def"
        sanitized_multi = main.sanitize_for_log(url_multi)
        self.assertIn("id=123", sanitized_multi)
        self.assertIn("name=user", sanitized_multi)
        self.assertNotIn("token=abc", sanitized_multi)
        self.assertNotIn("api_key=def", sanitized_multi)
        self.assertIn("token=[REDACTED]", sanitized_multi)
        self.assertIn("api_key=[REDACTED]", sanitized_multi)
        # Test empty sensitive param
        url_empty = "https://example.com/api?token=&id=123"
        sanitized_empty = main.sanitize_for_log(url_empty)
        self.assertIn("token=[REDACTED]", sanitized_empty)
        self.assertIn("id=123", sanitized_empty)
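
The more specific assertions suggested above could also be written as a table-driven test. The sketch below is self-contained, so it uses a local stand-in `sanitize_for_log` that copies the redaction steps from the diff (with `*` in place of `+` so the empty-value case passes); it is an assumption for illustration, not the real `main.sanitize_for_log`, and the class and constant names are hypothetical.

```python
import re
import unittest

SENSITIVE_KEYS = r"token|key|secret|password|auth|access_token|api_key"


def sanitize_for_log(text):
    """Stand-in mirroring the diff's redaction steps (not main.py itself)."""
    s = str(text)
    s = re.sub(r"://[^/@]+@", "://[REDACTED]@", s)
    s = re.sub(
        r"([?&])(" + SENSITIVE_KEYS + r")=[^&\s]*",
        r"\1\2=[REDACTED]",
        s,
        flags=re.IGNORECASE,
    )
    return repr(s)


# (input URL, substring that must appear, substring that must not appear)
CASES = [
    ("https://user:password123@example.com/folder.json",
     "://[REDACTED]@example.com", "password123"),
    ("https://example.com/folder.json?secret=mysecretkey",
     "secret=[REDACTED]", "mysecretkey"),
    ("https://example.com/folder.json?TOKEN=mytoken",
     "TOKEN=[REDACTED]", "mytoken"),
    ("https://example.com/api?token=&id=123",
     "token=[REDACTED]&id=123", "token=&"),
]


class RedactionTableTest(unittest.TestCase):
    def test_cases(self):
        for url, expected, forbidden in CASES:
            # subTest reports each failing row separately.
            with self.subTest(url=url):
                out = sanitize_for_log(url)
                self.assertIn(expected, out)
                self.assertNotIn(forbidden, out)
```

Saved as a module, this can be run with `python -m unittest`; each row asserts on the full redacted pair rather than on the bare `[REDACTED]` marker.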


if __name__ == '__main__':
    unittest.main()