Skip to content

fix(config): prevent .pr_agent.toml state leak between repositories#2346

Open
tharatynowicz wants to merge 2 commits intoThe-PR-Agent:mainfrom
tharatynowicz:fix/apply-repo-settings-state-leak
Open

fix(config): prevent .pr_agent.toml state leak between repositories#2346
tharatynowicz wants to merge 2 commits intoThe-PR-Agent:mainfrom
tharatynowicz:fix/apply-repo-settings-state-leak

Conversation

@tharatynowicz
Copy link
Copy Markdown

Fixes #2345.

Summary

apply_repo_settings() mutates the module-level global_settings singleton without resetting it first. When a PR comes from a repo without .pr_agent.toml, get_repo_settings() returns b"", the loader early-exits on if repo_settings:, and whatever was injected by a previous repo's .pr_agent.toml lingers in global_settings for the rest of the process. Details, root-cause analysis and a self-contained reproducer are in #2345.

Approach

  • Add reset_to_startup_defaults() in config_loader.py — lazy snapshot on first call (so any post-import startup, e.g. apply_secrets_manager_config(), is captured), full restore on subsequent calls.
  • Call it at the top of apply_repo_settings() so every per-repo load starts from a clean state.
  • Additive change: no existing call sites modified, no signatures changed, no behaviour change for repos that do have a .pr_agent.toml (other than their settings no longer carrying leftovers from a previous repo's load).

Alternative I considered was a larger per-request isolation refactor (clone global_settings per request via context["settings"] set by a FastAPI middleware). Happy to go that route instead if preferred — this PR takes the minimal path.

Tests

  • Added tests/unittest/test_apply_repo_settings.py with two tests: a sanity test that .pr_agent.toml instructions are applied, and a regression test that a repo without .pr_agent.toml does not inherit the previous repo's extra_instructions.
  • The regression test fails on main without this fix and passes with it.
  • Full existing unit suite on main + this change: 296 passed, 0 failed (the 2 new tests included).

Diff: +147 / −1 across 3 files (fix: +36/−1 in 2 files; test: +111/−0 in 1 new file).

apply_repo_settings() mutates the module-level global_settings singleton
without resetting it first. When a subsequent PR comes from a repo with
no .pr_agent.toml (get_repo_settings() returns b""), the loader early-
exits on `if repo_settings:` and sections set by a previous repo's
.pr_agent.toml persist for the lifetime of the process.

Snapshot global_settings on first call to reset_to_startup_defaults()
(lazy, so apply_secrets_manager_config() and any other post-import
startup work is captured). Restore at the top of apply_repo_settings()
before loading per-repo config.

Fixes The-PR-Agent#2345
Two tests in tests/unittest/test_apply_repo_settings.py:

- test_repo_settings_from_toml_are_applied: sanity — .pr_agent.toml
  extra_instructions are loaded when present.
- test_repo_without_toml_does_not_inherit_previous_repo_settings:
  regression — after apply_repo_settings() loads a repo with a
  .pr_agent.toml, a subsequent call for a repo without .pr_agent.toml
  must not still carry the previous repo's extra_instructions.

Both pass with the fix in the preceding commit. The regression test
fails on main without the fix.
@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

Review Summary by Qodo

Prevent .pr_agent.toml state leak between repositories

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Prevent .pr_agent.toml configuration state leaking between repositories
• Add reset_to_startup_defaults() function to restore global settings
• Call reset at start of apply_repo_settings() for clean per-repo loads
• Add regression tests verifying settings isolation between repos
Diagram
flowchart LR
  A["apply_repo_settings called"] --> B["reset_to_startup_defaults"]
  B --> C["Restore global_settings snapshot"]
  C --> D["Load repo-specific config"]
  D --> E["Clean state for next repo"]
Loading

Grey Divider

File Changes

1. pr_agent/config_loader.py 🐞 Bug fix +32/-0

Add state reset mechanism for global settings

• Import copy module for deep copying settings
• Add _STARTUP_SETTINGS_SNAPSHOT module-level variable to store initial state
• Implement reset_to_startup_defaults() function with lazy snapshot on first call
• Restore global_settings by removing injected sections and restoring pre-existing ones

pr_agent/config_loader.py


2. pr_agent/git_providers/utils.py 🐞 Bug fix +4/-1

Call reset function in apply_repo_settings

• Import reset_to_startup_defaults from config_loader
• Call reset_to_startup_defaults() at start of apply_repo_settings() function
• Add comment explaining the purpose of the reset call

pr_agent/git_providers/utils.py


3. tests/unittest/test_apply_repo_settings.py 🧪 Tests +111/-0

Add regression tests for settings state isolation

• Create new test file with TestApplyRepoSettings class
• Add test_repo_settings_from_toml_are_applied() sanity test verifying config loading
• Add test_repo_without_toml_does_not_inherit_previous_repo_settings() regression test
• Implement FakeGitProvider mock and fresh_global_settings fixture for test isolation

tests/unittest/test_apply_repo_settings.py


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects Bot commented Apr 22, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Action required

1. Auto flag wiped 🐞 Bug ≡ Correctness
Description
apply_repo_settings() now resets settings to a startup snapshot on every call, which clears
config.is_auto_command that webhook servers set before invoking PRAgent.handle_request().
Auto-command flows can then behave like manual flows (e.g., publishing temporary/progress comments)
because the flag is read later by tools but is removed by the second apply_repo_settings() call.
Code

pr_agent/git_providers/utils.py[R14-19]

def apply_repo_settings(pr_url):
    os.environ["AUTO_CAST_FOR_DYNACONF"] = "false"
+    # Reset first so that .pr_agent.toml from a previously-reviewed repo
+    # cannot leak into this call via the module-level global_settings singleton.
+    reset_to_startup_defaults()
    git_provider = get_git_provider_with_context(pr_url)
Evidence
Servers set config.is_auto_command=True and then call agent.handle_request(), but
PRAgent._handle_request() calls apply_repo_settings() again first; with the new unconditional reset,
the flag is restored to the startup snapshot (typically False) before tools check it to decide
whether to publish comments/progress output.

pr_agent/git_providers/utils.py[14-20]
pr_agent/agent/pr_agent.py[54-57]
pr_agent/servers/github_app.py[395-416]
pr_agent/servers/gitlab_webhook.py[39-59]
pr_agent/servers/bitbucket_server_webhook.py[159-217]
pr_agent/tools/pr_reviewer.py[69-75]
pr_agent/tools/pr_reviewer.py[152-154]
pr_agent/tools/pr_code_suggestions.py[105-107]
pr_agent/tools/pr_description.py[101-103]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`reset_to_startup_defaults()` is invoked unconditionally at the top of `apply_repo_settings()`. In auto-command flows, servers set `config.is_auto_command=True` and then call `PRAgent.handle_request()`, but `PRAgent._handle_request()` calls `apply_repo_settings()` again first. The second call resets settings back to the startup snapshot and clears `config.is_auto_command`, changing downstream behavior.

### Issue Context
This is a regression introduced by adding the reset call; prior behavior relied on `is_auto_command` surviving into tool execution.

### Fix Focus Areas
- pr_agent/git_providers/utils.py[14-20]
- pr_agent/agent/pr_agent.py[54-57]
- pr_agent/servers/github_app.py[395-416]
- pr_agent/servers/gitlab_webhook.py[39-59]

### Suggested fix direction
Implement a mechanism so that the second `apply_repo_settings()` call in the same logical request does **not** clear runtime flags:
- Either avoid calling `apply_repo_settings()` twice (e.g., guard in `PRAgent._handle_request()` when settings were already applied), or
- Preserve and restore key runtime flags (at least `config.is_auto_command`, and any similar request-scoped flags) across `reset_to_startup_defaults()`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Reset misses context settings 🐞 Bug ≡ Correctness
Description
reset_to_startup_defaults() mutates only global_settings, but get_settings() prefers
context["settings"] when present, so apply_repo_settings() can run against a different settings
object than the one being reset. In deployments that install a per-request settings clone (e.g.,
Bitbucket app), the leak-prevention reset may not affect the effective settings for that request.
Code

pr_agent/config_loader.py[R102-122]

+def reset_to_startup_defaults():
+    """Restore ``global_settings`` to the state captured at the first call.
+
+    Must be invoked at the top of ``apply_repo_settings()`` so that sections
+    set by a previously-reviewed repo's ``.pr_agent.toml`` do not leak into
+    subsequent PRs processed by the same Python process.
+    """
+    global _STARTUP_SETTINGS_SNAPSHOT
+    if _STARTUP_SETTINGS_SNAPSHOT is None:
+        _STARTUP_SETTINGS_SNAPSHOT = copy.deepcopy(global_settings.as_dict())
+        return  # nothing to restore on the very first call
+
+    current_sections = set(global_settings.as_dict().keys())
+    snapshot_sections = set(_STARTUP_SETTINGS_SNAPSHOT.keys())
+    # Remove sections that appeared after startup (i.e. injected by apply_repo_settings).
+    for section in current_sections - snapshot_sections:
+        global_settings.unset(section)
+    # Restore pre-existing sections to their startup values.
+    for section, contents in _STARTUP_SETTINGS_SNAPSHOT.items():
+        global_settings.unset(section)
+        global_settings.set(section, copy.deepcopy(contents), merge=False)
Evidence
get_settings() returns context["settings"] when available. Some servers explicitly set
context["settings"] to a deep-copied clone of global_settings. apply_repo_settings() reads/writes
via get_settings(), but reset_to_startup_defaults() always operates on global_settings, leaving the
context clone untouched.

pr_agent/config_loader.py[48-62]
pr_agent/config_loader.py[102-122]
pr_agent/servers/bitbucket_app.py[266-273]
pr_agent/git_providers/utils.py[14-20]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`reset_to_startup_defaults()` resets `global_settings` only, but `apply_repo_settings()` mutates `get_settings()`, which may be `context["settings"]` (a per-request clone). This can make the reset ineffective for the settings object actually used during request handling.

### Issue Context
`get_settings()` always tries `context["settings"]` first; Bitbucket app installs it as `copy.deepcopy(global_settings)`.

### Fix Focus Areas
- pr_agent/config_loader.py[48-62]
- pr_agent/config_loader.py[102-122]
- pr_agent/git_providers/utils.py[14-20]
- pr_agent/servers/bitbucket_app.py[266-273]

### Suggested fix direction
Change the reset logic to reset the *effective* settings object:
- Make `reset_to_startup_defaults()` operate on `get_settings()` (or accept a Dynaconf instance to reset), and
- Apply the startup snapshot contents into that object (context clone or global), so the reset actually affects what `apply_repo_settings()` will read/modify.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. Non-atomic global reset 🐞 Bug ☼ Reliability
Description
reset_to_startup_defaults() performs many unset/set operations on the module-level singleton, which
can race with other in-flight webhook processing and cause mid-run configuration changes. This risk
increases because the reset runs even when a repo has no .pr_agent.toml (previously that path was a
no-op).
Code

pr_agent/config_loader.py[R114-122]

+    current_sections = set(global_settings.as_dict().keys())
+    snapshot_sections = set(_STARTUP_SETTINGS_SNAPSHOT.keys())
+    # Remove sections that appeared after startup (i.e. injected by apply_repo_settings).
+    for section in current_sections - snapshot_sections:
+        global_settings.unset(section)
+    # Restore pre-existing sections to their startup values.
+    for section, contents in _STARTUP_SETTINGS_SNAPSHOT.items():
+        global_settings.unset(section)
+        global_settings.set(section, copy.deepcopy(contents), merge=False)
Evidence
The reset iterates sections and mutates global_settings repeatedly, with no synchronization. Webhook
servers schedule background tasks that can overlap across requests in the same process; those tasks
call PRAgent.handle_request() which calls apply_repo_settings(), so another request can reset global
settings while a prior request’s tools are still running and reading get_settings().

pr_agent/config_loader.py[114-122]
pr_agent/git_providers/utils.py[14-20]
pr_agent/servers/bitbucket_server_webhook.py[190-217]
pr_agent/agent/pr_agent.py[54-57]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Resetting `global_settings` via multiple `unset()`/`set()` calls is not atomic. Under concurrent webhook processing (especially with background tasks), one request can reset settings while another is executing tools that read `get_settings()`, leading to inconsistent behavior.

### Issue Context
This PR increases the frequency of global resets by running them for every `apply_repo_settings()` call, even when the repo has no `.pr_agent.toml`.

### Fix Focus Areas
- pr_agent/config_loader.py[114-122]
- pr_agent/git_providers/utils.py[14-20]
- pr_agent/servers/bitbucket_server_webhook.py[190-217]

### Suggested fix direction
Mitigate cross-request interference by either:
- Enforcing per-request settings isolation everywhere (preferred), or
- Adding a process-wide lock around reset+apply operations so `global_settings` cannot be mutated concurrently.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment on lines 14 to 19
def apply_repo_settings(pr_url):
os.environ["AUTO_CAST_FOR_DYNACONF"] = "false"
# Reset first so that .pr_agent.toml from a previously-reviewed repo
# cannot leak into this call via the module-level global_settings singleton.
reset_to_startup_defaults()
git_provider = get_git_provider_with_context(pr_url)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Auto flag wiped 🐞 Bug ≡ Correctness

apply_repo_settings() now resets settings to a startup snapshot on every call, which clears
config.is_auto_command that webhook servers set before invoking PRAgent.handle_request().
Auto-command flows can then behave like manual flows (e.g., publishing temporary/progress comments)
because the flag is read later by tools but is removed by the second apply_repo_settings() call.
Agent Prompt
### Issue description
`reset_to_startup_defaults()` is invoked unconditionally at the top of `apply_repo_settings()`. In auto-command flows, servers set `config.is_auto_command=True` and then call `PRAgent.handle_request()`, but `PRAgent._handle_request()` calls `apply_repo_settings()` again first. The second call resets settings back to the startup snapshot and clears `config.is_auto_command`, changing downstream behavior.

### Issue Context
This is a regression introduced by adding the reset call; prior behavior relied on `is_auto_command` surviving into tool execution.

### Fix Focus Areas
- pr_agent/git_providers/utils.py[14-20]
- pr_agent/agent/pr_agent.py[54-57]
- pr_agent/servers/github_app.py[395-416]
- pr_agent/servers/gitlab_webhook.py[39-59]

### Suggested fix direction
Implement a mechanism so that the second `apply_repo_settings()` call in the same logical request does **not** clear runtime flags:
- Either avoid calling `apply_repo_settings()` twice (e.g., guard in `PRAgent._handle_request()` when settings were already applied), or
- Preserve and restore key runtime flags (at least `config.is_auto_command`, and any similar request-scoped flags) across `reset_to_startup_defaults()`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment thread pr_agent/config_loader.py
Comment on lines +102 to +122
def reset_to_startup_defaults():
"""Restore ``global_settings`` to the state captured at the first call.

Must be invoked at the top of ``apply_repo_settings()`` so that sections
set by a previously-reviewed repo's ``.pr_agent.toml`` do not leak into
subsequent PRs processed by the same Python process.
"""
global _STARTUP_SETTINGS_SNAPSHOT
if _STARTUP_SETTINGS_SNAPSHOT is None:
_STARTUP_SETTINGS_SNAPSHOT = copy.deepcopy(global_settings.as_dict())
return # nothing to restore on the very first call

current_sections = set(global_settings.as_dict().keys())
snapshot_sections = set(_STARTUP_SETTINGS_SNAPSHOT.keys())
# Remove sections that appeared after startup (i.e. injected by apply_repo_settings).
for section in current_sections - snapshot_sections:
global_settings.unset(section)
# Restore pre-existing sections to their startup values.
for section, contents in _STARTUP_SETTINGS_SNAPSHOT.items():
global_settings.unset(section)
global_settings.set(section, copy.deepcopy(contents), merge=False)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Reset misses context settings 🐞 Bug ≡ Correctness

reset_to_startup_defaults() mutates only global_settings, but get_settings() prefers
context["settings"] when present, so apply_repo_settings() can run against a different settings
object than the one being reset. In deployments that install a per-request settings clone (e.g.,
Bitbucket app), the leak-prevention reset may not affect the effective settings for that request.
Agent Prompt
### Issue description
`reset_to_startup_defaults()` resets `global_settings` only, but `apply_repo_settings()` mutates `get_settings()`, which may be `context["settings"]` (a per-request clone). This can make the reset ineffective for the settings object actually used during request handling.

### Issue Context
`get_settings()` always tries `context["settings"]` first; Bitbucket app installs it as `copy.deepcopy(global_settings)`.

### Fix Focus Areas
- pr_agent/config_loader.py[48-62]
- pr_agent/config_loader.py[102-122]
- pr_agent/git_providers/utils.py[14-20]
- pr_agent/servers/bitbucket_app.py[266-273]

### Suggested fix direction
Change the reset logic to reset the *effective* settings object:
- Make `reset_to_startup_defaults()` operate on `get_settings()` (or accept a Dynaconf instance to reset), and
- Apply the startup snapshot contents into that object (context clone or global), so the reset actually affects what `apply_repo_settings()` will read/modify.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

@tharatynowicz
Copy link
Copy Markdown
Author

Thanks for the review. Going through the three findings:

1. "Auto flag wiped" ‚ confirmed bug, introduced by this PR.

The flow in bitbucket_server_webhook.py sets config.is_auto_command=True after the first apply_repo_settings(), then _process_command() calls apply_repo_settings() again per command, and my unconditional reset now wipes the flag before tools read it. Same shape exists in the GitHub/GitLab webhook paths. This needs to be addressed before merge.

2. "Reset misses context settings" ‚ architecturally valid.

In deployments that install a per-request context["settings"] clone (notably bitbucket_app.py), my reset mutates the unused global_settings while apply_repo_settings() writes into the clone. Worth noting that in those deployments the leak this PR targets doesn't occur in the first place (each request starts from a fresh clone of the current global_settings), so the reset is a no-op rather than incorrect ‚ÄĒ but making it operate on get_settings() is the cleaner design.

3. "Non-atomic global reset" ‚ theoretical in the current webhook deployments.

In single-worker async servers the event loop serializes; in multi-worker (--workers N) each process has its own singleton. The concern applies to concurrent background tasks within one process; acknowledged but I don't have a deterministic reproducer.

Two directions to fix #1 and #2

a) Minimal iteration on this PR: reset_to_startup_defaults() operates on get_settings() (addresses #2), and preserves a known set of runtime flags ‚ at least config.is_auto_command and config.is_new_pr ‚ÄĒ across the reset (addresses #1). ~15 extra lines, keeps the PR tight.

b) Larger refactor: push per-request settings isolation through RawContextMiddleware in every webhook target, so get_settings() always returns a per-request clone. Eliminates the singleton-mutation class of bug entirely and makes #3 moot. Much bigger change, touches every server entry point.

Happy to do (a) on this PR if that's acceptable, or open a separate PR for (b) if you'd prefer to go that route instead. Which direction do you prefer?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

apply_repo_settings() leaks .pr_agent.toml state across PRs from different repositories

1 participant