
fix: MPS memory reporting and cache synchronization#13287

Open
snwchd71 wants to merge 2 commits into Comfy-Org:master from snwchd71:fix/mps-memory-reporting-and-cache-sync

Conversation


@snwchd71 snwchd71 commented Apr 4, 2026

Summary

MPS memory management currently treats MPS identically to CPU, using only psutil.virtual_memory(). This causes ComfyUI to make poor memory decisions on Apple Silicon: it doesn't know how much MPS-tracked memory is actually in use, and synchronize() skips MPS entirely.

This PR:

  • Splits MPS from CPU in get_total_memory() and get_free_memory() to use torch.mps.recommended_max_memory(), torch.mps.driver_allocated_memory(), and torch.mps.current_allocated_memory() for accurate reporting
  • Adds torch.mps.synchronize() to synchronize() (was missing entirely) and before empty_cache() in soft_empty_cache()
  • Adds inter-node MPS cache flush in execution.py — when torch free memory drops below 25% of total free between node executions, flushes the MPS caching allocator to prevent memory hoarding during long denoising loops

All torch.mps.* APIs used here have been available since PyTorch 2.1 (ComfyUI's minimum requirement is 2.5). Fallbacks are provided via except Exception for safety.
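As a rough illustration of how these counters can be combined (the helper name and exact clamping below are hypothetical, not the PR's actual code), free memory can be derived from the Metal budget headroom plus torch's reclaimable cache, clamped against the system view:

```python
def mps_free_memory(recommended_max, driver_allocated, current_allocated, system_free):
    """Hypothetical sketch: derive (total_free, torch_free) from MPS counters.

    recommended_max   -> torch.mps.recommended_max_memory()
    driver_allocated  -> torch.mps.driver_allocated_memory()
    current_allocated -> torch.mps.current_allocated_memory()
    system_free       -> psutil.virtual_memory().available
    """
    # Headroom left inside the Metal budget that the driver has not claimed yet.
    budget_free = max(recommended_max - driver_allocated, 0)
    # Memory torch holds in its caching allocator but is not actively using.
    torch_free = max(driver_allocated - current_allocated, 0)
    # Clamp against the OS view so we never report more than is really free.
    total_free = min(budget_free + torch_free, system_free)
    return total_free, torch_free
```

In the real code each torch.mps call sits behind an except Exception fallback to psutil, as noted above.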

Context

  • MPS caching allocator has known memory leaks (MPS Memory Leaks across core MPS files pytorch/pytorch#164299) where the graphCache retains memory not tracked by torch.mps.current_allocated_memory()
  • During denoising (hundreds of layers × 4+ steps), the allocator accumulates freed blocks monotonically since soft_empty_cache() only fires between nodes
  • On 64GB M5 Pro, memory spirals from ~30GB to 54GB+ after 3-5 runs without these fixes
  • Complementary to fix: CPU fallback for FP8 quantization on MPS (Apple Silicon) #12378 (FP8 quantization CPU fallback) — that PR fixes FP8 dtype crashes, this PR fixes memory management. Both are needed for stable FP8 inference on MPS.

Test plan

  • Verified torch.mps.recommended_max_memory(), driver_allocated_memory(), current_allocated_memory() return correct values on M5 Pro
  • soft_empty_cache() now properly synchronizes before clearing
  • Inter-node flush triggers when memory pressure is high, not when it's low
  • No impact on CUDA/CPU code paths (MPS branches are gated by dev.type == 'mps' / mps_mode())


coderabbitai bot commented Apr 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

get_total_memory() and get_free_memory() now treat CPU and Apple MPS separately: CPU reads psutil values; MPS derives totals from torch.mps.recommended_max_memory() (with psutil fallback) and computes free from driver/current allocation (clamped/combined with psutil, with exception fallbacks). When CPUState == MPS, vram_state is overridden to set_vram_to only for LOW_VRAM/NO_VRAM; otherwise it remains SHARED. synchronize() and soft_empty_cache() call torch.mps.synchronize() in MPS mode before emptying. execution.py adds an MPS branch that calls soft_empty_cache() when torch free memory is under 25% of total.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check (✅ Passed): The title accurately summarizes the main objective: fixing MPS memory reporting and cache synchronization on Apple Silicon.
  • Description check (✅ Passed): The description provides comprehensive context about MPS memory issues, explains all changes made, and includes testing verification.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@execution.py`:
- Around line 780-787: The MPS cache flush branch is currently unreachable when
CacheType.RAM_PRESSURE is set and its threshold is inverted; change the `elif
comfy.model_management.mps_mode():` to an independent `if
comfy.model_management.mps_mode():` so the MPS path runs even when
`free_memory(...)` was called, then flip the comparison to flush when the
allocator cache is large (use `if mem_free_torch > 0 and mem_free_torch >
mem_free_total * 0.25:`) so `comfy.model_management.get_free_memory(...)`
triggers `comfy.model_management.soft_empty_cache()` only when the reusable
torch cache is high (and avoid triggering on a zero fallback).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8004979d-50a4-4946-a801-d7291c61da16

📥 Commits

Reviewing files that changed from the base of the PR and between f21f6b2 and 12dddef.

📒 Files selected for processing (2)
  • comfy/model_management.py
  • execution.py

Comment on lines 780 to +787
if self.cache_type == CacheType.RAM_PRESSURE:
    comfy.model_management.free_memory(0, None, pins_required=ram_headroom, ram_required=ram_headroom)
    comfy.memory_management.extra_ram_release(ram_headroom)
elif comfy.model_management.mps_mode():
    mem_free_total, mem_free_torch = comfy.model_management.get_free_memory(
        comfy.model_management.get_torch_device(), torch_free_too=True)
    if mem_free_torch < mem_free_total * 0.25:
        comfy.model_management.soft_empty_cache()

⚠️ Potential issue | 🟠 Major

Make the new MPS cache flush reachable and flip the threshold.

Line 783 makes this branch unreachable whenever CacheType.RAM_PRESSURE is enabled, even though the preceding free_memory(0, None, ...) call will not hit the MPS cache flush in comfy/model_management.py, Lines 717-721 because that path requires device is not None. Line 786 is also reversed: mem_free_torch is the allocator's reusable cache, so the flush should happen when that value is high, not low. As written, the new cleanup misses the hoarding case, and if the MPS stats fallback in comfy/model_management.py, Lines 1497-1505 ever returns mem_free_torch = 0, this condition will sync+empty after every node.

Suggested fix
                     if self.cache_type == CacheType.RAM_PRESSURE:
                         comfy.model_management.free_memory(0, None, pins_required=ram_headroom, ram_required=ram_headroom)
                         comfy.memory_management.extra_ram_release(ram_headroom)
-                    elif comfy.model_management.mps_mode():
+                    if comfy.model_management.mps_mode():
                         mem_free_total, mem_free_torch = comfy.model_management.get_free_memory(
                             comfy.model_management.get_torch_device(), torch_free_too=True)
-                        if mem_free_torch < mem_free_total * 0.25:
+                        if mem_free_torch > mem_free_total * 0.25:
                             comfy.model_management.soft_empty_cache()
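The flipped condition in the suggested fix can be checked in isolation as a standalone predicate (the function name is illustrative, not part of the codebase):

```python
def should_flush_mps_cache(mem_free_total, mem_free_torch):
    # Flush only when the allocator is hoarding: the reclaimable torch cache
    # exceeds 25% of total free memory. The > 0 guard skips the exception
    # fallback case where get_free_memory() reports mem_free_torch = 0.
    return mem_free_torch > 0 and mem_free_torch > mem_free_total * 0.25
```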
As per coding guidelines, `comfy/**` reviews should focus on memory management, GPU resource handling, and performance implications in hot paths.

@snwchd71 snwchd71 force-pushed the fix/mps-memory-reporting-and-cache-sync branch 2 times, most recently from fc33bfe to 9026aad on April 4, 2026 at 20:29

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
execution.py (1)

783-787: ⚠️ Potential issue | 🟠 Major

Make the MPS cache flush reachable and flip the threshold.

Line 781's free_memory(0, None, ...) cannot hit the MPS cache path in comfy.model_management.free_memory() because that branch only runs when device is not None. With elif, the new cleanup is skipped whenever CacheType.RAM_PRESSURE is active, and the comparison is backwards: mem_free_torch is the reusable allocator cache returned by get_free_memory(..., torch_free_too=True), so the hoarding case is > 25%, not < 25%. (docs.pytorch.org)

Suggested fix
                     if self.cache_type == CacheType.RAM_PRESSURE:
                         comfy.model_management.free_memory(0, None, pins_required=ram_headroom, ram_required=ram_headroom)
                         comfy.memory_management.extra_ram_release(ram_headroom)
-                    elif comfy.model_management.mps_mode():
+                    if comfy.model_management.mps_mode():
                         mem_free_total, mem_free_torch = comfy.model_management.get_free_memory(
                             comfy.model_management.get_torch_device(), torch_free_too=True)
-                        if mem_free_torch < mem_free_total * 0.25:
+                        if mem_free_torch > 0 and mem_free_torch > mem_free_total * 0.25:
                             comfy.model_management.soft_empty_cache()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution.py` around lines 783 - 787, The MPS cache flush branch is
unreachable under the current elif and uses the wrong comparison; change the
`elif comfy.model_management.mps_mode():` to an independent `if
comfy.model_management.mps_mode():` so it still runs when RAM_PRESSURE logic
executes, and invert the threshold check to `if mem_free_torch > mem_free_total
* 0.25:` (using the values returned by
`comfy.model_management.get_free_memory(comfy.model_management.get_torch_device(),
torch_free_too=True)`) and call `comfy.model_management.soft_empty_cache()` when
that condition is true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy/model_management.py`:
- Around line 447-449: The code sets VRAMState.SHARED for CPUState.MPS but later
a block force-assigns VRAMState.DISABLED for every non-CPUState.GPU backend,
which overwrites the MPS shared setting; update the logic that sets
VRAMState.DISABLED (the branch checking for non-CPUState.GPU) to skip or
preserve VRAMState when cpu_state == CPUState.MPS or when vram_state is already
VRAMState.SHARED so the MPS path can reach the shared-memory handling code
instead of being overwritten.
- Around line 912-920: The MPS memory check uses
torch.mps.recommended_max_memory() as if it were free budget; change the
calculation to subtract torch.mps.current_allocated_memory() so remaining
headroom is used. In the block guarded by vram_state == VRAMState.SHARED (where
model_size is computed from dtype_size(dtype) * parameters), compute mem_budget
as recommended_max_memory() - current_allocated_memory() (falling back to
get_free_memory(torch_dev) if the torch.mps calls fail) and then compare
model_size against mem_budget * 0.4 to decide between torch_dev and cpu_dev.
Ensure you reference torch.mps.recommended_max_memory() and
torch.mps.current_allocated_memory() and keep the existing fallback to
get_free_memory(torch_dev).

---

Duplicate comments:
In `@execution.py`:
- Around line 783-787: The MPS cache flush branch is unreachable under the
current elif and uses the wrong comparison; change the `elif
comfy.model_management.mps_mode():` to an independent `if
comfy.model_management.mps_mode():` so it still runs when RAM_PRESSURE logic
executes, and invert the threshold check to `if mem_free_torch > mem_free_total
* 0.25:` (using the values returned by
`comfy.model_management.get_free_memory(comfy.model_management.get_torch_device(),
torch_free_too=True)`) and call `comfy.model_management.soft_empty_cache()` when
that condition is true.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d2689583-3a42-4304-860f-81d539680a2d

📥 Commits

Reviewing files that changed from the base of the PR and between 12dddef and fc33bfe.

📒 Files selected for processing (2)
  • comfy/model_management.py
  • execution.py


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
comfy/model_management.py (1)

447-449: ⚠️ Potential issue | 🟠 Major

VRAMState.SHARED is immediately overwritten to DISABLED.

The new assignment at Lines 447-449 sets VRAMState.SHARED for MPS, but the existing code at Line 468-469 (if cpu_state != CPUState.GPU: vram_state = VRAMState.DISABLED) runs afterwards and overwrites it. This makes the VRAMState.SHARED branch at Lines 912-920 unreachable.

Suggested fix

Change Line 468 to only target CPU, preserving SHARED for MPS:

-if cpu_state != CPUState.GPU:
+if cpu_state == CPUState.CPU:
     vram_state = VRAMState.DISABLED
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/model_management.py` around lines 447 - 449, The VRAMState.SHARED
assignment for CPUState.MPS is being clobbered later by the unconditional check
"if cpu_state != CPUState.GPU: vram_state = VRAMState.DISABLED"; update that
later condition so it does not overwrite the MPS case — e.g., change the check
to only set DISABLED for the pure CPU path (use cpu_state == CPUState.CPU or
explicitly exclude CPUState.MPS) so VRAMState.SHARED assigned when cpu_state ==
CPUState.MPS remains effective; look for symbols cpu_state, CPUState.MPS,
VRAMState.SHARED, and the "if cpu_state != CPUState.GPU" branch to modify.
execution.py (1)

783-787: ⚠️ Potential issue | 🟠 Major

MPS cache flush branch is unreachable under RAM_PRESSURE and has inverted threshold logic.

The elif makes this branch unreachable when CacheType.RAM_PRESSURE is enabled. Additionally, the comparison at Line 786 is inverted compared to the analogous CUDA logic in free_memory() (Lines 719-721 of model_management.py), which flushes when mem_free_torch > mem_free_total * 0.25. The current < check would flush when the cache is small (not hoarding), which contradicts the stated PR goal.

Suggested fix
                     if self.cache_type == CacheType.RAM_PRESSURE:
                         comfy.model_management.free_memory(0, None, pins_required=ram_headroom, ram_required=ram_headroom)
                         comfy.memory_management.extra_ram_release(ram_headroom)
-                    elif comfy.model_management.mps_mode():
+                    if comfy.model_management.mps_mode():
                         mem_free_total, mem_free_torch = comfy.model_management.get_free_memory(
                             comfy.model_management.get_torch_device(), torch_free_too=True)
-                        if mem_free_torch < mem_free_total * 0.25:
+                        if mem_free_torch > mem_free_total * 0.25:
                             comfy.model_management.soft_empty_cache()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution.py` around lines 783 - 787, The MPS branch is currently unreachable
under CacheType.RAM_PRESSURE because it uses elif and its threshold logic is
inverted; change the `elif comfy.model_management.mps_mode()` to an independent
`if` so it runs when MPS is active even under RAM_PRESSURE, and invert the
comparison in the check that calls `comfy.model_management.soft_empty_cache()`
from `mem_free_torch < mem_free_total * 0.25` to `mem_free_torch >
mem_free_total * 0.25` so it matches the CUDA logic in `free_memory()` and only
flushes when the Torch memory is hoarding; keep the calls to
`comfy.model_management.get_free_memory()` and `get_torch_device()` unchanged.
🧹 Nitpick comments (1)
comfy/model_management.py (1)

912-920: Memory budget should account for current allocation.

Once the VRAMState.SHARED overwrite issue is fixed, this logic will become reachable. However, torch.mps.recommended_max_memory() represents total recommended budget, not remaining headroom. A device that already has 80% allocated could still pass the 40% check. Consider subtracting current allocation for accurate headroom:

Suggested improvement
     if vram_state == VRAMState.SHARED:
         model_size = dtype_size(dtype) * parameters
         try:
-            mem_budget = torch.mps.recommended_max_memory()
+            mem_budget = max(
+                torch.mps.recommended_max_memory() - torch.mps.current_allocated_memory(),
+                0,
+            )
         except Exception:
             mem_budget = get_free_memory(torch_dev)
         if model_size < mem_budget * 0.4:
             return torch_dev
         return cpu_dev
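The headroom-based device choice in the suggested improvement reduces to a small pure function (names and the 0.4 safety factor taken from the suggestion; everything else is illustrative):

```python
def pick_device_for_model(model_size, recommended_max, current_allocated):
    # Remaining headroom inside the Metal budget, never negative:
    # recommended_max ~ torch.mps.recommended_max_memory(),
    # current_allocated ~ torch.mps.current_allocated_memory().
    headroom = max(recommended_max - current_allocated, 0)
    # Keep the model on MPS only if it fits well within that headroom.
    return "mps" if model_size < headroom * 0.4 else "cpu"
```

Subtracting current allocation prevents the failure mode described above, where a device that is already 80% allocated would still pass a check against the total budget.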
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/model_management.py` around lines 912 - 920, The check using
torch.mps.recommended_max_memory() must account for already-allocated memory
before comparing to model_size; update the VRAMState.SHARED branch so mem_budget
represents remaining headroom (either by calling get_free_memory(torch_dev)
directly or by subtracting current allocation from
torch.mps.recommended_max_memory(), e.g. current_alloc =
get_used_memory(torch_dev) or similar helper and then headroom = mem_budget -
current_alloc) and then compare model_size against headroom * 0.4; modify the
block around VRAMState.SHARED (references: dtype_size, parameters,
torch.mps.recommended_max_memory, get_free_memory, torch_dev, cpu_dev) to use
that headroom value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b6fcf114-de38-481f-841d-d12e32ee26ca

📥 Commits

Reviewing files that changed from the base of the PR and between fc33bfe and 9026aad.

📒 Files selected for processing (2)
  • comfy/model_management.py
  • execution.py

@snwchd71 snwchd71 force-pushed the fix/mps-memory-reporting-and-cache-sync branch from 9026aad to 8eb4c4d on April 4, 2026 at 20:41

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
execution.py (1)

783-787: ⚠️ Potential issue | 🟠 Major

MPS cache flush logic has reachability and threshold issues.

Two concerns with this branch:

  1. Unreachable with RAM_PRESSURE: Using elif means this never executes when CacheType.RAM_PRESSURE is enabled, even though MPS users might want both behaviors.

  2. Inverted threshold: mem_free_torch represents cached/reclaimable memory in the allocator. The flush should trigger when this is high (allocator hoarding), not low. Current logic flushes when there's little to reclaim.

Additionally, the exception fallback in get_free_memory() sets mem_free_torch = 0, which would make 0 < mem_free_total * 0.25 always true, triggering flush after every node.

Suggested fix
                     if self.cache_type == CacheType.RAM_PRESSURE:
                         comfy.model_management.free_memory(0, None, pins_required=ram_headroom, ram_required=ram_headroom)
                         comfy.memory_management.extra_ram_release(ram_headroom)
-                    elif comfy.model_management.mps_mode():
+                    if comfy.model_management.mps_mode():
                         mem_free_total, mem_free_torch = comfy.model_management.get_free_memory(
                             comfy.model_management.get_torch_device(), torch_free_too=True)
-                        if mem_free_torch < mem_free_total * 0.25:
+                        if mem_free_torch > 0 and mem_free_torch > mem_free_total * 0.25:
                             comfy.model_management.soft_empty_cache()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution.py` around lines 783 - 787, The MPS branch is currently unreachable
when RAM_PRESSURE runs (because it uses elif) and uses an inverted threshold and
false-positive behavior when get_free_memory() returns 0; change the branch to
an independent if (use comfy.model_management.mps_mode()) so it can run
alongside RAM_PRESSURE, invert the comparison to flush when mem_free_torch is
high (use mem_free_torch > mem_free_total * 0.25), and guard against the
exception fallback by treating mem_free_torch == 0 as unknown/skip (only
compare/flush when mem_free_torch > 0); use the existing calls
comfy.model_management.get_free_memory(comfy.model_management.get_torch_device(),
torch_free_too=True) and call comfy.model_management.soft_empty_cache() when the
adjusted condition is met.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ceb95581-0eb8-431e-99b5-398e83b80490

📥 Commits

Reviewing files that changed from the base of the PR and between 9026aad and 8eb4c4d.

📒 Files selected for processing (2)
  • comfy/model_management.py
  • execution.py

- Split MPS from CPU in get_total_memory() and get_free_memory() to use
  torch.mps APIs (recommended_max_memory, driver_allocated_memory,
  current_allocated_memory) instead of relying solely on psutil
- Add torch.mps.synchronize() to synchronize() and soft_empty_cache()
- Add inter-node MPS cache flush when torch free memory is low
- Move MPS SHARED vram assignment before --lowvram/--novram checks so
  users can override it for large models
@snwchd71 snwchd71 force-pushed the fix/mps-memory-reporting-and-cache-sync branch from 8eb4c4d to 33b967d on April 4, 2026 at 21:05
- Add torch.mps.synchronize() to synchronize() function, mirroring
  the existing CUDA and XPU paths
- Add torch.mps.synchronize() before torch.mps.empty_cache() in
  soft_empty_cache() to ensure pending MPS operations complete before
  releasing cached memory
- Allow --lowvram and --novram flags to take effect on MPS devices.
  Previously, MPS unconditionally set vram_state to SHARED regardless
  of user flags. Now respects set_vram_to when LOW_VRAM or NO_VRAM
  is requested, while defaulting to SHARED otherwise.
