diff --git a/examples/compact_mask/README.md b/examples/compact_mask/README.md new file mode 100644 index 0000000000..a24b240dc4 --- /dev/null +++ b/examples/compact_mask/README.md @@ -0,0 +1,591 @@ +# CompactMask — Memory-Efficient Mask Storage + +This example benchmarks `CompactMask`, a new mask representation introduced in `supervision` that replaces dense `(N, H, W)` boolean arrays with a crop-scoped Run-Length Encoding (RLE). The benchmark demonstrates full API compatibility, massive memory savings, and order-of-magnitude annotation speedups — with no change to your existing `Detections` code. + +--- + +## The Problem + +Instance segmentation models return one boolean mask per detected object. `supervision` stores these as a stacked `(N, H, W)` numpy array. + +For a 4K image with 1 000 detected objects: + +``` +1 000 x 3840 x 2160 x 1 byte = 8.3 GB +``` + +At this scale, typical pipelines crash with `MemoryError` before a single frame is annotated. Aerial imagery, satellite tiles, and high-density crowd scenes all hit this wall. + +--- + +## The Solution — Crop-RLE Storage + +`CompactMask` stores each mask as a run-length encoding of its **bounding-box crop** rather than the full image canvas. + +``` +dense (N,H,W) mask → N x crop_RLE + N x (x1,y1) offset +8.3 GB → ~280 KB +``` + +The bounding boxes are already present in `Detections.xyxy`, so no extra metadata is required from the caller. 
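The encode/decode round trip can be sketched in a few lines. This is a hypothetical helper pair — not the library's API — assuming the convention used throughout this document: the RLE alternates False/True run lengths over the row-major flattened crop, starting with a (possibly zero-length) False run so that True-pixel runs always sit at odd indices.

```python
import numpy as np


def encode_crop_rle(mask, x1, y1, x2, y2):
    """Encode the bbox crop of a full-frame bool mask as row-major RLE.

    Returns (rle, crop_shape, offset). Runs alternate False/True over the
    flattened crop, starting with a (possibly empty) False run, so
    True-pixel runs always occupy odd indices.
    """
    flat = mask[y1:y2, x1:x2].ravel()
    # Indices where the flattened crop changes value, plus start and end.
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    runs = np.diff(bounds)
    if flat.size and flat[0]:  # force runs[0] to be a False run
        runs = np.concatenate(([0], runs))
    return runs.astype(np.int32), (y2 - y1, x2 - x1), (x1, y1)


def decode_crop_rle(rle, crop_shape):
    """Expand the RLE back into a (crop_h, crop_w) bool crop."""
    flat = np.zeros(int(np.sum(rle)), dtype=bool)
    ends = np.concatenate(([0], np.cumsum(rle)))
    for k in range(1, len(rle), 2):  # odd indices hold True runs
        flat[ends[k] : ends[k + 1]] = True
    return flat.reshape(crop_shape)
```

Starting every RLE with a False run is what makes `rle[1::2]` always address the True runs — the trick the `.area` discussion below relies on.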
### Theoretical analysis (4K scene, 80x80 px objects, ~65% fill per bbox)

Assumptions used throughout the PR design analysis:

| Parameter | Value |
| ---------------------- | ------------------------ |
| Image size | 4K — 3840x2160 = 8.29 MP |
| Avg bounding box | 80x80 px = 6 400 px² |
| Fill ratio within bbox | ~65% |
| Avg contour vertices | ~400 pts |
| Avg RLE runs / mask | ~240 (3 runs x 80 rows) |

#### Space comparison

| Format | Per object | N=100 | N=1 000 | vs Dense |
| ------------------- | -------------- | ------ | ---------- | --------- |
| **Dense** (current) | 8.29 MB | 829 MB | **8.3 GB** | 1x |
| Local Crop + Offset | 6.4 KB | 640 KB | 6.4 MB | 1 300x |
| **Crop-RLE** ✓ | ~2 KB | 200 KB | **2 MB** | 4 000x |
| Polygon ⚠ lossy | ~3.2 KB | 320 KB | 3.2 MB | 2 600x |
| memmap | 8.29 MB (disk) | 829 MB | 8.3 GB | 1x (disk) |

Crop-RLE beats Local Crop because it only encodes actual pixel runs, skipping the ~35% background pixels within each bounding box.

#### Encode time: dense array → format

| Format | Complexity | N=10 | N=100 | N=1 000 |
| ------------------- | --------------------------------- | ------- | ------- | --------- |
| Local Crop + Offset | O(A) — strided slice from xyxy | ~0.1 ms | ~1 ms | ~10 ms |
| **Crop RLE** | O(A) — scan crop rows for runs | ~0.2 ms | ~2 ms | ~20 ms |
| Polygon | O(P) — `cv2.findContours` on crop | ~2 ms | ~20 ms | ~200 ms |
| memmap | O(I) — write 8.29 MB to disk | ~80 ms | ~800 ms | ~8 000 ms |

#### Decode time: format → full (H, W) mask

Required by `MaskAnnotator`, `mask_iou_batch`, `merge()`, etc. Dominant cost at 4K is **allocating and zeroing an 8.29 MB array**, which is identical across all in-memory formats once full materialisation is needed.
+ +| Format | N=10 | N=100 | N=1 000 | +| --------------------- | ------ | ------- | --------- | +| Local Crop / Crop RLE | ~3 ms | ~30 ms | ~300 ms | +| Polygon | ~5 ms | ~50 ms | ~500 ms | +| memmap | ~80 ms | ~800 ms | ~8 000 ms | + +#### Decode time: crop-only path (optimised) + +When callers need only the bounding-box region — `MaskAnnotator` crop-paint path, `.area`, `contains_holes`, `filter_segments_by_distance`: + +| Format | Complexity | N=10 | N=100 | N=1 000 | +| ------------------- | -------------------------------- | -------- | ------- | --------- | +| Local Crop + Offset | O(1) — already stored | ~0 ms | ~0 ms | ~0 ms | +| **Crop RLE** ✓ | O(A) — expand ~240 runs | ~0.02 ms | ~0.2 ms | ~2 ms | +| Polygon | O(A) — `fillPoly` on crop canvas | ~2 ms | ~20 ms | ~200 ms | +| memmap | N/A — always full-size | ~80 ms | ~800 ms | ~8 000 ms | + +Crop RLE's `.crop()` method powers the `MaskAnnotator` optimisation — it never allocates the full image canvas, which is the entire source of the annotation speedup. + +#### IoU / NMS at 1 % bbox overlap rate (sparse aerial scene) + +| Format | Strategy | N=1 000 | +| ------------------- | ------------------------------------- | ---------- | +| Dense (current) | All pairs, 640² pixel AND | ~10 000 ms | +| Local Crop + Offset | Bbox pre-filter → pixel IoU | **~5 ms** | +| Crop RLE | Bbox pre-filter → expand intersection | **~15 ms** | + +At N=1 000 with 1 % overlap, bbox pre-filter reduces 499 500 candidate pairs to ~5 000 overlapping pairs — a ~2 000x reduction in pixel-level work. + +--- + +## Why Crop-RLE Was Chosen over Local Crop + +Both formats compress extremely well; the deciding factors for Crop-RLE are: + +1. **~3x smaller** for masks that are themselves sparse within their bounding box. +2. **COCO RLE interop path** — row-major crop RLE can be re-encoded to column-major full-image RLE for `pycocotools` if needed. +3. `.area` computed directly from run lengths — no materialisation, no allocation. 
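Point 3 is easy to sanity-check with a toy run-length layout (illustrative values only, following the convention that True-pixel runs sit at odd indices):

```python
import numpy as np

# Toy RLE for one crop: alternating False/True run lengths over a 16-px
# crop — 2 off, 3 on, 4 off, 5 on, 2 off. True runs are at odd indices.
rle = np.array([2, 3, 4, 5, 2], dtype=np.int32)

# Area = sum of the True runs only; no pixel grid is ever allocated.
area = int(np.sum(rle[1::2]))  # 3 + 5 = 8
```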
The main trade-off: crop-only decode is O(A) rather than O(1). For the common solid-fill segmentation mask this is negligible (\<0.1 ms per mask).

---

## Operation-by-Operation Speedup Analysis

This section walks through every `Detections` operation that touches masks and shows exactly why `CompactMask` is faster. All code snippets are taken from the actual implementation. Numbers use the **FHD-200-50%-v600** scenario unless noted (1920 x 1080 image, 200 detections, masks collectively filling ~50% of the frame, 600-vertex polygons — a realistic hard case with dense fill and complex object boundaries).

At 50% total fill on an FHD image, the 600-vertex polygon boundaries are highly jagged, producing many RLE runs per row.

---

### Memory

Dense stores one full-resolution bool array per mask:

```
N x H x W x 1 byte
200 x 1080 x 1920 x 1 = 414 MB
```

Compact stores three lightweight structures:

```python
self._rles: list[npt.NDArray[np.int32]] # N Python references to small int32 arrays
self._crop_shapes: npt.NDArray[np.int32] # (N, 2) — crop (h, w) per mask
self._offsets: npt.NDArray[np.int32] # (N, 2) — (x1, y1) origin per mask
```

Per-mask RLE size at 50% fill with 600-vertex polygons: ~4.7 KB (933 KB / 200). Per-mask dense size: 1920 x 1080 x 1 = 2.1 MB. Per-mask ratio: 2.1 MB / 4.7 KB = **~445x**.

Scaled to N=200: 200 x 4.7 KB = ~933 KB of RLE data, plus `_crop_shapes` (1.6 KB) and `_offsets` (1.6 KB). Python list + array object overhead roughly doubles the footprint for small N.

| Component | Dense | Compact | Ratio |
| --------------- | ---------- | ----------- | --------- |
| Mask data | 414 MB | ~933 KB | ~445x |
| Python overhead | negligible | ~933 KB | -- |
| **Total** | **414 MB** | **~1.9 MB** | **~392x** |

At 5% fill with 8-vertex polygons, the ratio reaches 10 000x–20 000x because crops are tiny and RLEs are extremely short. The benchmark's 4K-200-5%-v8 scenario measures 21 786x (theory) / ~6 000x (malloc).
The SAT-200-5%-v8 scenario reaches 62 968x theoretical. + +--- + +### `.area` + +Dense `Detections.area` reads every pixel of every mask: + +```python +# detection/core.py — dense path +return np.array([np.sum(mask) for mask in self.mask]) +# N masks x H x W boolean sums = 200 x 2.1 M = 420 million reads +``` + +Compact delegates to `_rle_area`, which sums only the odd-indexed run lengths (the True-pixel runs) in each RLE: + +```python +# detection/compact_mask.py — _rle_area +return int(np.sum(rle[1::2])) +``` + +```python +# detection/compact_mask.py — CompactMask.area +return np.array([_rle_area(r) for r in self._rles], dtype=np.int64) +``` + +At FHD-200-50%-v600, dense `.area` takes 84.66 ms; compact takes 0.48 ms — a **71x speedup**. At SAT-200-20%-v128 the measured speedup reaches **1 204x** because the dense array is 13.4 GB and each sum must scan the entire canvas. + +| Factor | Reduction | +| ---------------------------------- | ----------- | +| RLE sums vs full-frame pixel reads | ~4 600x | +| int32 arithmetic vs bool reduction | ~2x | +| No (H, W) allocation per mask | latency | +| **Combined** | **~1 000x** | + +--- + +### `filter` / `__getitem__` (boolean index) + +Dense: `masks[bool_array]` triggers NumPy fancy indexing, which allocates a new `(K, H, W)` bool array and copies K full frames: + +```python +# detection/core.py — Detections.__getitem__ +mask = (self.mask[index] if self.mask is not None else None,) +# For dense ndarray, numpy allocates (K, 2160, 3840) and memcpy's K frames +``` + +Compact `CompactMask.__getitem__` converts the boolean index to integer positions and builds a new `CompactMask` from Python list indexing and NumPy fancy indexing on small `(N, 2)` arrays: + +```python +# detection/compact_mask.py — CompactMask.__getitem__ +if isinstance(index, np.ndarray) and index.dtype == bool: + idx_arr = np.where(index)[0] +# ... 
+new_rles = [self._rles[int(i)] for i in idx_arr] +new_crop_shapes: npt.NDArray[np.int32] = self._crop_shapes[idx_arr] +new_offsets: npt.NDArray[np.int32] = self._offsets[idx_arr] +return CompactMask(new_rles, new_crop_shapes, new_offsets, self._image_shape) +``` + +At FHD-200-50%-v600, dense `filter` takes 14.56 ms; compact takes 0.03 ms — a **500x speedup**. At SAT-200-20%-v128 the speedup reaches **14 757x**. + +| | Dense | Compact | +| ----------- | ----------------------- | ----------------------------------- | +| Data copied | K x H x W (full frames) | K Python references + K x 8 bytes | +| Allocation | new `(K, H, W)` array | new `CompactMask` shell (~trivial) | +| **Speedup** | | **hundreds to tens of thousands x** | + +--- + +### `annotate` (`MaskAnnotator`) + +Dense: for each mask, `MaskAnnotator` indexes the full `(H, W)` array and applies a boolean mask across the entire scene: + +```python +# annotators/core.py — dense path +mask = np.asarray(detections.mask[detection_idx], dtype=bool) +colored_mask[mask] = color.as_bgr() +``` + +Each `detections.mask[detection_idx]` for a dense array yields a full `(H, W)` view, and the boolean indexing scans all pixels. + +Compact: the annotator detects `CompactMask` and paints only the crop region: + +```python +# annotators/core.py — compact path +x1 = int(compact_mask.offsets[detection_idx, 0]) +y1 = int(compact_mask.offsets[detection_idx, 1]) +crop_m = compact_mask.crop(detection_idx) +crop_h, crop_w = crop_m.shape +colored_mask[y1 : y1 + crop_h, x1 : x1 + crop_w][crop_m] = color.as_bgr() +``` + +`compact_mask.crop()` decodes the RLE into a `(crop_h, crop_w)` array. At FHD-200-50%-v600, dense `annotate` takes 848.95 ms; compact takes 32.67 ms — a **22x speedup**. At SAT-200-20%-v128 the speedup reaches **89x**. 
+ +| Factor | Reduction | +| -------------------------------------------------- | ------------------- | +| Crop decode vs full-frame boolean index (per mask) | crop-size dependent | +| No full `(H, W)` allocation per integer index | latency | +| x N masks | compounds | +| **Combined** | **~26 – 400x** | + +--- + +### IoU (`mask_iou_batch` / `compact_mask_iou_batch`) + +Dense `mask_iou_batch` on N=200, FHD: + +```python +# detection/utils/iou_and_nms.py — _mask_iou_batch_split +intersection_area = np.logical_and(masks_true[:, None], masks_detection).sum( + axis=(2, 3) +) +# shape (200, 200, 1080, 1920) — ~80 billion boolean ops +# .sum(axis=(2,3)) for intersection counts +# memory_limit splits this into chunks capped at 5 GB scratch +``` + +Compact `compact_mask_iou_batch` — three layered optimisations: + +**1. Vectorised bbox pre-filter — O(N²) array ops, zero decoding** + +```python +ix1: npt.NDArray[np.int32] = np.maximum(x1a[:, None], x1b[None, :]) +iy1: npt.NDArray[np.int32] = np.maximum(y1a[:, None], y1b[None, :]) +ix2: npt.NDArray[np.int32] = np.minimum(x2a[:, None], x2b[None, :]) +iy2: npt.NDArray[np.int32] = np.minimum(y2a[:, None], y2b[None, :]) +bbox_overlap: npt.NDArray[np.bool_] = (ix1 <= ix2) & (iy1 <= iy2) +``` + +At 5% fill, two random masks overlap with probability ~4%. ~96% of the N² pairs get IoU = 0 for free — no pixel work at all. + +**2. Sub-crop decode — compare only the intersection region** + +```python +ox_a, oy_a = int(x1a[i]), int(y1a[i]) +sub_a = crops_a[i][ly1 - oy_a : ly2 - oy_a + 1, lx1 - ox_a : lx2 - ox_a + 1] + +ox_b, oy_b = int(x1b[j]), int(y1b[j]) +sub_b = crops_b[j][ly1 - oy_b : ly2 - oy_b + 1, lx1 - ox_b : lx2 - ox_b + 1] + +inter = int(np.logical_and(sub_a, sub_b).sum()) +``` + +The intersection sub-region of two overlapping crops is typically far smaller than the full frame. + +**3. 
Crop caching — each mask decoded at most once** + +```python +if i not in crops_a: + crops_a[i] = masks_true.crop(i) +``` + +Area is obtained from `_rle_area` (sum odd-indexed runs), never touching the pixel grid: + +```python +areas_a: npt.NDArray[np.int64] = masks_true.area +``` + +At FHD-200-50%-v600, dense IoU takes 23 915 ms; compact takes 51.58 ms — a **446x speedup**. At 5% fill / sparse scenarios the speedup is even larger because fewer bbox pairs overlap. + +| Factor | Reduction | +| ------------------------------------ | --------------- | +| Bbox pre-filter at sparse fill | 25x | +| Sub-crop vs full frame per pair | ~200x | +| Area from RLE, not `sum(axis=(1,2))` | ~10x | +| No 5 GB scratch allocation | latency | +| **Combined** | **~100 – 500x** | + +At 20% fill the gaps close — more pairs overlap, larger crops — speedup drops toward the lower end of the range. + +--- + +### NMS (`mask_non_max_suppression`) + +Both dense and compact paths now call `mask_iou_batch(masks, masks)` directly, computing exact mask IoU on the original (unresized) masks. There is no intermediate resize step. + +```python +# detection/utils/iou_and_nms.py — NMS (both paths) +ious = mask_iou_batch(masks, masks, overlap_metric) +``` + +`mask_iou_batch` dispatches internally: when passed a `CompactMask` it calls `compact_mask_iou_batch`, applying all three IoU optimisations (bbox pre-filter, sub-crop decode, crop caching). When passed a dense ndarray it runs the chunked pixel-AND path. + +All three IoU optimisations apply to the compact path: + +| Factor | Reduction | +| ------------------------------------- | ---------------------------- | +| Bbox pre-filter eliminates most pairs | 25x at sparse fill | +| Sub-crop decode for remaining pairs | ~200x | +| Area from RLE, not pixel sum | ~10x | +| **Combined** | **same as IoU: ~100 – 500x** | + +At FHD-200-50%-v600, dense NMS takes 5 231 ms; compact takes 48.15 ms — a **481x speedup**. 
Dense IoU/NMS is skipped for scenarios above 1 GB (4K-200 and SAT-200 tiers); compact NMS still runs on those. + +--- + +### `merge` (`Detections.merge`) + +Dense: `np.vstack` allocates a new `(N1+N2, H, W)` array and copies both halves: + +```python +# detection/core.py — dense merge path +return np.vstack([np.asarray(m) for m in masks]) +# Merging two 100-mask sets at FHD: 2 x 100 x 2.1 MB = 414 MB copied +``` + +Compact: `CompactMask.merge` extends a Python list and concatenates two small int32 arrays: + +```python +# detection/compact_mask.py — CompactMask.merge +new_rles: list[npt.NDArray[np.int32]] = [] +for m in masks_list: + new_rles.extend(m._rles) + +new_crop_shapes: npt.NDArray[np.int32] = np.concatenate( + [m._crop_shapes for m in masks_list], axis=0 +) +new_offsets: npt.NDArray[np.int32] = np.concatenate( + [m._offsets for m in masks_list], axis=0 +) +``` + +`list.extend` copies N reference pointers. `np.concatenate` on `(N, 2)` int32 arrays copies N x 8 bytes per array. + +At FHD-200-50%-v600, dense merge takes 29.71 ms; compact takes 0.03 ms — a **929x speedup**. At SAT-200-20%-v128 the speedup reaches **89 046x**. + +| | Dense | Compact | +| ----------- | ----------------------- | -------------------------- | +| Data moved | N x H x W (full frames) | N references + N x 8 bytes | +| Allocation | new `(N, H, W)` array | new `CompactMask` shell | +| **Speedup** | | **effectively free** | + +**Note:** `Detections.merge` calls `is_empty()` on each input. Before the `len(xyxy) > 0` short-circuit was added, `is_empty()` invoked `__eq__` which called `np.array_equal(self.to_dense(), ...)` — materialising the entire `(N, H, W)` CompactMask to dense just to check emptiness. The fix: + +```python +# detection/core.py — Detections.is_empty (fixed) +if len(self.xyxy) > 0: + return False +``` + +This O(1) check avoids the O(N x H x W) dense materialisation that previously dominated compact merge time. 
+ +--- + +### `offset` / `with_offset` (`InferenceSlicer` tile stitching) + +Dense `move_masks`: allocates a new `(N, new_H, new_W)` array and copies each mask with shifted slice coordinates — O(N x H x W): + +```python +# detection/utils/masks.py — move_masks +mask_array = np.full((masks.shape[0], resolution_wh[1], resolution_wh[0]), False) +# ... source/destination slicing logic ... +mask_array[:, dst_y1:dst_y2, dst_x1:dst_x2] = masks[:, src_y1:src_y2, src_x1:src_x2] +``` + +Compact `with_offset(dx, dy)`: vectorised bounds check first. All new bounding-box positions are computed in a single numpy op. When none overflow the new canvas — the common case in `InferenceSlicer` — the RLE data is not touched at all: + +```python +# detection/compact_mask.py — CompactMask.with_offset (fast path) +new_offsets = self._offsets + np.array([dx, dy], dtype=np.int32) # O(N) numpy +needs_clip = (x1s < 0) | (y1s < 0) | (x2s >= new_w) | (y2s >= new_h) +if not needs_clip.any(): + return CompactMask( + list(self._rles), self._crop_shapes.copy(), new_offsets, new_image_shape + ) +``` + +When a crop does overflow (e.g. object at a tile edge), only that crop is decoded, sliced, and re-encoded. Masks fully outside bounds get a 1x1 all-False stub without any decoding. + +At FHD-200-50%-v600, dense offset takes 42.30 ms; compact takes 0.02 ms — a **2 016x speedup**. At SAT-200-20%-v128 the speedup reaches **290 779x**. 
+ +| | Dense | Compact (no-clip fast path) | +| ----------------- | -------------------------------------- | ------------------------------------ | +| Work per mask | allocate `(new_H, new_W)` + copy H x W | add scalar to offset row — O(1) | +| N=200 at FHD | 200 x 2.1 MB = **414 MB** alloc + copy | two numpy ops on `(N, 2)` int32 | +| Output allocation | new `(N, new_H, new_W)` | shared RLE list + new `(N, 2)` array | +| **Speedup** | | **effectively free (>1 000x)** | + +In the `InferenceSlicer` pipeline the canvas is always expanded by the tile offset, so no crop ever overflows — the fast path is always taken. Clipping only activates for objects that genuinely straddle the image boundary. + +--- + +### `centroids` (`calculate_masks_centroids`) + +Dense: `np.tensordot` reads every pixel of every mask to compute weighted coordinate sums: + +```python +# detection/utils/masks.py — dense centroid path +vertical_indices, horizontal_indices = np.indices((height, width)) + 0.5 +# np.tensordot(masks, indices, axes=([1, 2], [0, 1])) +# reads all N x H x W values +``` + +Compact: per-crop loop decodes only the bounding-box region and computes centroids within that crop: + +```python +# detection/utils/masks.py — compact centroid path +crop = masks.crop(i) +crop_h, crop_w = crop.shape +x1 = int(masks.offsets[i, 0]) +y1 = int(masks.offsets[i, 1]) +# ... +crop_rows, crop_cols = np.indices((crop_h, crop_w)) +cx = float(np.sum((crop_cols + 0.5)[crop])) / total + x1 +cy = float(np.sum((crop_rows + 0.5)[crop])) / total + y1 +``` + +At FHD-200-50%-v600, dense centroids takes 1 133.68 ms; compact takes 60.39 ms — a **13x speedup**. At SAT-200-20%-v128 the speedup reaches **857x** because the dense path must allocate and scan a 13.4 GB array. 
+ +| Factor | Reduction | +| ----------------------------------------- | ------------------- | +| Crop area vs full frame (per mask) | fill-dependent | +| No global `np.indices((H, W))` allocation | saves large float64 | +| **Combined (N=200)** | **~19 – 1 000x** | + +--- + +### Summary + +Measured speedups at the **FHD-200-50%-v600** operating point (dense fill, complex polygons — a realistic hard case). Dense baseline = 1x. + +| Operation | Dense cost | Compact cost | Speedup | +| ---------------- | ----------- | ------------ | ------- | +| Memory | 414 MB | ~1.9 MB | ~392x | +| `.area` | 84.66 ms | 0.48 ms | 71x | +| `filter` | 14.56 ms | 0.03 ms | 500x | +| `annotate` | 848.95 ms | 32.67 ms | 22x | +| `mask_iou_batch` | 23 915 ms | 51.58 ms | 446x | +| NMS | 5 231 ms | 48.15 ms | 481x | +| `merge` | 29.71 ms | 0.03 ms | 929x | +| `with_offset` | 42.30 ms | 0.02 ms | 2 016x | +| `centroids` | 1 133.68 ms | 60.39 ms | 13x | + +All speedups are larger at sparser fill fractions and larger resolutions. At SAT-200-20%-v128, `.area` reaches 1 204x and `merge` reaches 89 046x. At the sparsest scenarios (5% fill, 8-vertex polygons), memory ratios exceed 60 000x. 
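The bbox pre-filter behind the IoU and NMS rows can be reproduced in isolation. This is a standalone sketch using the same broadcasting pattern as the excerpt in the IoU section, not the library code:

```python
import numpy as np


def bbox_overlap_matrix(xyxy_a, xyxy_b):
    """(N, M) bbox-intersection test — pure array ops, no mask pixels.

    Pairs where this returns False are guaranteed mask IoU == 0, so
    pixel-level comparison only runs on the surviving pairs.
    """
    ix1 = np.maximum(xyxy_a[:, None, 0], xyxy_b[None, :, 0])
    iy1 = np.maximum(xyxy_a[:, None, 1], xyxy_b[None, :, 1])
    ix2 = np.minimum(xyxy_a[:, None, 2], xyxy_b[None, :, 2])
    iy2 = np.minimum(xyxy_a[:, None, 3], xyxy_b[None, :, 3])
    return (ix1 <= ix2) & (iy1 <= iy2)
```

At sparse fill this matrix is almost entirely False, which is why the speedups above grow as fill fraction drops.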
+ +--- + +## Drop-In Compatibility + +`CompactMask` implements the same duck-typed interface as `np.ndarray`: + +```python +import supervision as sv +from supervision.detection.compact_mask import CompactMask + +# Build from an existing dense (N, H, W) bool array: +compact = CompactMask.from_dense(masks_dense, xyxy, image_shape=(H, W)) + +# Use exactly like a dense mask — no other code changes needed: +detections = sv.Detections(xyxy=xyxy, mask=compact, class_id=class_ids) + +# Filtering, merging, area — all work transparently: +filtered = detections[confidence > 0.5] +areas = detections.area # RLE sum, no materialisation +merged = sv.Detections.merge([det_a, det_b]) + +# MaskAnnotator works without any change: +annotated = sv.MaskAnnotator().annotate(frame, detections) + +# Materialise back to dense when you need raw numpy: +dense_again = compact.to_dense() # (N, H, W) bool +``` + +Supported indexing patterns: + +| Expression | Returns | +| ------------------ | ---------------------------- | +| `mask[i]` (int) | Dense `(H, W)` bool array | +| `mask[bool_array]` | New `CompactMask` (filtered) | +| `mask[slice]` | New `CompactMask` | +| `np.asarray(mask)` | Dense `(N, H, W)` bool array | + +--- + +## Benchmark + +Run on any machine — no GPU or real model required: + +```bash +uv run python examples/compact_mask/benchmark.py +``` + +Six image tiers x three fill fractions (5 / 20 / 50 %) x three vertex counts (8 / 128 / 600): + +| Tier | Resolution | Objects | Dense array | Notes | +| ------- | ---------- | ------- | ----------- | ------------------------------------ | +| FHD-100 | 1920x1080 | 100 | 0.21 GB | Full operations including IoU+NMS | +| FHD-200 | 1920x1080 | 200 | 0.41 GB | Full operations including IoU+NMS | +| FHD-400 | 1920x1080 | 400 | 0.83 GB | Full operations including IoU+NMS | +| 4K-100 | 3840x2160 | 100 | 0.83 GB | Full operations including IoU+NMS | +| 4K-200 | 3840x2160 | 200 | 1.66 GB | Dense IoU+NMS skipped (array > 1 GB) | +| SAT-200 | 
8192x8192 | 200 | 13.4 GB | Dense IoU+NMS skipped (array > 1 GB) | + +Dense timing is skipped automatically when the dense IoU/NMS array would exceed 1 GB (`IOU_DENSE_SKIP_GB`), preventing swap thrashing. All dense ops are skipped above 16 GB (`DENSE_SKIP_GB`); no scenario in the current matrix reaches that threshold. Memory is always reported as theoretical `NxHxW` bytes. + +### Sample results (macOS, Apple M4 Max, REPS=4) + +| Scenario | Dense mem | Compact theor. | Mem x | Area x | Filter x | Annot x | IoU x | NMS x | Merge x | Offset x | Centroids x | +| ---------------- | --------- | -------------- | ------- | ------ | -------- | ------- | ----- | ----- | -------- | -------- | ----------- | +| FHD-100-5%-v8 | 207 MB | 28 KB | 7 418x | — | — | — | — | — | — | — | — | +| FHD-100-50%-v600 | 207 MB | 913 KB | 227x | — | — | — | — | — | — | — | — | +| FHD-200-50%-v600 | 415 MB | 933 KB | 445x | 71x | 500x | 22x | 446x | 481x | 929x | 2 016x | 13x | +| FHD-400-5%-v8 | 829 MB | 60 KB | 13 937x | — | — | — | — | — | — | — | — | +| 4K-100-5%-v8 | 829 MB | 53 KB | 15 554x | — | — | — | — | — | — | — | — | +| 4K-100-20%-v128 | 829 MB | 586 KB | 1 415x | — | — | — | — | — | — | — | — | +| 4K-200-5%-v8 | 1 659 MB | 76 KB | 21 786x | — | — | — | — | — | — | — | — | +| SAT-200-5%-v8 | 13 422 MB | 213 KB | 62 968x | 6 942x | 30 255x | 204x | † | † | 105 545x | 251 629x | 2 173x | +| SAT-200-20%-v128 | 13 422 MB | 2 596 KB | 5 171x | 1 204x | 14 757x | 89x | † | † | 89 046x | 290 779x | 857x | +| SAT-200-50%-v600 | 13 422 MB | 14 222 KB | 944x | — | — | — | † | † | — | — | — | + +- **Compact theor.** — sum of internal numpy buffer `nbytes` +- **Mem x** — dense / compact theoretical ratio +- **Area x / Filter x / Annot x / IoU x / NMS x / Merge x / Offset x / Centroids x** — compact speedup over dense for each operation +- **†** — dense IoU+NMS skipped (dense array > 1 GB); compact still runs and is timed +- **—** — not shown; full per-scenario tables are printed by the 
benchmark script + +All non-skipped scenarios pass: pixel-perfect annotation, exact area, lossless `to_dense()` roundtrip. + +--- + +## Use-Cases + +- **Aerial / satellite imagery** — thousands of small objects on large tiles; dense masks exhaust RAM before inference completes. +- **High-density crowd / cell segmentation** — N > 500 on FHD already requires several GB of mask storage per batch. +- **Real-time annotation pipelines** — crop-paint cuts annotation from seconds to milliseconds at 4K resolution. +- **Long-running tracking** — accumulated `Detections` across many frames stay in kilobytes rather than gigabytes. +- **`InferenceSlicer`** — `with_offset()` adjusts crop origins directly when stitching tile results; no dense materialisation needed. + +--- + +## Limitations + +- `CompactMask` is **not** a full `np.ndarray`. Call `.to_dense()` before passing to code that requires arbitrary ndarray methods (`astype`, `reshape`, `ravel`, `any`, `all`, …). +- RLE format is **row-major (C-order), crop-scoped** — incompatible with pycocotools / COCO API RLEs (column-major, full-image-scoped). Use `.to_dense()` first if you need pycocotools interop. +- `from_dense()` requires the input `(N, H, W)` array to fit in memory. For truly OOM-scale data, build `CompactMask` per-detection directly from model output crops rather than from a pre-allocated dense stack. + +--- + +## Files + +| File | Description | +| -------------- | ------------------------------------------------ | +| `benchmark.py` | Full benchmark across FHD / 4K / satellite tiers | +| `README.md` | This file | diff --git a/examples/compact_mask/benchmark.py b/examples/compact_mask/benchmark.py new file mode 100644 index 0000000000..7ef5b1372b --- /dev/null +++ b/examples/compact_mask/benchmark.py @@ -0,0 +1,1097 @@ +"""CompactMask demo & benchmark. 
+ +Demonstrates that ``CompactMask`` is a drop-in replacement for dense +``(N, H, W)`` bool arrays in ``supervision.Detections``, while using +significantly less memory and enabling faster annotation. + +Run with: + uv run python examples/compact_mask/benchmark.py + +No GPU or real model is required — everything is synthesized with NumPy. +Mask complexity is controlled by ``num_vertices``: random polygons with more +vertices produce jaggier boundaries and more RLE runs per row. +""" + +from __future__ import annotations + +import dataclasses +import gc +import json +import math +import time +import tracemalloc +from concurrent.futures import ThreadPoolExecutor +from dataclasses import dataclass, field +from datetime import datetime, timezone +from pathlib import Path +from typing import Callable + +import cv2 +import numpy as np +import pandas as pd +from rich import box +from rich.console import Console +from rich.progress import ( + BarColumn, + MofNCompleteColumn, + Progress, + TaskProgressColumn, + TextColumn, + TimeElapsedColumn, +) +from rich.table import Table + +import supervision as sv +from supervision.detection.compact_mask import CompactMask + +console = Console(width=240, force_terminal=True) + +REPETITIONS = 4 +# How many reps to run concurrently in time_reps. Each thread times itself +# independently; results are averaged. Numpy releases the GIL for its C-level +# work so threads can truly run in parallel on multi-core machines. +# Set to 1 to disable parallelism and revert to a sequential timing loop. +PARALLEL = 3 +# Dense timing is skipped when the dense (N,H,W) array would exceed this +# threshold — avoids OOM / swap thrashing on extreme scenarios while still +# reporting the theoretical memory footprint. +DENSE_SKIP_GB = 16.0 +# Dense IoU *and NMS* timing are skipped above this threshold: pairwise +# (N,H,W) AND is extremely expensive — NMS calls IoU internally so both are +# gated by the same threshold. 
+IOU_DENSE_SKIP_GB = 1.0 +# Reps for dense IoU/NMS — a single pass already takes several seconds. +IOU_NMS_REPS = 2 + + +# ══════════════════════════════════════════════════════════════════════════════ +# Result container +# ══════════════════════════════════════════════════════════════════════════════ + + +@dataclass +class ScenarioResult: + name: str + resolution: str # e.g. "1920x1080" + num_objects: int + fill_name: str # e.g. "5%" + num_vertices: int # polygon vertex count — complexity proxy + # memory (theoretical: raw numpy nbytes) + dense_bytes: int + compact_bytes_theoretical: int + # memory (actual: tracemalloc peak; dense_bytes_actual=0 when dense_skipped=True) + dense_bytes_actual: int + compact_bytes_actual: int + # compactness overhead — absolute times for conversion (always measured) + encode_s: float # CompactMask.from_dense() dense → compact + decode_s: float # compact_mask.to_dense() compact → dense + # timing (nan when dense_skipped=True) + dense_area_s: float + compact_area_s: float + dense_filter_s: float + compact_filter_s: float + dense_annot_s: float + compact_annot_s: float + # pipeline stages (nan when respective skip flag is True) + dense_iou_s: float # nan when iou_dense_skipped + compact_iou_s: float + dense_nms_s: float # nan when dense_skipped + compact_nms_s: float + dense_merge_s: float # nan when dense_skipped + compact_merge_s: float + dense_offset_s: float # nan when dense_skipped + compact_offset_s: float + dense_centroids_s: float # nan when dense_skipped + compact_centroids_s: float + # correctness (None when the stage was skipped) + pixel_perfect: bool | None + areas_match: bool | None + roundtrip_ok: bool | None + iou_ok: bool | None + nms_ok: bool | None + nms_mismatch_count: ( + int # detections with different NMS decisions (0 when dense_skipped) + ) + merge_ok: bool | None + offset_ok: bool | None + centroids_ok: bool | None + # skip flags + dense_skipped: bool = field(default=False) + iou_dense_skipped: bool = 
field(default=False) + + +# ══════════════════════════════════════════════════════════════════════════════ +# Synthetic data helpers +# ══════════════════════════════════════════════════════════════════════════════ + + +def make_scene(image_height: int, image_width: int) -> np.ndarray: + """Random BGR image.""" + return np.random.default_rng(42).integers( + 0, 255, (image_height, image_width, 3), dtype=np.uint8 + ) + + +def _make_polygon_mask( + image_height: int, + image_width: int, + center_x: int, + center_y: int, + axis_x: int, + axis_y: int, + rng: np.random.Generator, + num_vertices: int, +) -> np.ndarray: + """Random polygon mask. + + *num_vertices* is a direct complexity proxy: more vertices → more + independent radius samples → jaggier boundary → more RLE runs per row. + No smoothing is applied so the relationship is monotone. + """ + angles = np.sort(rng.uniform(0, 2 * np.pi, num_vertices)) + radii = rng.uniform(0.3, 1.0, num_vertices) + pts_x = np.clip( + (center_x + axis_x * radii * np.cos(angles)).astype(np.int32), + 0, + image_width - 1, + ) + pts_y = np.clip( + (center_y + axis_y * radii * np.sin(angles)).astype(np.int32), + 0, + image_height - 1, + ) + pts = np.column_stack([pts_x, pts_y]).reshape(-1, 1, 2) + canvas = np.zeros((image_height, image_width), dtype=np.uint8) + cv2.fillPoly(canvas, [pts], 1) + return canvas.astype(bool) + + +def make_detections( + num_objects: int, + image_height: int, + image_width: int, + fill_fraction: float, + num_vertices: int = 20, + seed: int = 0, +) -> tuple[np.ndarray, np.ndarray, np.ndarray]: + """Return ``(xyxy, masks_dense, class_ids)`` with random polygon masks. + + *num_vertices* controls mask complexity: more vertices → jaggier boundary. 
+ """ + rng = np.random.default_rng(seed) + half = max( + 2, + int( + (image_height * image_width * fill_fraction / (np.pi * num_objects)) ** 0.5 + ), + ) + xyxy_list = [] + masks = np.zeros((num_objects, image_height, image_width), dtype=bool) + for index in range(num_objects): + center_x = int(rng.integers(half + 1, image_width - half - 1)) + center_y = int(rng.integers(half + 1, image_height - half - 1)) + axis_x = int(rng.integers(max(2, half // 2), half * 2 + 1)) + axis_y = int(rng.integers(max(2, half // 2), half * 2 + 1)) + masks[index] = _make_polygon_mask( + image_height, + image_width, + center_x, + center_y, + axis_x, + axis_y, + rng, + num_vertices, + ) + xyxy_list.append( + [ + max(0, center_x - axis_x), + max(0, center_y - axis_y), + min(image_width - 1, center_x + axis_x), + min(image_height - 1, center_y + axis_y), + ] + ) + xyxy = np.array(xyxy_list, dtype=np.float32) + class_ids = rng.integers(0, 10, num_objects, dtype=int) + return xyxy, masks, class_ids + + +# ══════════════════════════════════════════════════════════════════════════════ +# Memory helpers +# ══════════════════════════════════════════════════════════════════════════════ + + +def dense_memory_bytes(masks: np.ndarray) -> int: + """Theoretical dense footprint: raw numpy buffer size.""" + return int(masks.nbytes) + + +def compact_memory_bytes_theoretical(compact_mask: CompactMask) -> int: + """Theoretical compact footprint: sum of all internal numpy buffer sizes.""" + return int( + compact_mask._crop_shapes.nbytes + + compact_mask._offsets.nbytes + + sum(rle.nbytes for rle in compact_mask._rles), + ) + + +def measure_peak_bytes(func: Callable[[], object]) -> int: + """Wrapper that runs *func* under tracemalloc and returns peak allocation. + + tracemalloc captures every Python-level allocation — numpy buffers, list + nodes, object headers — giving the true heap cost of anything *func* + builds. The return value of *func* is discarded so the object does not + stay alive. 
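+
+    For example, allocating a 1 MB buffer inside *func* reports a peak of at
+    least that many bytes:
+
+    ```pycon
+    >>> measure_peak_bytes(lambda: bytearray(10**6)) >= 10**6
+    True
+
+    ```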
+ """ + tracemalloc.start() + tracemalloc.clear_traces() + func() + _, peak = tracemalloc.get_traced_memory() + tracemalloc.stop() + return int(peak) + + +def dense_memory_bytes_actual( + num_objects: int, image_height: int, image_width: int +) -> int: + """Actual dense footprint: peak bytes during (N, H, W) bool array alloc.""" + return measure_peak_bytes( + lambda: np.zeros((num_objects, image_height, image_width), dtype=bool), + ) + + +def compact_memory_bytes_actual( + masks_dense: np.ndarray, + xyxy: np.ndarray, + image_shape: tuple[int, int], +) -> int: + """Actual compact footprint: peak bytes during CompactMask.from_dense().""" + return measure_peak_bytes( + lambda: CompactMask.from_dense(masks_dense, xyxy, image_shape=image_shape), + ) + + +def time_reps( + func: Callable[[], object], + repeats: int = REPETITIONS, + parallel: int = PARALLEL, +) -> float: + """Run *func* *reps* times and return mean wall-clock seconds per call. + + When ``parallel > 1``, up to ``parallel`` calls run simultaneously in + threads. Numpy and OpenCV release the GIL for their C-level work, so + threads can execute in parallel on multi-core machines. Each thread + records its own elapsed time; the mean across all *reps* is returned. + + When ``parallel == 1`` the original sequential loop is used, avoiding + any thread-scheduling overhead and improving accuracy for cheap functions. + + A full GC cycle is run before timing so accumulated garbage from earlier + stages does not trigger collection mid-measurement and inflate results. 
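+
+    A minimal sanity check (absolute timings vary by machine, so only the
+    sign is asserted):
+
+    ```pycon
+    >>> time_reps(lambda: sum(range(10_000)), repeats=4, parallel=2) > 0.0
+    True
+
+    ```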
+ """ + gc.collect() + if parallel <= 1: + t0 = time.perf_counter() + for _ in range(repeats): + func() + return (time.perf_counter() - t0) / repeats + + def _timed() -> float: + t0 = time.perf_counter() + func() + return time.perf_counter() - t0 + + with ThreadPoolExecutor(max_workers=min(parallel, repeats)) as pool: + timings = list(pool.map(lambda _: _timed(), range(repeats))) + return sum(timings) / repeats + + +# ══════════════════════════════════════════════════════════════════════════════ +# Benchmark stages +# ══════════════════════════════════════════════════════════════════════════════ + + +def stage_build( + num_objects: int, + image_height: int, + image_width: int, + fill_fraction: float, + num_vertices: int = 20, +) -> tuple[np.ndarray, np.ndarray, np.ndarray, CompactMask]: + """Synthesize polygon masks and build the CompactMask.""" + xyxy, masks_dense, class_ids = make_detections( + num_objects, image_height, image_width, fill_fraction, num_vertices + ) + compact_mask = CompactMask.from_dense( + masks_dense, xyxy, image_shape=(image_height, image_width) + ) + return xyxy, masks_dense, class_ids, compact_mask + + +def stage_encode( + masks_dense: np.ndarray, + xyxy: np.ndarray, + image_height: int, + image_width: int, +) -> float: + """Per-mask encode time: encode each mask individually and average over N. + + Calling from_dense one mask at a time (rather than batching all N) isolates + the per-shape cost — each polygon has a different RLE run count, so the + average reflects true shape variance. + """ + num_masks = len(masks_dense) + image_shape = (image_height, image_width) + + def _encode_each() -> None: + for i in range(num_masks): + CompactMask.from_dense( + masks_dense[i : i + 1], xyxy[i : i + 1], image_shape=image_shape + ) + + return time_reps(_encode_each) / max(num_masks, 1) + + +def stage_decode(compact_mask: CompactMask) -> float: + """Per-mask decode time: decode each mask individually and average over N. 
+ + Building a list via compact_mask[i] decodes each crop separately, giving + the per-mask cost of materialising a single RLE back to a dense array. + """ + num_masks = len(compact_mask) + return time_reps(lambda: [compact_mask[i] for i in range(num_masks)]) / max( + num_masks, 1 + ) + + +def stage_area( + det_dense: sv.Detections, det_compact: sv.Detections +) -> tuple[float, float]: + """Time .area on both representations.""" + return ( + time_reps(lambda: det_dense.area), + time_reps(lambda: det_compact.area), + ) + + +def stage_filter( + det_dense: sv.Detections, det_compact: sv.Detections +) -> tuple[float, float]: + """Time boolean filtering (keep every other detection).""" + keep = np.arange(len(det_dense)) % 2 == 0 + return ( + time_reps(lambda: det_dense[keep]), + time_reps(lambda: det_compact[keep]), + ) + + +def stage_annotate( + scene: np.ndarray, det_dense: sv.Detections, det_compact: sv.Detections +) -> tuple[float, float]: + """Time MaskAnnotator on both representations.""" + annotator = sv.MaskAnnotator(opacity=0.5) + return ( + time_reps(lambda: annotator.annotate(scene.copy(), det_dense)), + time_reps(lambda: annotator.annotate(scene.copy(), det_compact)), + ) + + +def stage_correctness( + scene: np.ndarray, + masks_dense: np.ndarray, + compact_mask: CompactMask, + det_dense: sv.Detections, + det_compact: sv.Detections, +) -> tuple[bool, bool, bool]: + """Return (pixel_perfect, areas_match, roundtrip_ok).""" + annotator = sv.MaskAnnotator(opacity=0.5) + out_dense = annotator.annotate(scene.copy(), det_dense) + out_compact = annotator.annotate(scene.copy(), det_compact) + pixel_perfect = bool(np.array_equal(out_dense, out_compact)) + areas_match = bool(np.allclose(det_dense.area, det_compact.area)) + roundtrip_ok = bool(np.array_equal(compact_mask.to_dense(), masks_dense)) + return pixel_perfect, areas_match, roundtrip_ok + + +def stage_iou( + masks_dense: np.ndarray, + compact_mask: CompactMask, + iou_dense_skipped: bool, +) -> tuple[float, 
float, bool | None]: + """Time pairwise self-IoU using dense (N,H,W) AND and compact crop filter. + + Correctness is checked on the first 10 masks only to keep it fast, + regardless of whether full dense IoU timing is skipped. + """ + correct_n = min(len(compact_mask), 10) + iou_compact_small = sv.mask_iou_batch( + compact_mask[:correct_n], compact_mask[:correct_n] + ) + iou_dense_small = sv.mask_iou_batch( + masks_dense[:correct_n], masks_dense[:correct_n] + ) + iou_ok = bool(np.allclose(iou_dense_small, iou_compact_small, atol=1e-4)) + + compact_iou_s = time_reps(lambda: sv.mask_iou_batch(compact_mask, compact_mask)) + if iou_dense_skipped: + dense_iou_s = math.nan + else: + dense_iou_s = time_reps( + lambda: sv.mask_iou_batch(masks_dense, masks_dense), + repeats=IOU_NMS_REPS, + ) + return dense_iou_s, compact_iou_s, iou_ok + + +def stage_nms( + xyxy: np.ndarray, + confidence: np.ndarray, + class_ids: np.ndarray, + masks_dense: np.ndarray, + compact_mask: CompactMask, + dense_skipped: bool, + iou_dense_skipped: bool, +) -> tuple[float, float, bool | None, int]: + """Time mask NMS. Dense resizes to 640 before IoU; compact uses exact crop IoU. + + Compact NMS is strictly more accurate than dense: it computes pixel-level IoU + directly on the full-resolution RLE crops instead of a lossy 640px-downsampled + approximation. For pairs whose true IoU is very close to the 0.5 threshold, + the resize step in the dense path can flip a keep/suppress decision. + + ``n_diff`` counts detections whose decision differs between the two paths. + ``nms_ok`` is True when ``n_diff == 0``. + + Dense NMS is skipped when ``dense_skipped`` *or* ``iou_dense_skipped`` is True: + NMS calls mask_iou_batch internally so the cost is the same as IoU. + + Returns: + Tuple of ``(dense_nms_s, compact_nms_s, nms_ok, n_diff)``. 
+ """ + predictions = np.c_[xyxy, confidence, class_ids.astype(float)] + + compact_nms_s = time_reps( + lambda: sv.mask_non_max_suppression(predictions, compact_mask) + ) + if dense_skipped or iou_dense_skipped: + return math.nan, compact_nms_s, None, 0 + + keep_dense = sv.mask_non_max_suppression(predictions, masks_dense) + keep_compact = sv.mask_non_max_suppression(predictions, compact_mask) + n_diff = int(np.sum(keep_dense != keep_compact)) + nms_ok = n_diff == 0 + dense_nms_s = time_reps( + lambda: sv.mask_non_max_suppression(predictions, masks_dense), + repeats=IOU_NMS_REPS, + ) + return dense_nms_s, compact_nms_s, nms_ok, n_diff + + +def stage_merge( + det_dense: sv.Detections | None, + det_compact: sv.Detections, + dense_skipped: bool, +) -> tuple[float, float, bool | None]: + """Time Detections.merge on two half-splits. + + Dense: np.vstack; compact: RLE concat. + Splits are pre-computed so the timed lambda measures only the merge. + """ + half = len(det_compact) // 2 + compact_a, compact_b = det_compact[:half], det_compact[half:] + + compact_merge_s = time_reps(lambda: sv.Detections.merge([compact_a, compact_b])) + if dense_skipped or det_dense is None: + return math.nan, compact_merge_s, None + + dense_a, dense_b = det_dense[:half], det_dense[half:] + merged_d = sv.Detections.merge([dense_a, dense_b]) + merged_c = sv.Detections.merge([compact_a, compact_b]) + merge_ok = bool(np.allclose(merged_d.area, merged_c.area)) + dense_merge_s = time_reps(lambda: sv.Detections.merge([dense_a, dense_b])) + return dense_merge_s, compact_merge_s, merge_ok + + +def stage_offset( + masks_dense: np.ndarray, + compact_mask: CompactMask, + image_height: int, + image_width: int, + dense_skipped: bool, +) -> tuple[float, float, bool | None]: + """Time mask offset: move_masks (N,H,W) copy vs O(N) offset update.""" + dx, dy = 10, 10 + # Expand the canvas by the offset so no shifted crop overflows boundary. 
+ # Both move_masks and with_offset.to_dense() operate on identical space. + new_h, new_w = image_height + dy, image_width + dx + new_shape = (new_h, new_w) + + compact_offset_s = time_reps( + lambda: compact_mask.with_offset(dx, dy, new_image_shape=new_shape) + ) + if dense_skipped: + return math.nan, compact_offset_s, None + + moved_dense = sv.move_masks( + masks_dense, np.array([dx, dy]), resolution_wh=(new_w, new_h) + ) + moved_compact = compact_mask.with_offset( + dx, dy, new_image_shape=new_shape + ).to_dense() + offset_ok = bool(np.array_equal(moved_dense, moved_compact)) + dense_offset_s = time_reps( + lambda: sv.move_masks( + masks_dense, np.array([dx, dy]), resolution_wh=(new_w, new_h) + ) + ) + return dense_offset_s, compact_offset_s, offset_ok + + +def stage_centroids( + masks_dense: np.ndarray, + compact_mask: CompactMask, + dense_skipped: bool, +) -> tuple[float, float, bool | None]: + """Time centroid: np.tensordot on full stack (dense) vs per-crop (compact).""" + compact_centroids_s = time_reps(lambda: sv.calculate_masks_centroids(compact_mask)) + if dense_skipped: + return math.nan, compact_centroids_s, None + + c_dense = sv.calculate_masks_centroids(masks_dense) + c_compact = sv.calculate_masks_centroids(compact_mask) + centroids_ok = bool(np.allclose(c_dense, c_compact, atol=1.0)) # 1-pixel tolerance + dense_centroids_s = time_reps(lambda: sv.calculate_masks_centroids(masks_dense)) + return dense_centroids_s, compact_centroids_s, centroids_ok + + +# ══════════════════════════════════════════════════════════════════════════════ +# Scenario runner — orchestrates stages +# ══════════════════════════════════════════════════════════════════════════════ + + +def run_scenario( + name: str, + num_objects: int, + image_height: int, + image_width: int, + fill_fraction: float = 0.10, + num_vertices: int = 20, +) -> ScenarioResult: + resolution = f"{image_width}x{image_height}" + fill_name = f"{fill_fraction:.0%}" + console.rule( + f"[bold]{name}[/bold] | 
{num_objects} objects · {resolution} " + f"· fill≈{fill_name} · polygon/{num_vertices} vertices" + ) + + xyxy, masks_dense, class_ids, compact_mask = stage_build( + num_objects, image_height, image_width, fill_fraction, num_vertices + ) + scene = make_scene(image_height, image_width) + + # ── memory ────────────────────────────────────────────────────────────── + dense_bytes = dense_memory_bytes(masks_dense) + dense_skipped = dense_bytes > DENSE_SKIP_GB * 1e9 + compact_theoretical = compact_memory_bytes_theoretical(compact_mask) + + # Only measure dense tracemalloc when it's safe to allocate the full array. + dense_actual = ( + 0 + if dense_skipped + else dense_memory_bytes_actual(num_objects, image_height, image_width) + ) + compact_actual = compact_memory_bytes_actual( + masks_dense, xyxy, (image_height, image_width) + ) + + encode_s = stage_encode(masks_dense, xyxy, image_height, image_width) + decode_s = stage_decode(compact_mask) + + theory_ratio = dense_bytes / max(compact_theoretical, 1) + if dense_skipped: + malloc_ratio_str = "[dim]—[/dim]" + dense_actual_str = "[dim]skipped[/dim]" + else: + malloc_ratio = dense_actual / max(compact_actual, 1) + malloc_ratio_str = _fmt_ratio(malloc_ratio) + dense_actual_str = f"{dense_actual / 1e6:.1f} MB" + console.print( + f"\tmemory >>\n" + f"\t\ttheory :: dense={dense_bytes / 1e6:.1f} MB " + f"| compact={compact_theoretical / 1e3:.0f} KB " + f"\t{_fmt_ratio(theory_ratio)}\n" + f"\t\tmalloc :: dense={dense_actual_str} " + f"| compact={compact_actual / 1e3:.0f} KB " + f"\t{malloc_ratio_str}" + ) + console.print(f"\t encode (from_dense)\t={encode_s * 1e3:.3f} ms/mask") + console.print(f"\t decode (to_dense)\t={decode_s * 1e3:.3f} ms/mask") + + # ── skip flags ────────────────────────────────────────────────────────── + iou_dense_skipped = dense_bytes > IOU_DENSE_SKIP_GB * 1e9 + if dense_skipped: + console.print( + f"\t[yellow]dense array is {dense_bytes / 1e9:.1f} GB " + f"(>{DENSE_SKIP_GB:.0f} GB threshold) — skipping 
dense timing" + f"[/yellow]" + ) + elif iou_dense_skipped: + console.print( + f"\t[yellow]dense IoU skipped (>{IOU_DENSE_SKIP_GB:.0f}GB thr.)[/yellow]" + ) + + confidence = ( + np.random.default_rng(1).uniform(0.3, 0.99, num_objects).astype(np.float32) + ) + det_compact = sv.Detections(xyxy=xyxy, mask=compact_mask, class_id=class_ids) + + if dense_skipped: + dense_area_s = dense_filter_s = dense_annot_s = math.nan + compact_area_s = _time_compact_area(det_compact) + compact_filter_s = _time_compact_filter(det_compact) + compact_annot_s = _time_compact_annotate(scene, det_compact) + pixel_perfect = areas_match = roundtrip_ok = None + det_dense = None + else: + det_dense = sv.Detections(xyxy=xyxy, mask=masks_dense, class_id=class_ids) + dense_area_s, compact_area_s = stage_area(det_dense, det_compact) + dense_filter_s, compact_filter_s = stage_filter(det_dense, det_compact) + dense_annot_s, compact_annot_s = stage_annotate(scene, det_dense, det_compact) + pixel_perfect, areas_match, roundtrip_ok = stage_correctness( + scene, masks_dense, compact_mask, det_dense, det_compact + ) + + dense_iou_s, compact_iou_s, iou_ok = stage_iou( + masks_dense, compact_mask, iou_dense_skipped + ) + dense_nms_s, compact_nms_s, nms_ok, nms_diff = stage_nms( + xyxy, + confidence, + class_ids, + masks_dense, + compact_mask, + dense_skipped, + iou_dense_skipped, + ) + dense_merge_s, compact_merge_s, merge_ok = stage_merge( + det_dense, det_compact, dense_skipped + ) + dense_offset_s, compact_offset_s, offset_ok = stage_offset( + masks_dense, compact_mask, image_height, image_width, dense_skipped + ) + dense_centroids_s, compact_centroids_s, centroids_ok = stage_centroids( + masks_dense, compact_mask, dense_skipped + ) + + def _timing_line(label: str, dense_s: float, compact_s: float) -> str: + compact_ms = f"{compact_s * 1e3:.2f} ms" + if math.isnan(dense_s): + return ( + f"\t{label}\t -> dense=[dim]—[/dim]" + f"\t\t | compact={compact_ms}\t | speedup=[dim]—[/dim]" + ) + dense_ms = 
f"{dense_s * 1e3:.2f} ms" + speedup = _fmt_ratio(dense_s / max(compact_s, 1e-9)) + return ( + f"\t{label}\t -> dense={dense_ms}\t | " + f"compact={compact_ms}\t | speedup={speedup}" + ) + + console.print(_timing_line(".area ", dense_area_s, compact_area_s)) + console.print(_timing_line("annotate ", dense_annot_s, compact_annot_s)) + console.print(_timing_line("centroids", dense_centroids_s, compact_centroids_s)) + console.print(_timing_line("filter ", dense_filter_s, compact_filter_s)) + console.print(_timing_line("iou ", dense_iou_s, compact_iou_s)) + console.print(_timing_line("merge ", dense_merge_s, compact_merge_s)) + console.print(_timing_line("nms ", dense_nms_s, compact_nms_s)) + console.print(_timing_line("offset ", dense_offset_s, compact_offset_s)) + + checks = { + "pixel-perfect": pixel_perfect, + "areas": areas_match, + "roundtrip": roundtrip_ok, + "iou": iou_ok, + "nms": nms_ok, + "merge": merge_ok, + "offset": offset_ok, + "centroids": centroids_ok, + } + parts = [] + for k, v in checks.items(): + if k == "nms" and v is False: + parts.append(f"nms=[red]✗({nms_diff})[/red]") + else: + parts.append( + f"{k}=" + + ( + "[dim]—[/dim]" + if v is None + else "[green]✓[/green]" + if v + else "[red]✗[/red]" + ) + ) + all_checked = [v for v in checks.values() if v is not None] + overall = ( + "[green]✓ all correct[/green]" + if all_checked and all(all_checked) + else "[red]✗ MISMATCH[/red]" + if any(v is False for v in checks.values()) + else "[dim]—[/dim]" + ) + console.print(" correctness >> " + " | ".join(parts) + f" | {overall}") + + return ScenarioResult( + name=name, + resolution=resolution, + num_objects=num_objects, + fill_name=fill_name, + num_vertices=num_vertices, + dense_bytes=dense_bytes, + compact_bytes_theoretical=compact_theoretical, + dense_bytes_actual=dense_actual, + compact_bytes_actual=compact_actual, + encode_s=encode_s, + decode_s=decode_s, + dense_area_s=dense_area_s, + compact_area_s=compact_area_s, + dense_filter_s=dense_filter_s, + 
compact_filter_s=compact_filter_s, + dense_annot_s=dense_annot_s, + compact_annot_s=compact_annot_s, + dense_iou_s=dense_iou_s, + compact_iou_s=compact_iou_s, + dense_nms_s=dense_nms_s, + compact_nms_s=compact_nms_s, + dense_merge_s=dense_merge_s, + compact_merge_s=compact_merge_s, + dense_offset_s=dense_offset_s, + compact_offset_s=compact_offset_s, + dense_centroids_s=dense_centroids_s, + compact_centroids_s=compact_centroids_s, + pixel_perfect=pixel_perfect, + areas_match=areas_match, + roundtrip_ok=roundtrip_ok, + iou_ok=iou_ok, + nms_ok=nms_ok, + nms_mismatch_count=nms_diff, + merge_ok=merge_ok, + offset_ok=offset_ok, + centroids_ok=centroids_ok, + dense_skipped=dense_skipped, + iou_dense_skipped=iou_dense_skipped, + ) + + +def _time_compact_area(det_compact: sv.Detections) -> float: + """Time .area on the compact detections (used when dense timing is skipped).""" + return time_reps(lambda: det_compact.area) + + +def _time_compact_filter(det_compact: sv.Detections) -> float: + """Time boolean-index filtering on the compact detections (dense-skip path).""" + keep = np.arange(len(det_compact)) % 2 == 0 + return time_reps(lambda: det_compact[keep]) + + +def _time_compact_annotate(scene: np.ndarray, det_compact: sv.Detections) -> float: + """Time MaskAnnotator on the compact detections (dense-skip path).""" + annotator = sv.MaskAnnotator(opacity=0.5) + return time_reps(lambda: annotator.annotate(scene.copy(), det_compact)) + + +# ══════════════════════════════════════════════════════════════════════════════ +# Rich summary table +# ══════════════════════════════════════════════════════════════════════════════ + +_OPS = ("area", "filter", "annot", "iou", "nms", "merge", "offset", "centroids") + + +def _build_summary_df(results: list[ScenarioResult]) -> pd.DataFrame: + """Compute derived summary columns from scenario results. + + Returns a DataFrame with all ScenarioResult fields plus derived columns + (ratios, speedups, ok) as raw floats. 
Consumers apply their own formatting. + """ + df = pd.DataFrame([dataclasses.asdict(r) for r in results]) + df["ratio_theory"] = df["dense_bytes"] / df["compact_bytes_theoretical"].clip( + lower=1 + ) + df["ratio_malloc"] = df["dense_bytes_actual"] / df["compact_bytes_actual"].clip( + lower=1 + ) + # dense_bytes_actual == 0 (not measured) when dense_skipped — clear those cells + df.loc[df["dense_skipped"], "ratio_malloc"] = None + for op in _OPS: + df[f"{op}_speedup"] = df[f"dense_{op}_s"] / df[f"compact_{op}_s"].clip( + lower=1e-9 + ) + + check_cols = [ + "pixel_perfect", + "areas_match", + "roundtrip_ok", + "iou_ok", + "nms_ok", + "merge_ok", + "offset_ok", + "centroids_ok", + ] + df["ok"] = df.apply( + lambda row: ( + False + if any(row[c] is False for c in check_cols) + else True + if any(row[c] is True for c in check_cols) + else None + ), + axis=1, + ) + return df + + +def _fmt_ratio(ratio: float) -> str: + """Format a speedup/compression ratio with colour coding. + + ≥10 → green (large win), 1-10 → yellow (modest win), <1 → red (regression). + Integer for ≥10, two decimals otherwise. + """ + fmt = f"{ratio:.0f}x" if ratio >= 10 else f"{ratio:.2f}x" + if ratio >= 10: + return f"[green]{fmt}[/green]" + elif ratio >= 1: + return f"[yellow]{fmt}[/yellow]" + else: + return f"[red]{fmt}[/red]" + + +def _fmt_speedup(dense_s: float, compact_s: float) -> str: + if math.isnan(dense_s): + # Dense was skipped — show compact absolute time so the column isn't empty. 
+ return f"[dim]{compact_s * 1e3:.1f} ms[/dim]" + return _fmt_ratio(dense_s / max(compact_s, 1e-9)) + + +def print_summary(results: list[ScenarioResult]) -> None: + table = Table( + title="CompactMask — benchmark summary", + box=box.ROUNDED, + show_lines=True, + header_style="bold cyan", + min_width=console.width, + ) + table.add_column("Scenario", style="bold", min_width=22) + table.add_column("Objects", justify="right", min_width=7) + table.add_column("Resolution", min_width=12, no_wrap=True) + table.add_column("Fill", justify="right", min_width=5, no_wrap=True) + table.add_column("Vertices", justify="right", min_width=8, no_wrap=True) + table.add_column("Dense\ntheory", justify="right", min_width=10) + table.add_column("Compact\ntheory", justify="right", style="green", min_width=9) + table.add_column("Ratio\ntheory", justify="right", min_width=7) + table.add_column("Dense\nmalloc", justify="right", style="cyan", min_width=9) + table.add_column("Compact\nmalloc", justify="right", style="cyan", min_width=9) + table.add_column("Ratio\nmalloc", justify="right", min_width=7) + table.add_column("Encode\n(ms/mask)", justify="right", style="yellow", min_width=7) + table.add_column("Decode\n(ms/mask)", justify="right", style="yellow", min_width=7) + table.add_column("Area\natt.", justify="right", min_width=6) + table.add_column("Filter\nop.", justify="right", min_width=6) + table.add_column("Annot\nop.", justify="right", min_width=6) + table.add_column("IoU\nop.", justify="right", min_width=6) + table.add_column("NMS\nop.", justify="right", min_width=6) + table.add_column("Merge\nop.", justify="right", min_width=6) + table.add_column("Offset\nop.", justify="right", min_width=6) + table.add_column("Centr\nop.", justify="right", min_width=6) + table.add_column("OK?", justify="center", min_width=4) + + for _, row in _build_summary_df(results).iterrows(): + ok = row["ok"] + ok_cell = ( + "[red]✗[/red]" + if ok is False + else "[green]✓[/green]" + if ok is True + else 
"[dim]—[/dim]" + ) + dense_malloc_cell = ( + "[dim]—[/dim]" + if row["dense_skipped"] + else f"{row['dense_bytes_actual'] / 1e6:.1f} MB" + ) + malloc_ratio_cell = ( + "[dim]—[/dim]" if row["dense_skipped"] else _fmt_ratio(row["ratio_malloc"]) + ) + table.add_row( + row["name"], + str(row["num_objects"]), + row["resolution"], + row["fill_name"], + str(row["num_vertices"]), + f"{row['dense_bytes'] / 1e6:.1f} MB", + f"{row['compact_bytes_theoretical'] / 1e3:.0f} KB", + _fmt_ratio(row["ratio_theory"]), + dense_malloc_cell, + f"{row['compact_bytes_actual'] / 1e3:.0f} KB", + malloc_ratio_cell, + f"{row['encode_s'] * 1e3:.1f}", + f"{row['decode_s'] * 1e3:.1f}", + _fmt_speedup(row["dense_area_s"], row["compact_area_s"]), + _fmt_speedup(row["dense_filter_s"], row["compact_filter_s"]), + _fmt_speedup(row["dense_annot_s"], row["compact_annot_s"]), + _fmt_speedup(row["dense_iou_s"], row["compact_iou_s"]), + _fmt_speedup(row["dense_nms_s"], row["compact_nms_s"]), + _fmt_speedup(row["dense_merge_s"], row["compact_merge_s"]), + _fmt_speedup(row["dense_offset_s"], row["compact_offset_s"]), + _fmt_speedup(row["dense_centroids_s"], row["compact_centroids_s"]), + ok_cell, + ) + + console.print(table) + console.print( + "[dim]" + + " · ".join( + [ + "Vertices — polygon vertex count " + "(complexity proxy: more = jaggier boundary)", + "Dense theory — NxHxW bytes (raw numpy buffer)", + "Compact theory — sum of internal numpy buffer sizes", + "Ratio (theory) — dense / compact theoretical ratio", + "Dense malloc — tracemalloc peak during np.zeros allocation", + "Compact malloc — tracemalloc peak during .from_dense()", + "Ratio (malloc) — dense / compact tracemalloc peak ratio", + "Encode ms/mask — from_dense() / N (dense→compact overhead per mask)", + "Decode ms/mask — to_dense() / N (compact→dense overhead per mask)", + "Area x — .area speedup (RLE sum, no materialisation)", + "Filter x — boolean-index speedup", + "Annot x — MaskAnnotator speedup (crop-paint vs full-frame alloc)", + 
f"IoU x — pairwise self-IoU speedup " + f"(dense skipped >{IOU_DENSE_SKIP_GB:.0f} GB)", + "NMS x — mask_non_max_suppression speedup", + "Merge x — Detections.merge speedup", + "Offset x — move_masks vs with_offset speedup", + "Centroids x — calculate_masks_centroids speedup", + "dim ms — dense skipped, compact absolute time shown", + ] + ) + + "[/dim]" + ) + + +# ══════════════════════════════════════════════════════════════════════════════ +# Results persistence +# ══════════════════════════════════════════════════════════════════════════════ + + +def _append_result(result: ScenarioResult, path: Path) -> None: + """Append one scenario result as a JSON line to *path*. + + ``math.nan`` (used for skipped dense timings) is serialised as ``null`` + so the file is valid JSON-Lines and can be read back with any JSON parser. + """ + row = { + k: (None if isinstance(v, float) and math.isnan(v) else v) + for k, v in dataclasses.asdict(result).items() + } + with path.open("a", encoding="utf-8") as fh: + fh.write(json.dumps(row) + "\n") + + +def save_results_csv(results: list[ScenarioResult], path: Path) -> None: + """Write the summary table to *path* as a CSV file. + + Each row mirrors the Rich summary table: scenario metadata, memory ratios, + encode/decode overhead, and per-operation speedups. Columns whose dense + timing was skipped are written as empty cells. 
+ """ + df = _build_summary_df(results) + pd.DataFrame( + { + "scenario": df["name"], + "objects": df["num_objects"], + "resolution": df["resolution"], + "fill": df["fill_name"], + "vertices": df["num_vertices"], + "dense_theory_mb": (df["dense_bytes"] / 1e6).round(1), + "compact_theory_kb": (df["compact_bytes_theoretical"] / 1e3).round(1), + "ratio_theory": df["ratio_theory"].round(0), + "dense_malloc_mb": (df["dense_bytes_actual"] / 1e6) + .where(~df["dense_skipped"]) + .round(1), + "compact_malloc_kb": (df["compact_bytes_actual"] / 1e3).round(1), + "ratio_malloc": df["ratio_malloc"].round(0), + "encode_ms_per_mask": (df["encode_s"] * 1e3).round(4), + "decode_ms_per_mask": (df["decode_s"] * 1e3).round(4), + **{f"{op}_speedup": df[f"{op}_speedup"].round(2) for op in _OPS}, + "ok": df["ok"], + } + ).to_csv(path, index=False) + + +# ══════════════════════════════════════════════════════════════════════════════ +# Entry point +# ══════════════════════════════════════════════════════════════════════════════ + + +def main() -> None: + # ── parameter matrix ────────────────────────────────────────────────────── + # (tier_label, (image_width, image_height), num_objects) + TIERS: list[tuple[str, tuple[int, int], int]] = [ + ("FHD", (1920, 1080), 100), # full comparison (0.21 GB < 1 GB IoU thr.) + ("FHD", (1920, 1080), 200), # full comparison (0.41 GB < 1 GB IoU thr.) + ("FHD", (1920, 1080), 400), # full comparison (0.83 GB < 1 GB IoU thr.) + ("4K", (3840, 2160), 100), # full comparison (0.83 GB < 1 GB IoU thr.) + ("4K", (3840, 2160), 200), # dense excl. IoU/NMS (1.66 GB > 1 GB thr.) + ("SAT", (8192, 8192), 200), # dense excl. IoU/NMS (13.4 GB > 1 GB thr.) 
+ ] + FILL_FRACTIONS = [0.05, 0.20, 0.50] # sparse / moderate / SAM-everything + VERTEX_COUNTS = [8, 128, 600] # low / realistic / YOLOv8-seg default + + scenarios = [ + { + "name": f"{tier}-{num_objects}-{fill_fraction:.0%}-v{num_vertices}", + "num_objects": num_objects, + "image_height": img_h, + "image_width": img_w, + "fill_fraction": fill_fraction, + "num_vertices": num_vertices, + } + for tier, (img_w, img_h), num_objects in TIERS + for fill_fraction in FILL_FRACTIONS + for num_vertices in VERTEX_COUNTS + ] + + timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S") + results_path = Path(__file__).parent / f"results_{timestamp}.jsonl" + + console.print( + f"[bold]supervision[/bold]" + f" {sv.__version__} · numpy {np.__version__} · {len(scenarios)} scenarios" + f" · saving to [dim]{results_path.name}[/dim]" + ) + + results = [] + progress = Progress( + TextColumn("[progress.description]{task.description}"), + BarColumn(), + MofNCompleteColumn(), + TaskProgressColumn(), + TimeElapsedColumn(), + console=console, + ) + with progress: + task = progress.add_task("benchmarking…", total=len(scenarios)) + for params in scenarios: + progress.update(task, description=f"[bold]{params['name']}[/bold]") + result = run_scenario(**params) + results.append(result) + _append_result(result, results_path) + gc.collect() # flush scenario temporaries before next run + progress.advance(task) + + print_summary(results) + + csv_path = results_path.with_suffix(".csv") + save_results_csv(results, csv_path) + console.print(f"[dim]results saved → {results_path.name} · {csv_path.name}[/dim]") + + +if __name__ == "__main__": + main() diff --git a/examples/time_in_zone/README.md b/examples/time_in_zone/README.md index cb24e6969f..54cc44bd69 100644 --- a/examples/time_in_zone/README.md +++ b/examples/time_in_zone/README.md @@ -222,7 +222,7 @@ Script to run object detection on an RTSP stream using the RF-DETR model. 
- `--model_size`: RF-DETR backbone size to load — choose from 'nano', 'small', 'medium', 'base', or 'large' (default 'medium'). - `--device`: Compute device to run the model on ('cpu', 'mps', or 'cuda'; default 'cpu'). - `--classes`: Space-separated list of class IDs to track. Leave empty to track all classes. -- `--confidence_threshold`: Minimum confidence score for a detection to be kept, range 0–1 (default 0.3). +- `--confidence_threshold`: Minimum confidence score for a detection to be kept, range 0-1 (default 0.3). - `--iou_threshold`: IOU threshold applied during non-max suppression (default 0.7). - `--resolution`: Shortest-side input resolution supplied to the model. The script will round it to the nearest valid multiple (default 640). diff --git a/src/supervision/__init__.py b/src/supervision/__init__.py index 1bda28164d..8b56f597fd 100644 --- a/src/supervision/__init__.py +++ b/src/supervision/__init__.py @@ -45,6 +45,7 @@ ) from supervision.dataset.formats.coco import get_coco_class_index_mapping from supervision.dataset.utils import mask_to_rle, rle_to_mask +from supervision.detection.compact_mask import CompactMask from supervision.detection.core import Detections from supervision.detection.line_zone import ( LineZone, @@ -161,6 +162,7 @@ "ColorAnnotator", "ColorLookup", "ColorPalette", + "CompactMask", "ComparisonAnnotator", "ConfusionMatrix", "CropAnnotator", diff --git a/src/supervision/annotators/core.py b/src/supervision/annotators/core.py index a2f729b31b..a579551415 100644 --- a/src/supervision/annotators/core.py +++ b/src/supervision/annotators/core.py @@ -434,6 +434,11 @@ def annotate( colored_mask = np.array(scene, copy=True, dtype=np.uint8) + from supervision.detection.compact_mask import CompactMask + + compact_mask = ( + detections.mask if isinstance(detections.mask, CompactMask) else None + ) for detection_idx in np.flip(np.argsort(detections.area)): color = resolve_color( color=self.color, @@ -443,8 +448,21 @@ def annotate( if 
custom_color_lookup is None else custom_color_lookup, ) - mask = np.asarray(detections.mask[detection_idx], dtype=bool) - colored_mask[mask] = color.as_bgr() + if compact_mask is not None: + # Paint only the bounding-box crop — avoids a full (H, W) alloc. + x1 = int(compact_mask.offsets[detection_idx, 0]) + y1 = int(compact_mask.offsets[detection_idx, 1]) + crop_m = compact_mask.crop(detection_idx) + crop_h, crop_w = crop_m.shape + colored_mask[y1 : y1 + crop_h, x1 : x1 + crop_w][crop_m] = ( + color.as_bgr() + ) + else: + mask = np.asarray( + detections.mask[detection_idx], + dtype=bool, + ) + colored_mask[mask] = color.as_bgr() cv2.addWeighted( colored_mask, self.opacity, scene, 1 - self.opacity, 0, dst=scene @@ -2900,8 +2918,8 @@ def annotate(self, scene: ImageType, detections: Detections) -> ImageType: colored_mask[y1:y2, x1:x2] = scene[y1:y2, x1:x2] else: for mask in detections.mask: - mask = np.asarray(mask, dtype=bool) - colored_mask[mask] = scene[mask] + mask_bool = np.asarray(mask, dtype=bool) + colored_mask[mask_bool] = scene[mask_bool] np.copyto(scene, colored_mask) return scene diff --git a/src/supervision/detection/compact_mask.py b/src/supervision/detection/compact_mask.py new file mode 100644 index 0000000000..32135212d7 --- /dev/null +++ b/src/supervision/detection/compact_mask.py @@ -0,0 +1,905 @@ +"""Crop-RLE compact mask storage for memory-efficient instance segmentation. + +Dense ``(N, H, W)`` boolean masks use O(N·H·W) memory, which becomes +prohibitive for aerial imagery (e.g. 1000 objects x 4K image ~ 8.3 GB). +:class:`CompactMask` stores each mask as a run-length encoding of its +bounding-box crop, reducing typical usage to tens of MB. + +The bounding boxes (``xyxy``) already present in ``Detections`` serve as the +crop boundaries, so no extra metadata is required from the caller. 
+""" + +from __future__ import annotations + +from collections.abc import Iterator +from typing import Any, cast + +import numpy as np +import numpy.typing as npt + + +def _rle_encode(mask_2d: npt.NDArray[Any]) -> npt.NDArray[np.int32]: + """Run-length encode a 2D boolean mask in row-major order. + + The encoding starts with the count of leading ``False`` values (may be 0 + if the mask begins with ``True``). Subsequent values alternate between + ``True`` and ``False`` run counts. + + Args: + mask_2d: 2D boolean array of shape ``(H, W)``. + + Returns: + int32 array of run lengths, starting with the False count. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import _rle_encode + >>> mask = np.array([[False, True, True], [True, False, False]]) + >>> _rle_encode(mask).tolist() + [1, 3, 2] + + ``` + """ + flat = mask_2d.ravel() # C-order (row-major) + if len(flat) == 0: + return np.array([0], dtype=np.int32) + + # Locate positions where the boolean value changes. + changes = np.diff(flat.view(np.uint8)) + boundaries = np.where(changes != 0)[0] + 1 + + positions = np.concatenate(([0], boundaries, [len(flat)])) + run_lengths = np.diff(positions).astype(np.int32) + + # Guarantee the encoding always starts with a False count. + if flat[0]: + run_lengths = np.concatenate(([np.int32(0)], run_lengths)) + + return run_lengths + + +def _rle_decode( + rle: npt.NDArray[np.int32], height: int, width: int +) -> npt.NDArray[np.bool_]: + """Decode a run-length encoded mask back to a 2D boolean array. + + Args: + rle: int32 array of run lengths as produced by :func:`_rle_encode`. + height: Height of the output array. + width: Width of the output array. + + Returns: + 2D boolean array of shape ``(height, width)``. 
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import _rle_decode + >>> rle = np.array([1, 3, 2], dtype=np.int32) + >>> _rle_decode(rle, 2, 3) + array([[False, True, True], + [ True, False, False]]) + + ``` + """ + # Even-indexed entries → False runs; odd-indexed entries → True runs. + is_true = np.arange(len(rle)) % 2 == 1 + flat: npt.NDArray[np.bool_] = np.repeat(is_true, rle) + num_pixels = height * width + if len(flat) < num_pixels: + # Pad with False if the RLE is shorter than expected (e.g. all-False + # tails are often omitted during encoding). + flat = np.pad(flat, (0, num_pixels - len(flat))) + return cast(npt.NDArray[np.bool_], flat[:num_pixels].reshape(height, width)) + + +def _rle_area(rle: npt.NDArray[np.int32]) -> int: + """Return the number of ``True`` pixels in a run-length encoded mask. + + Args: + rle: int32 array of run lengths as produced by :func:`_rle_encode`. + + Returns: + Total number of ``True`` pixels. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import _rle_area + >>> rle = np.array([1, 3, 2], dtype=np.int32) # 1 F, 3 T, 2 F + >>> _rle_area(rle) + 3 + + ``` + """ + return int(np.sum(rle[1::2])) + + +class CompactMask: + """Memory-efficient crop-RLE mask storage for instance segmentation. + + Instead of storing N full ``(H, W)`` boolean arrays, :class:`CompactMask` + encodes each mask as a run-length sequence of its bounding-box crop. This + reduces memory from O(N·H·W) to roughly O(N·bbox_area), which is orders of + magnitude smaller for sparse masks on high-resolution images. + + The class exposes a duck-typed interface compatible with ``np.ndarray`` + masks used elsewhere in ``supervision``: + + * ``mask[int]`` → dense ``(H, W)`` bool array (annotators, converters). + * ``mask[slice | list | ndarray]`` → new :class:`CompactMask` (filtering). + * ``np.asarray(mask)`` → dense ``(N, H, W)`` bool array (numpy interop). 
+ * ``mask.shape``, ``mask.dtype``, ``mask.area`` — match the dense API. + + :class:`CompactMask` is **not** a drop-in ``np.ndarray`` replacement. + When you need to call arbitrary ndarray methods (``astype``, ``reshape``, + ``ravel``, ``any``, ``all``, …) call :meth:`to_dense` first: + ``cm.to_dense().astype(np.uint8)``. :meth:`to_dense` is the single + explicit materialisation boundary. + + .. note:: **RLE encoding incompatibility with pycocotools / COCO API** + + :class:`CompactMask` uses **row-major (C-order)** run-lengths scoped + to each mask's bounding-box crop. The COCO API (pycocotools) uses + **column-major (Fortran-order)** run-lengths scoped to the **full + image**. The two formats are not interchangeable: you cannot pass a + :class:`CompactMask` RLE directly to ``maskUtils.iou()`` or + ``maskUtils.decode()``, and you cannot load a COCO RLE dict into a + :class:`CompactMask` without re-encoding. Use + :meth:`to_dense` to obtain a standard boolean array, then pass it to + pycocotools if needed. + + Args: + rles: List of N int32 run-length arrays. + crop_shapes: Array of shape ``(N, 2)`` — ``(crop_h, crop_w)`` per mask. + offsets: Array of shape ``(N, 2)`` — ``(x1, y1)`` bounding-box origins. + image_shape: ``(H, W)`` of the full image. 
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((2, 100, 100), dtype=bool) + >>> masks[0, 10:20, 10:20] = True + >>> masks[1, 50:70, 50:80] = True + >>> xyxy = np.array([[10, 10, 19, 19], [50, 50, 79, 69]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(100, 100)) + >>> len(cm) + 2 + >>> cm.shape + (2, 100, 100) + + ``` + """ + + __slots__ = ("_crop_shapes", "_image_shape", "_offsets", "_rles") + + def __init__( + self, + rles: list[npt.NDArray[np.int32]], + crop_shapes: npt.NDArray[np.int32], + offsets: npt.NDArray[np.int32], + image_shape: tuple[int, int], + ) -> None: + self._rles: list[npt.NDArray[np.int32]] = rles + self._crop_shapes: npt.NDArray[np.int32] = crop_shapes # (N,2): (h,w) + self._offsets: npt.NDArray[np.int32] = offsets # (N,2): (x1,y1) + self._image_shape: tuple[int, int] = image_shape # (H, W) + + # ------------------------------------------------------------------ + # Construction + # ------------------------------------------------------------------ + + @classmethod + def from_dense( + cls, + masks: npt.NDArray[np.bool_], + xyxy: npt.NDArray[Any], + image_shape: tuple[int, int], + ) -> CompactMask: + """Create a :class:`CompactMask` from a dense ``(N, H, W)`` bool array. + + Bounding boxes are clipped to image bounds and interpreted in the + supervision ``xyxy`` convention (inclusive max coordinates). A + box with invalid ordering (``x2 < x1`` or ``y2 < y1``) is replaced by + a ``1x1`` all-False crop to avoid degenerate RLE. + + Args: + masks: Dense boolean mask array of shape ``(N, H, W)``. + xyxy: Bounding boxes of shape ``(N, 4)`` in ``[x1, y1, x2, y2]`` + format. + image_shape: ``(H, W)`` of the full image. + + Returns: + A new :class:`CompactMask` instance. 
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 100, 100), dtype=bool) + >>> masks[0, 10:20, 10:20] = True + >>> xyxy = np.array([[10, 10, 19, 19]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(100, 100)) + >>> cm.shape + (1, 100, 100) + + ``` + """ + img_h, img_w = image_shape + num_masks = len(masks) + + if num_masks == 0: + return cls( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + image_shape, + ) + + rles: list[npt.NDArray[np.int32]] = [] + crop_shapes_list: list[tuple[int, int]] = [] + offsets_list: list[tuple[int, int]] = [] + + for mask_idx in range(num_masks): + x1, y1, x2, y2 = xyxy[mask_idx] + x1c = int(max(0, min(int(x1), img_w - 1))) + y1c = int(max(0, min(int(y1), img_h - 1))) + x2c = int(max(0, min(int(x2), img_w - 1))) + y2c = int(max(0, min(int(y2), img_h - 1))) + crop: npt.NDArray[np.bool_] + + # supervision xyxy uses inclusive max coords, so slicing must add +1. + if x2c < x1c or y2c < y1c: + crop = np.zeros((1, 1), dtype=bool) + x2c, y2c = x1c, y1c + else: + crop = masks[mask_idx, y1c : y2c + 1, x1c : x2c + 1] + + crop_h = y2c - y1c + 1 + crop_w = x2c - x1c + 1 + rles.append(_rle_encode(crop)) + crop_shapes_list.append((crop_h, crop_w)) + offsets_list.append((x1c, y1c)) + + crop_shapes = np.array(crop_shapes_list, dtype=np.int32) + offsets = np.array(offsets_list, dtype=np.int32) + return cls(rles, crop_shapes, offsets, image_shape) + + # ------------------------------------------------------------------ + # Materialisation + # ------------------------------------------------------------------ + + def to_dense(self) -> npt.NDArray[np.bool_]: + """Materialise all masks as a dense ``(N, H, W)`` boolean array. + + Returns: + Boolean array of shape ``(N, H, W)``. 
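`from_dense` treats the `xyxy` max coordinates as inclusive, so the slice end needs a `+1`. A minimal standalone check of that clip-and-slice arithmetic, mirroring the code above with plain numpy (all values illustrative):

```python
import numpy as np

img_h, img_w = 100, 100
mask = np.zeros((img_h, img_w), dtype=bool)
mask[10:20, 10:20] = True  # a 10x10 blob

x1, y1, x2, y2 = 10, 10, 19, 19  # inclusive max coords
# Clip to image bounds, as from_dense does.
x1c, x2c = max(0, min(x1, img_w - 1)), max(0, min(x2, img_w - 1))
y1c, y2c = max(0, min(y1, img_h - 1)), max(0, min(y2, img_h - 1))
# Inclusive coords -> slice end needs +1.
crop = mask[y1c : y2c + 1, x1c : x2c + 1]
assert crop.shape == (10, 10)
assert crop.all()  # the box tightly covers the blob
```

A box `[10, 10, 19, 19]` therefore spans exactly 10 pixels per axis; forgetting the `+1` would silently drop the last row and column of every crop.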
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 50, 50), dtype=bool) + >>> masks[0, 10:20, 10:30] = True + >>> xyxy = np.array([[10, 10, 29, 19]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(50, 50)) + >>> cm.to_dense().shape + (1, 50, 50) + + ``` + """ + num_masks = len(self._rles) + img_h, img_w = self._image_shape + result: npt.NDArray[np.bool_] = np.zeros((num_masks, img_h, img_w), dtype=bool) + for mask_idx in range(num_masks): + crop_h, crop_w = ( + int(self._crop_shapes[mask_idx, 0]), + int(self._crop_shapes[mask_idx, 1]), + ) + x1, y1 = int(self._offsets[mask_idx, 0]), int(self._offsets[mask_idx, 1]) + crop = _rle_decode(self._rles[mask_idx], crop_h, crop_w) + result[mask_idx, y1 : y1 + crop_h, x1 : x1 + crop_w] = crop + return result + + def crop(self, index: int) -> npt.NDArray[np.bool_]: + """Decode a single mask crop without allocating the full image array. + + This is an O(crop_area) operation — ideal for annotators that only + need the cropped region. + + Args: + index: Index of the mask to decode. + + Returns: + Boolean array of shape ``(crop_h, crop_w)``. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 100, 100), dtype=bool) + >>> masks[0, 20:30, 10:40] = True + >>> xyxy = np.array([[10, 20, 39, 29]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(100, 100)) + >>> cm.crop(0).shape + (10, 30) + + ``` + """ + crop_h = int(self._crop_shapes[index, 0]) + crop_w = int(self._crop_shapes[index, 1]) + return _rle_decode(self._rles[index], crop_h, crop_w) + + # ------------------------------------------------------------------ + # Sequence / array protocol + # ------------------------------------------------------------------ + + def __len__(self) -> int: + """Return the number of masks. 
+ + Returns: + Number of masks N. + + Examples: + ```pycon + >>> from supervision.detection.compact_mask import CompactMask + >>> import numpy as np + >>> cm = CompactMask( + ... [], np.empty((0, 2), dtype=np.int32), + ... np.empty((0, 2), dtype=np.int32), (100, 100)) + >>> len(cm) + 0 + + ``` + """ + return len(self._rles) + + def __iter__(self) -> Iterator[npt.NDArray[np.bool_]]: + """Iterate over masks as dense ``(H, W)`` boolean arrays.""" + for mask_idx in range(len(self)): + yield self[mask_idx] + + @property + def shape(self) -> tuple[int, int, int]: + """Return ``(N, H, W)`` matching the dense mask convention. + + Returns: + Tuple ``(N, H, W)``. + + Examples: + ```pycon + >>> from supervision.detection.compact_mask import CompactMask + >>> import numpy as np + >>> cm = CompactMask( + ... [], np.empty((0, 2), dtype=np.int32), + ... np.empty((0, 2), dtype=np.int32), (480, 640)) + >>> cm.shape + (0, 480, 640) + + ``` + """ + img_h, img_w = self._image_shape + return (len(self), img_h, img_w) + + @property + def offsets(self) -> npt.NDArray[np.int32]: + """Return per-mask crop origins as ``(x1, y1)`` integer offsets. + + Returns: + Array of shape ``(N, 2)`` with ``int32`` offsets. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 10, 10), dtype=bool) + >>> masks[0, 2:4, 3:5] = True + >>> xyxy = np.array([[3, 2, 4, 3]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10)) + >>> cm.offsets.tolist() + [[3, 2]] + + ``` + """ + return self._offsets + + @property + def bbox_xyxy(self) -> npt.NDArray[np.int32]: + """Return per-mask inclusive bounding boxes in ``xyxy`` format. + + Boxes are derived from crop metadata: + ``x2 = x1 + crop_w - 1``, ``y2 = y1 + crop_h - 1``. + + Returns: + Array of shape ``(N, 4)`` with ``int32`` boxes + ``[x1, y1, x2, y2]``. 
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 10, 10), dtype=bool) + >>> masks[0, 2:5, 3:7] = True + >>> xyxy = np.array([[3, 2, 6, 4]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10)) + >>> cm.bbox_xyxy.tolist() + [[3, 2, 6, 4]] + + ``` + """ + if len(self) == 0: + return np.empty((0, 4), dtype=np.int32) + + x1: npt.NDArray[np.int32] = self._offsets[:, 0] + y1: npt.NDArray[np.int32] = self._offsets[:, 1] + x2: npt.NDArray[np.int32] = x1 + self._crop_shapes[:, 1] - 1 + y2: npt.NDArray[np.int32] = y1 + self._crop_shapes[:, 0] - 1 + return np.column_stack((x1, y1, x2, y2)).astype(np.int32, copy=False) + + @property + def dtype(self) -> np.dtype[Any]: + """Return ``np.dtype(bool)`` — always. + + Returns: + ``np.dtype(bool)``. + + Examples: + ```pycon + >>> from supervision.detection.compact_mask import CompactMask + >>> import numpy as np + >>> cm = CompactMask( + ... [], np.empty((0, 2), dtype=np.int32), + ... np.empty((0, 2), dtype=np.int32), (100, 100)) + >>> cm.dtype + dtype('bool') + + ``` + """ + return np.dtype(bool) + + @property + def area(self) -> npt.NDArray[np.int64]: + """Compute the area (``True`` pixel count) of each mask. + + Returns: + int64 array of shape ``(N,)`` with per-mask pixel counts. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((2, 100, 100), dtype=bool) + >>> masks[0, 0:10, 0:10] = True # 100 pixels + >>> masks[1, 0:5, 0:5] = True # 25 pixels + >>> xyxy = np.array([[0, 0, 9, 9], [0, 0, 4, 4]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(100, 100)) + >>> cm.area.tolist() + [100, 25] + + ``` + """ + return np.array([_rle_area(rle) for rle in self._rles], dtype=np.int64) + + def sum(self, axis: int | tuple[int, ...] 
| None = None) -> npt.NDArray[Any] | int: + """NumPy-compatible sum with a fast path for per-mask area. + + When ``axis=(1, 2)``, returns the per-mask True-pixel count via + :attr:`area` without materialising the full dense array. + + Args: + axis: Axis or axes to sum over. + + Returns: + Sum result matching NumPy semantics. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 10, 10), dtype=bool) + >>> masks[0, 0:3, 0:3] = True + >>> xyxy = np.array([[0, 0, 2, 2]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10)) + >>> cm.sum(axis=(1, 2)).tolist() + [9] + + ``` + """ + if axis == (1, 2): + return self.area + return self.to_dense().sum(axis=axis) + + def __getitem__( + self, + index: int | slice | list[Any] | npt.NDArray[Any], + ) -> npt.NDArray[np.bool_] | CompactMask: + """Index into the mask collection. + + * ``int`` → dense ``(H, W)`` bool array (for annotators, iterators). + * ``slice | list | ndarray`` → new :class:`CompactMask` (for filtering). + + Args: + index: An integer returns a dense ``(H, W)`` mask. Any other + supported index type returns a new :class:`CompactMask`. + + Returns: + Dense ``(H, W)`` ``np.ndarray`` for integer index, or a new + :class:`CompactMask` for all other index types. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((3, 20, 20), dtype=bool) + >>> xyxy = np.array( + ... 
[[0,0,5,5],[5,5,10,10],[10,10,15,15]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(20, 20)) + >>> cm[0].shape # int → dense (H, W) + (20, 20) + >>> len(cm[[0, 2]]) # list → CompactMask + 2 + + ``` + """ + if isinstance(index, (int, np.integer)): + idx = int(index) + img_h, img_w = self._image_shape + result: npt.NDArray[np.bool_] = np.zeros((img_h, img_w), dtype=bool) + crop_h = int(self._crop_shapes[idx, 0]) + crop_w = int(self._crop_shapes[idx, 1]) + x1 = int(self._offsets[idx, 0]) + y1 = int(self._offsets[idx, 1]) + crop = _rle_decode(self._rles[idx], crop_h, crop_w) + result[y1 : y1 + crop_h, x1 : x1 + crop_w] = crop + return result + + # Slice: use direct Python list slice and numpy view — O(k), no arange. + if isinstance(index, slice): + return CompactMask( + self._rles[index], + self._crop_shapes[index], + self._offsets[index], + self._image_shape, + ) + + # Boolean selectors and fancy index → convert to integer positions first. + if isinstance(index, np.ndarray) and index.dtype == bool: + idx_arr = np.where(index)[0] + elif isinstance(index, list) and all( + isinstance(item, (bool, np.bool_)) for item in index + ): + idx_arr = np.flatnonzero(np.asarray(index, dtype=bool)) + else: + idx_arr = np.asarray(list(index), dtype=np.intp) + + new_rles = [self._rles[int(mask_idx)] for mask_idx in idx_arr] + new_crop_shapes: npt.NDArray[np.int32] = self._crop_shapes[idx_arr] + new_offsets: npt.NDArray[np.int32] = self._offsets[idx_arr] + return CompactMask(new_rles, new_crop_shapes, new_offsets, self._image_shape) + + def __array__(self, dtype: np.dtype[Any] | None = None) -> npt.NDArray[Any]: + """NumPy interop: materialise as a dense ``(N, H, W)`` array. + + Called by ``np.asarray(compact_mask)`` and similar NumPy functions. + + Args: + dtype: Optional dtype to cast the result to. + + Returns: + Dense boolean array of shape ``(N, H, W)``. 
+
+        Examples:
+            ```pycon
+            >>> import numpy as np
+            >>> from supervision.detection.compact_mask import CompactMask
+            >>> masks = np.zeros((1, 10, 10), dtype=bool)
+            >>> xyxy = np.array([[0, 0, 5, 5]], dtype=np.float32)
+            >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10))
+            >>> np.asarray(cm).shape
+            (1, 10, 10)
+
+            ```
+        """
+        result = self.to_dense()
+        if dtype is not None:
+            return result.astype(dtype)
+        return result
+
+    def __eq__(self, other: object) -> bool:
+        """Whole-collection equality with another :class:`CompactMask` or ndarray.
+
+        Args:
+            other: Another :class:`CompactMask` or ``np.ndarray``.
+
+        Returns:
+            A single ``bool``: ``True`` if all masks are pixel-identical.
+
+        Examples:
+            ```pycon
+            >>> import numpy as np
+            >>> from supervision.detection.compact_mask import CompactMask
+            >>> masks = np.zeros((1, 10, 10), dtype=bool)
+            >>> xyxy = np.array([[0, 0, 5, 5]], dtype=np.float32)
+            >>> cm1 = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10))
+            >>> cm2 = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10))
+            >>> cm1 == cm2
+            True
+
+            ```
+        """
+        if isinstance(other, CompactMask):
+            return bool(np.array_equal(self.to_dense(), other.to_dense()))
+        if isinstance(other, np.ndarray):
+            return bool(np.array_equal(self.to_dense(), other))
+        return NotImplemented
+
+    # ------------------------------------------------------------------
+    # Collection utilities
+    # ------------------------------------------------------------------
+
+    @staticmethod
+    def merge(masks_list: list[CompactMask]) -> CompactMask:
+        """Concatenate multiple :class:`CompactMask` objects into one.
+
+        All inputs must have the same ``image_shape``.
+
+        Args:
+            masks_list: Non-empty list of :class:`CompactMask` objects.
+
+        Returns:
+            A new :class:`CompactMask` containing every mask from the inputs,
+            in order.
+
+        Raises:
+            ValueError: If ``masks_list`` is empty or image shapes differ.
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks1 = np.zeros((2, 50, 50), dtype=bool) + >>> masks2 = np.zeros((3, 50, 50), dtype=bool) + >>> xyxy1 = np.array([[0,0,10,10],[10,10,20,20]], dtype=np.float32) + >>> xyxy2 = np.array( + ... [[0,0,5,5],[5,5,10,10],[10,10,15,15]], dtype=np.float32) + >>> cm1 = CompactMask.from_dense(masks1, xyxy1, image_shape=(50, 50)) + >>> cm2 = CompactMask.from_dense(masks2, xyxy2, image_shape=(50, 50)) + >>> len(CompactMask.merge([cm1, cm2])) + 5 + + ``` + """ + if not masks_list: + raise ValueError("Cannot merge an empty list of CompactMask objects.") + + image_shape = masks_list[0]._image_shape + for cm in masks_list[1:]: + if cm._image_shape != image_shape: + raise ValueError( + f"Cannot merge CompactMask objects with different image shapes: " + f"{image_shape} vs {cm._image_shape}" + ) + + # list.extend is a C-level call and avoids the per-element Python + # bytecode overhead of a flat list comprehension. This matters under + # GIL contention when multiple threads call merge concurrently. + new_rles: list[npt.NDArray[np.int32]] = [] + for cm in masks_list: + new_rles.extend(cm._rles) + + # np.concatenate handles (0, 2) arrays correctly. + # No .astype() needed — _crop_shapes and _offsets are already int32. + new_crop_shapes: npt.NDArray[np.int32] = np.concatenate( + [cm._crop_shapes for cm in masks_list], axis=0 + ) + new_offsets: npt.NDArray[np.int32] = np.concatenate( + [cm._offsets for cm in masks_list], axis=0 + ) + + return CompactMask(new_rles, new_crop_shapes, new_offsets, image_shape) + + def repack(self) -> CompactMask: + """Re-encode all masks using tight bounding boxes. + + When the original ``xyxy`` boxes are padded or loose — common with + object-detector outputs and full-image boxes used in tests — each RLE + crop encodes more background (``False``) pixels than necessary. 
This + method decodes every crop, trims it to the minimal rectangle that + contains all ``True`` pixels, and re-encodes. All-``False`` masks are + normalised to a ``1x1`` all-``False`` crop. + + The call is O(sum of crop areas) — suitable as a one-time cleanup + after accumulating many merges (e.g. after + :class:`~supervision.detection.tools.inference_slicer.InferenceSlicer` + tiles are merged). + + Returns: + A new :class:`CompactMask` with minimal-area crops and updated + offsets. + + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 10, 10), dtype=bool) + >>> masks[0, 3:7, 3:7] = True + >>> # Deliberately loose bbox: covers the full image. + >>> xyxy = np.array([[0, 0, 9, 9]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10)) + >>> repacked = cm.repack() + >>> repacked.offsets.tolist() # tight origin: x1=3, y1=3 + [[3, 3]] + + ``` + """ + num_masks = len(self._rles) + if num_masks == 0: + return CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + self._image_shape, + ) + + new_rles: list[npt.NDArray[np.int32]] = [] + new_crop_shapes_list: list[tuple[int, int]] = [] + new_offsets_list: list[tuple[int, int]] = [] + + for mask_idx in range(num_masks): + crop = self.crop(mask_idx) + x1_off = int(self._offsets[mask_idx, 0]) + y1_off = int(self._offsets[mask_idx, 1]) + + rows_any = np.any(crop, axis=1) + cols_any = np.any(crop, axis=0) + + if not rows_any.any(): + # All-False: normalise to 1x1 to avoid zero-sized arrays. 
+ new_rles.append(_rle_encode(np.zeros((1, 1), dtype=bool))) + new_crop_shapes_list.append((1, 1)) + new_offsets_list.append((x1_off, y1_off)) + continue + + y_indices = np.where(rows_any)[0] + x_indices = np.where(cols_any)[0] + y_min, y_max = int(y_indices[0]), int(y_indices[-1]) + x_min, x_max = int(x_indices[0]), int(x_indices[-1]) + + tight = crop[y_min : y_max + 1, x_min : x_max + 1] + new_rles.append(_rle_encode(tight)) + new_crop_shapes_list.append((y_max - y_min + 1, x_max - x_min + 1)) + new_offsets_list.append((x1_off + x_min, y1_off + y_min)) + + return CompactMask( + new_rles, + np.array(new_crop_shapes_list, dtype=np.int32), + np.array(new_offsets_list, dtype=np.int32), + self._image_shape, + ) + + # ------------------------------------------------------------------ + # Slicer support + # ------------------------------------------------------------------ + + def with_offset( + self, + dx: int, + dy: int, + new_image_shape: tuple[int, int], + ) -> CompactMask: + """Return a new :class:`CompactMask` with adjusted offsets and image shape. + + Used by :class:`~supervision.detection.tools.inference_slicer.InferenceSlicer` + to relocate tile-local masks into full-image coordinates without + materialising the dense ``(N, H, W)`` array. + + Args: + dx: Pixels to add to every mask's ``x1`` offset. + dy: Pixels to add to every mask's ``y1`` offset. + new_image_shape: ``(H, W)`` of the full (destination) image. + + Returns: + New :class:`CompactMask` with updated offsets and image shape. + Crops are clipped to stay inside ``new_image_shape``; masks fully + outside are represented as ``1x1`` all-False crops. 
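`with_offset` chooses between its fast and slow paths with a single vectorised bounds test. A standalone sketch of that test under assumed tile values (all numbers hypothetical):

```python
import numpy as np

new_h, new_w = 400, 400          # destination canvas (H, W)
dx, dy = 100, 200                # tile origin within the canvas
offsets = np.array([[5, 5], [390, 5]], dtype=np.int32)        # per-mask (x1, y1)
crop_shapes = np.array([[10, 10], [10, 50]], dtype=np.int32)  # per-mask (h, w)

# Shift every crop origin, then compute inclusive max corners.
new_offsets = offsets + np.array([dx, dy], dtype=np.int32)
x1s, y1s = new_offsets[:, 0], new_offsets[:, 1]
x2s = x1s + crop_shapes[:, 1] - 1
y2s = y1s + crop_shapes[:, 0] - 1

# A mask needs decode+clip+re-encode only if it crosses a canvas edge.
needs_clip = (x1s < 0) | (y1s < 0) | (x2s >= new_w) | (y2s >= new_h)
assert needs_clip.tolist() == [False, True]
```

Only masks where `needs_clip` is `True` pay the O(crop_area) decode/re-encode cost; the rest are handled with pure offset arithmetic, never touching the RLE data.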
+ + Examples: + ```pycon + >>> import numpy as np + >>> from supervision.detection.compact_mask import CompactMask + >>> masks = np.zeros((1, 20, 20), dtype=bool) + >>> xyxy = np.array([[5, 5, 15, 15]], dtype=np.float32) + >>> cm = CompactMask.from_dense(masks, xyxy, image_shape=(20, 20)) + >>> cm2 = cm.with_offset(100, 200, new_image_shape=(400, 400)) + >>> cm2.offsets[0].tolist() + [105, 205] + + ``` + """ + new_h, new_w = new_image_shape + if new_h <= 0 or new_w <= 0: + raise ValueError("new_image_shape must contain positive dimensions") + + num_masks = len(self) + if num_masks == 0: + return CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + new_image_shape, + ) + + # Vectorised bounds check: compute every new [x1,y1,x2,y2] at once. + # For the common case (InferenceSlicer tiles that fit fully inside the + # new canvas) this catches the "no clipping needed" path in O(N) numpy + # without touching any RLE data. + new_offsets: npt.NDArray[np.int32] = self._offsets + np.array( + [dx, dy], dtype=np.int32 + ) + x1s = new_offsets[:, 0] + y1s = new_offsets[:, 1] + x2s = x1s + self._crop_shapes[:, 1] - 1 + y2s = y1s + self._crop_shapes[:, 0] - 1 + + needs_clip: npt.NDArray[np.bool_] = ( + (x1s < 0) | (y1s < 0) | (x2s >= new_w) | (y2s >= new_h) + ) + + if not needs_clip.any(): + # Fast path: pure offset arithmetic, no decode/re-encode needed. + return CompactMask( + list(self._rles), + self._crop_shapes.copy(), + new_offsets, + new_image_shape, + ) + + # Slow path: only decode+clip+re-encode the masks that actually overflow. 
+ out_rles: list[npt.NDArray[np.int32]] = [] + out_crop_shapes: list[tuple[int, int]] = [] + out_offsets_list: list[tuple[int, int]] = [] + + for mask_idx in range(num_masks): + x1 = int(x1s[mask_idx]) + y1 = int(y1s[mask_idx]) + x2 = int(x2s[mask_idx]) + y2 = int(y2s[mask_idx]) + + if not needs_clip[mask_idx]: + out_rles.append(self._rles[mask_idx]) + out_crop_shapes.append( + ( + int(self._crop_shapes[mask_idx, 0]), + int(self._crop_shapes[mask_idx, 1]), + ) + ) + out_offsets_list.append((x1, y1)) + continue + + ix1 = max(0, x1) + iy1 = max(0, y1) + ix2 = min(new_w - 1, x2) + iy2 = min(new_h - 1, y2) + + if ix1 > ix2 or iy1 > iy2: + anchor_x = min(max(x1, 0), new_w - 1) + anchor_y = min(max(y1, 0), new_h - 1) + out_rles.append(_rle_encode(np.zeros((1, 1), dtype=bool))) + out_crop_shapes.append((1, 1)) + out_offsets_list.append((anchor_x, anchor_y)) + continue + + crop = self.crop(mask_idx) + clipped = crop[iy1 - y1 : iy2 - y1 + 1, ix1 - x1 : ix2 - x1 + 1] + out_rles.append(_rle_encode(clipped)) + out_crop_shapes.append((iy2 - iy1 + 1, ix2 - ix1 + 1)) + out_offsets_list.append((ix1, iy1)) + + return CompactMask( + out_rles, + np.array(out_crop_shapes, dtype=np.int32), + np.array(out_offsets_list, dtype=np.int32), + new_image_shape, + ) diff --git a/src/supervision/detection/core.py b/src/supervision/detection/core.py index baabc56854..3798ee547b 100644 --- a/src/supervision/detection/core.py +++ b/src/supervision/detection/core.py @@ -4,7 +4,7 @@ from dataclasses import dataclass, field from enum import Enum from functools import reduce -from typing import Any, cast +from typing import TYPE_CHECKING, Any, cast import numpy as np import numpy.typing as npt @@ -59,6 +59,9 @@ from supervision.utils.internal import deprecated, get_instance_variables from supervision.validators import validate_detections_fields, validate_resolution +if TYPE_CHECKING: + from supervision.detection.compact_mask import CompactMask + @dataclass class Detections: @@ -133,7 +136,8 @@ class 
simplifies data manipulation and filtering, providing a uniform API for xyxy: An array of shape `(n, 4)` containing the bounding boxes coordinates in format `[x1, y1, x2, y2]` mask: An array of shape `(n, H, W)` containing the segmentation masks - (`bool` data type), or `None` when masks are not available. + (`bool` data type), or `None` when masks are not available, or as + :class:`~supervision.detection.compact_mask.CompactMask`. confidence: An array of shape `(n,)` containing the confidence scores of the detections, or `None` when confidence values are not available. class_id: An array of shape `(n,)` containing the class ids of the @@ -149,7 +153,7 @@ class simplifies data manipulation and filtering, providing a uniform API for """ # noqa: E501 // docs xyxy: npt.NDArray[np.generic] - mask: npt.NDArray[np.generic] | None = None + mask: npt.NDArray[np.generic] | CompactMask | None = None confidence: npt.NDArray[np.generic] | None = None class_id: npt.NDArray[np.generic] | None = None tracker_id: npt.NDArray[np.generic] | None = None @@ -2073,6 +2077,11 @@ def is_empty(self) -> bool: """ Returns `True` if the `Detections` object is considered empty. """ + # Fast path: avoids __eq__ which calls np.array_equal(to_dense(), ...) + # and would materialise the entire (N, H, W) CompactMask to a dense + # array just to check emptiness — O(N·H·W) for an O(1) check. 
+ if len(self.xyxy) > 0: + return False empty_detections = Detections.empty() empty_detections.data = self.data empty_detections.metadata = self.metadata @@ -2150,16 +2159,22 @@ def merge(cls, detections_list: list[Detections]) -> Detections: xyxy = np.vstack([d.xyxy for d in detections_list]) - def stack_or_none(name: str) -> npt.NDArray[np.generic] | None: + def stack_or_none( + name: str, + ) -> npt.NDArray[np.generic] | CompactMask | None: if all(d.__getattribute__(name) is None for d in detections_list): return None if any(d.__getattribute__(name) is None for d in detections_list): raise ValueError(f"All or none of the '{name}' fields must be None") - return ( - np.vstack([d.__getattribute__(name) for d in detections_list]) - if name == "mask" - else np.hstack([d.__getattribute__(name) for d in detections_list]) - ) + if name == "mask": + from supervision.detection.compact_mask import CompactMask + + masks = [d.__getattribute__(name) for d in detections_list] + if all(isinstance(m, CompactMask) for m in masks): + return CompactMask.merge(masks) + # Mixed or all-ndarray: __array__ auto-converts any CompactMask. + return np.vstack([np.asarray(m) for m in masks]) + return np.hstack([d.__getattribute__(name) for d in detections_list]) mask = stack_or_none("mask") confidence = stack_or_none("confidence") @@ -2281,7 +2296,7 @@ def __getitem__( """ if isinstance(index, str): return self.data.get(index) - if self.is_empty(): + if len(self) == 0: return self if isinstance(index, int): index = [index] @@ -2343,6 +2358,10 @@ def area(self) -> npt.NDArray[np.generic]: where n is the number of detections. 
""" if self.mask is not None: + from supervision.detection.compact_mask import CompactMask + + if isinstance(self.mask, CompactMask): + return self.mask.area return np.array([np.sum(mask) for mask in self.mask]) else: return self.box_area diff --git a/src/supervision/detection/tools/inference_slicer.py b/src/supervision/detection/tools/inference_slicer.py index 4e0fcbf87e..79927641fd 100644 --- a/src/supervision/detection/tools/inference_slicer.py +++ b/src/supervision/detection/tools/inference_slicer.py @@ -45,9 +45,19 @@ def move_detections( "Resolution width and height are required for moving segmentation " "detections. This should be the same as (width, height) of image shape." ) - detections.mask = move_masks( - masks=detections.mask, offset=offset, resolution_wh=resolution_wh - ) + from supervision.detection.compact_mask import CompactMask + + if isinstance(detections.mask, CompactMask): + # Preserve move_masks clipping semantics without dense materialisation. + detections.mask = detections.mask.with_offset( + dx=int(offset[0]), + dy=int(offset[1]), + new_image_shape=(resolution_wh[1], resolution_wh[0]), + ) + else: + detections.mask = move_masks( + masks=detections.mask, offset=offset, resolution_wh=resolution_wh + ) return detections @@ -74,6 +84,15 @@ class InferenceSlicer: iou_threshold: IOU threshold used in merging overlap filtering. overlap_metric: Metric to compute overlap (`IOU` or `IOS`). thread_workers: Number of threads for concurrent slice inference. + compact_masks: If ``True``, dense ``(N, H, W)`` boolean mask + arrays returned by the callback are immediately converted to a + :class:`~supervision.detection.compact_mask.CompactMask`. This + keeps masks in run-length-encoded form for the entire pipeline — + merge, NMS, and annotation — avoiding the large ``(N, H, W)`` + allocations that cause OOM on high-resolution images with many + objects. 
IoU and NMS are computed directly on the RLE crops + without ever materialising a full ``(N, H, W)`` array. + Defaults to ``False`` for backward compatibility. Raises: ValueError: If `slice_wh` or `overlap_wh` are invalid or inconsistent. @@ -122,6 +141,7 @@ def __init__( iou_threshold: float = 0.5, overlap_metric: OverlapMetric | str = OverlapMetric.IOU, thread_workers: int = 1, + compact_masks: bool = False, ): slice_wh_norm = self._normalize_slice_wh(slice_wh) overlap_wh_norm = self._normalize_overlap_wh(overlap_wh) @@ -135,6 +155,7 @@ def __init__( self.overlap_filter = OverlapFilter.from_value(overlap_filter) self.callback: Callable[[ImageType], Detections] = callback self.thread_workers = thread_workers + self.compact_masks = compact_masks def __call__(self, image: ImageType) -> Detections: """ @@ -196,8 +217,22 @@ def _run_callback(self, image: ImageType, offset: npt.NDArray[Any]) -> Detection """ image_slice = crop_image(image=image, xyxy=offset) detections = self.callback(image_slice) - resolution_wh = get_image_resolution_wh(image) + if ( + self.compact_masks + and detections.mask is not None + and isinstance(detections.mask, np.ndarray) + ): + from supervision.detection.compact_mask import CompactMask + + slice_w, slice_h = get_image_resolution_wh(image_slice) + detections.mask = CompactMask.from_dense( + detections.mask, + detections.xyxy, + image_shape=(slice_h, slice_w), + ) + + resolution_wh = get_image_resolution_wh(image) detections = move_detections( detections=detections, offset=offset[:2], diff --git a/src/supervision/detection/utils/iou_and_nms.py b/src/supervision/detection/utils/iou_and_nms.py index d7c04f5f1d..8b37108320 100644 --- a/src/supervision/detection/utils/iou_and_nms.py +++ b/src/supervision/detection/utils/iou_and_nms.py @@ -30,7 +30,7 @@ class OverlapFilter(Enum): @classmethod def list(cls) -> list[str]: - return list(map(lambda c: c.value, cls)) + return list(map(lambda member: member.value, cls)) @classmethod def 
from_value(cls, value: OverlapFilter | str) -> OverlapFilter: @@ -66,7 +66,7 @@ class OverlapMetric(Enum): @classmethod def list(cls) -> list[str]: - return list(map(lambda c: c.value, cls)) + return list(map(lambda member: member.value, cls)) @classmethod def from_value(cls, value: OverlapMetric | str) -> OverlapMetric: @@ -351,9 +351,9 @@ def box_iou_batch_with_jaccard( ious: npt.NDArray[np.float64] = np.zeros( (len(boxes_detection), len(boxes_true)), dtype=np.float64 ) - for g_idx, g in enumerate(boxes_true): - for d_idx, d in enumerate(boxes_detection): - ious[d_idx, g_idx] = _jaccard(d, g, is_crowd[g_idx]) + for gt_idx, gt_box in enumerate(boxes_true): + for det_idx, det_box in enumerate(boxes_detection): + ious[det_idx, gt_idx] = _jaccard(det_box, gt_box, is_crowd[gt_idx]) return ious @@ -385,19 +385,124 @@ def oriented_box_iou_batch( max_width = int(max(boxes_true[:, :, 1].max(), boxes_detection[:, :, 1].max()) + 1) mask_true = np.zeros((boxes_true.shape[0], max_height, max_width), dtype=np.uint8) - for i, box_true in enumerate(boxes_true): - mask_true[i] = polygon_to_mask(box_true, (max_width, max_height)) + for box_idx, box_true in enumerate(boxes_true): + mask_true[box_idx] = polygon_to_mask(box_true, (max_width, max_height)) mask_detection = np.zeros( (boxes_detection.shape[0], max_height, max_width), dtype=np.uint8 ) - for i, box_detection in enumerate(boxes_detection): - mask_detection[i] = polygon_to_mask(box_detection, (max_width, max_height)) + for box_idx, box_detection in enumerate(boxes_detection): + mask_detection[box_idx] = polygon_to_mask( + box_detection, (max_width, max_height) + ) ious = mask_iou_batch(mask_true, mask_detection) return ious +def compact_mask_iou_batch( + masks_true: Any, + masks_detection: Any, + overlap_metric: OverlapMetric = OverlapMetric.IOU, +) -> npt.NDArray[np.floating]: + """Compute pairwise overlap between two :class:`CompactMask` collections. + + Avoids materialising full ``(N, H, W)`` arrays by: + + 1. 
Vectorised bounding-box pre-filter — pairs whose boxes do not overlap + get IoU = 0 without any mask decoding. + 2. Sub-crop decoding — for overlapping pairs, only the intersection region + of each crop is decoded and compared. + 3. Crop caching — each individual crop is decoded at most once even when it + participates in many pairs. + + The result is numerically identical to running the dense + :func:`mask_iou_batch` on ``np.asarray(masks_true)`` / + ``np.asarray(masks_detection)``. + + Args: + masks_true: :class:`~supervision.detection.compact_mask.CompactMask` + holding the ground-truth masks. + masks_detection: :class:`~supervision.detection.compact_mask.CompactMask` + holding the detection masks. + overlap_metric: :class:`OverlapMetric` — ``IOU`` or ``IOS``. + + Returns: + Float array of shape ``(N1, N2)`` with pairwise overlap values. + """ + n1: int = len(masks_true) + n2: int = len(masks_detection) + result: npt.NDArray[np.floating] = np.zeros((n1, n2), dtype=float) + + if n1 == 0 or n2 == 0: + return result + + areas_a: npt.NDArray[np.int64] = masks_true.area + areas_b: npt.NDArray[np.int64] = masks_detection.area + + # Inclusive per-mask bounding boxes obtained from public accessors. + # bbox_xyxy: (N, 4) → (x1, y1, x2, y2) + bboxes_a: npt.NDArray[np.int32] = masks_true.bbox_xyxy.astype(np.int32) + x1a: npt.NDArray[np.int32] = bboxes_a[:, 0] + y1a: npt.NDArray[np.int32] = bboxes_a[:, 1] + x2a: npt.NDArray[np.int32] = bboxes_a[:, 2] + y2a: npt.NDArray[np.int32] = bboxes_a[:, 3] + + bboxes_b: npt.NDArray[np.int32] = masks_detection.bbox_xyxy.astype(np.int32) + x1b: npt.NDArray[np.int32] = bboxes_b[:, 0] + y1b: npt.NDArray[np.int32] = bboxes_b[:, 1] + x2b: npt.NDArray[np.int32] = bboxes_b[:, 2] + y2b: npt.NDArray[np.int32] = bboxes_b[:, 3] + + # Pairwise intersection bounding box — shape (N1, N2). 
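That pairwise pre-filter reduces to broadcasting max/min over inclusive box corners; a standalone sketch of the same pattern on toy boxes:

```python
import numpy as np

def bbox_overlap_matrix(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """(N1, N2) boolean matrix: True where inclusive (x1, y1, x2, y2) boxes intersect."""
    ix1 = np.maximum(boxes_a[:, 0][:, None], boxes_b[:, 0][None, :])
    iy1 = np.maximum(boxes_a[:, 1][:, None], boxes_b[:, 1][None, :])
    ix2 = np.minimum(boxes_a[:, 2][:, None], boxes_b[:, 2][None, :])
    iy2 = np.minimum(boxes_a[:, 3][:, None], boxes_b[:, 3][None, :])
    return (ix1 <= ix2) & (iy1 <= iy2)

boxes_a = np.array([[0, 0, 4, 4], [10, 10, 12, 12]])
boxes_b = np.array([[3, 3, 6, 6], [20, 20, 22, 22]])
```

Only the first pair intersects; every other pair is rejected without decoding a single mask crop.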
+ ix1: npt.NDArray[np.int32] = np.maximum(x1a[:, None], x1b[None, :]) + iy1: npt.NDArray[np.int32] = np.maximum(y1a[:, None], y1b[None, :]) + ix2: npt.NDArray[np.int32] = np.minimum(x2a[:, None], x2b[None, :]) + iy2: npt.NDArray[np.int32] = np.minimum(y2a[:, None], y2b[None, :]) + bbox_overlap: npt.NDArray[np.bool_] = (ix1 <= ix2) & (iy1 <= iy2) + + # Decode each crop at most once, even if it participates in many pairs. + crops_a: dict[int, npt.NDArray[np.bool_]] = {} + crops_b: dict[int, npt.NDArray[np.bool_]] = {} + + for idx_pair in np.argwhere(bbox_overlap): + idx_a, idx_b = int(idx_pair[0]), int(idx_pair[1]) + + if idx_a not in crops_a: + crops_a[idx_a] = masks_true.crop(idx_a) + if idx_b not in crops_b: + crops_b[idx_b] = masks_detection.crop(idx_b) + + lx1 = int(ix1[idx_a, idx_b]) + ly1 = int(iy1[idx_a, idx_b]) + lx2 = int(ix2[idx_a, idx_b]) + ly2 = int(iy2[idx_a, idx_b]) + + ox_a, oy_a = int(x1a[idx_a]), int(y1a[idx_a]) + sub_a = crops_a[idx_a][ly1 - oy_a : ly2 - oy_a + 1, lx1 - ox_a : lx2 - ox_a + 1] + + ox_b, oy_b = int(x1b[idx_b]), int(y1b[idx_b]) + sub_b = crops_b[idx_b][ly1 - oy_b : ly2 - oy_b + 1, lx1 - ox_b : lx2 - ox_b + 1] + + inter = int(np.logical_and(sub_a, sub_b).sum()) + area_a_i = int(areas_a[idx_a]) + area_b_j = int(areas_b[idx_b]) + + if overlap_metric == OverlapMetric.IOU: + union = area_a_i + area_b_j - inter + result[idx_a, idx_b] = inter / union if union > 0 else 0.0 + elif overlap_metric == OverlapMetric.IOS: + small = min(area_a_i, area_b_j) + result[idx_a, idx_b] = inter / small if small > 0 else 0.0 + else: + raise ValueError( + f"overlap_metric {overlap_metric} is not supported, " + "only 'IOU' and 'IOS' are supported" + ) + + return result + + def _mask_iou_batch_split( masks_true: npt.NDArray[Any], masks_detection: npt.NDArray[Any], @@ -461,16 +566,34 @@ def mask_iou_batch( Compute Intersection over Union (IoU) of two sets of masks - `masks_true` and `masks_detection`. 
+ Accepts both dense ``(N, H, W)`` boolean arrays and + :class:`~supervision.detection.compact_mask.CompactMask` objects. + When both inputs are :class:`~supervision.detection.compact_mask.CompactMask`, + the computation uses :func:`compact_mask_iou_batch` to avoid materialising + full ``(N, H, W)`` arrays. + Args: masks_true: 3D `np.ndarray` representing ground-truth masks. masks_detection: 3D `np.ndarray` representing detection masks. overlap_metric: Metric used to compute the degree of overlap between pairs of masks (e.g., IoU, IoS). memory_limit: Memory limit in MB, default is 1024 * 5 MB (5GB). + Ignored when both inputs are CompactMask. Returns: Pairwise IoU of masks from `masks_true` and `masks_detection`. """ + from supervision.detection.compact_mask import CompactMask + + if isinstance(masks_true, CompactMask) and isinstance(masks_detection, CompactMask): + return compact_mask_iou_batch(masks_true, masks_detection, overlap_metric) + + # Materialise any CompactMask that was passed alongside a dense array. + if isinstance(masks_true, CompactMask): + masks_true = np.asarray(masks_true) + if isinstance(masks_detection, CompactMask): + masks_detection = np.asarray(masks_detection) + memory = ( masks_true.shape[0] * masks_true.shape[1] @@ -494,10 +617,12 @@ def mask_iou_batch( ), 1, ) - for i in range(0, masks_true.shape[0], step): + for chunk_start in range(0, masks_true.shape[0], step): ious.append( _mask_iou_batch_split( - masks_true[i : i + step], masks_detection, overlap_metric + masks_true[chunk_start : chunk_start + step], + masks_detection, + overlap_metric, ) ) @@ -514,6 +639,11 @@ def mask_non_max_suppression( """ Perform Non-Maximum Suppression (NMS) on segmentation predictions. + IoU is computed exactly on the full-resolution masks for both dense and + :class:`~supervision.detection.compact_mask.CompactMask` inputs. 
The + ``mask_dimension`` parameter is kept for backward compatibility but is no + longer used — dense masks are **not** resized before IoU computation. + Args: predictions: A 2D array of object detection predictions in the format of `(x_min, y_min, x_max, y_max, score)` @@ -526,8 +656,8 @@ def mask_non_max_suppression( to use for non-maximum suppression. overlap_metric: Metric used to compute the degree of overlap between pairs of masks (e.g., IoU, IoS). - mask_dimension: The dimension to which the masks should be - resized before computing IOU values. Defaults to 640. + mask_dimension: Deprecated, no longer used. Kept for backward + compatibility. Returns: A boolean array indicating which predictions to keep after @@ -549,15 +679,19 @@ def mask_non_max_suppression( sort_index = predictions[:, 4].argsort()[::-1] predictions = predictions[sort_index] masks = masks[sort_index] - masks_resized = resize_masks(masks, mask_dimension) - ious = mask_iou_batch(masks_resized, masks_resized, overlap_metric) + + ious = mask_iou_batch(masks, masks, overlap_metric) categories = predictions[:, 5] keep = np.ones(rows, dtype=bool) - for i in range(rows): - if keep[i]: - condition = (ious[i] > iou_threshold) & (categories[i] == categories) - keep[i + 1 :] = np.where(condition[i + 1 :], False, keep[i + 1 :]) + for row_idx in range(rows): + if keep[row_idx]: + condition = (ious[row_idx] > iou_threshold) & ( + categories[row_idx] == categories + ) + keep[row_idx + 1 :] = np.where( + condition[row_idx + 1 :], False, keep[row_idx + 1 :] + ) return cast(npt.NDArray[np.bool_], keep[sort_index.argsort()]) @@ -712,7 +846,20 @@ def mask_non_max_merge( AssertionError: If `iou_threshold` is not within the closed range from `0` to `1`. """ - masks_resized = resize_masks(masks, mask_dimension) + from supervision.detection.compact_mask import CompactMask + + if isinstance(masks, CompactMask): + # _group_overlapping_masks needs dense arrays for logical_or union merging. 
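The greedy, class-aware suppression loop in `mask_non_max_suppression` can be exercised on a precomputed IoU matrix; a minimal standalone sketch of the same keep/suppress logic:

```python
import numpy as np

def greedy_nms(ious, categories, scores, iou_threshold=0.5):
    """Greedy NMS on a pairwise IoU matrix; suppression only applies within a class."""
    order = scores.argsort()[::-1]          # process highest score first
    ious = ious[order][:, order]
    categories = categories[order]
    keep = np.ones(len(categories), dtype=bool)
    for i in range(len(categories)):
        if keep[i]:
            condition = (ious[i] > iou_threshold) & (categories[i] == categories)
            keep[i + 1:] = np.where(condition[i + 1:], False, keep[i + 1:])
    return keep[order.argsort()]            # restore the caller's ordering

# Detections 0 and 1 overlap heavily within the same class; 2 is another class.
ious = np.array([[1.0, 0.8, 0.0],
                 [0.8, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
categories = np.array([0, 0, 1])
scores = np.array([0.9, 0.8, 0.7])
```

The lower-scored duplicate is suppressed; the other-class detection survives regardless of overlap.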
+ # Note: np.asarray(masks) first materialises a full-resolution (N, H, W) + # dense array before downscaling with resize_masks. This reduces the size + # of the array used for overlap computation but does not avoid the initial + # full-frame materialisation, which may still be memory-intensive for very + # large images or object counts. + masks = resize_masks(np.asarray(masks), mask_dimension) + else: + masks = resize_masks(masks, mask_dimension) + masks_resized = masks + if predictions.shape[1] == 5: return _group_overlapping_masks( predictions, masks_resized, iou_threshold, overlap_metric diff --git a/src/supervision/detection/utils/masks.py b/src/supervision/detection/utils/masks.py index a618556ed0..018cbd4948 100644 --- a/src/supervision/detection/utils/masks.py +++ b/src/supervision/detection/utils/masks.py @@ -1,11 +1,14 @@ from __future__ import annotations -from typing import Any, Literal, cast +from typing import TYPE_CHECKING, Any, Literal, cast import cv2 import numpy as np import numpy.typing as npt +if TYPE_CHECKING: + from supervision.detection.compact_mask import CompactMask + def move_masks( masks: npt.NDArray[np.bool_], @@ -86,7 +89,7 @@ def move_masks( def calculate_masks_centroids( - masks: npt.NDArray[Any], + masks: npt.NDArray[Any] | CompactMask, ) -> npt.NDArray[np.int_]: """ Calculate the centroids of binary masks in a tensor. @@ -94,11 +97,38 @@ def calculate_masks_centroids( Args: masks: A 3D NumPy array of shape (num_masks, height, width). Each 2D array in the tensor represents a binary mask. + Also accepts a :class:`~supervision.detection.compact_mask.CompactMask`. Returns: A 2D NumPy array of shape (num_masks, 2), where each row contains the x and y coordinates (in that order) of the centroid of the corresponding mask. """ + from supervision.detection.compact_mask import CompactMask + + if isinstance(masks, CompactMask): + # Compute centroids per-crop to avoid materialising the full (N, H, W) array. 
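The per-crop centroid branch that follows works because the centroid is translation-equivariant: computing it in crop coordinates and adding the `(x1, y1)` offset gives the same result as computing it on the full canvas. A quick check of that identity with the same `+0.5` pixel-centre convention:

```python
import numpy as np

full = np.zeros((12, 16), dtype=bool)
full[3:7, 5:11] = True                     # 4x6 object block
x1, y1 = 5, 3                              # bbox top-left
crop = full[3:7, 5:11]

# Full-canvas centroid with pixel centres at +0.5.
rows, cols = np.indices(full.shape)
total = full.sum()
cx_full = ((cols + 0.5)[full]).sum() / total
cy_full = ((rows + 0.5)[full]).sum() / total

# Crop-local centroid plus the offset.
crop_rows, crop_cols = np.indices(crop.shape)
cx_crop = ((crop_cols + 0.5)[crop]).sum() / crop.sum() + x1
cy_crop = ((crop_rows + 0.5)[crop]).sum() / crop.sum() + y1
```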
+ n = len(masks) + if n == 0: + return cast(npt.NDArray[np.int_], np.empty((0, 2), dtype=int)) + + centroids: npt.NDArray[np.float64] = np.zeros((n, 2), dtype=np.float64) + for i in range(n): + crop = masks.crop(i) + crop_h, crop_w = crop.shape + x1 = int(masks.offsets[i, 0]) + y1 = int(masks.offsets[i, 1]) + total = int(crop.sum()) + if total == 0: + centroids[i] = [0.0, 0.0] + continue + # Match the +0.5 offset used by the dense implementation. + crop_rows, crop_cols = np.indices((crop_h, crop_w)) + cx = float(np.sum((crop_cols + 0.5)[crop])) / total + x1 + cy = float(np.sum((crop_rows + 0.5)[crop])) / total + y1 + centroids[i] = [cx, cy] + + return cast(npt.NDArray[np.int_], centroids.astype(int)) + _num_masks, height, width = masks.shape total_pixels = masks.sum(axis=(1, 2)) @@ -339,7 +369,7 @@ def filter_segments_by_distance( ``` - The nearby 2×2 block at columns 6–7 is kept because its edge distance + The nearby 2x2 block at columns 6-7 is kept because its edge distance is within 3 pixels. The distant block at columns 9-10 is removed. """ # noqa E501 // docs if mask.dtype != bool: diff --git a/src/supervision/metrics/utils/object_size.py b/src/supervision/metrics/utils/object_size.py index ad9f37b56f..84482580a0 100644 --- a/src/supervision/metrics/utils/object_size.py +++ b/src/supervision/metrics/utils/object_size.py @@ -10,6 +10,7 @@ from supervision.metrics.core import MetricTarget if TYPE_CHECKING: + from supervision.detection.compact_mask import CompactMask from supervision.detection.core import Detections SIZE_THRESHOLDS = (32**2, 96**2) @@ -122,12 +123,15 @@ def get_bbox_size_category(xyxy: npt.NDArray[np.float32]) -> npt.NDArray[np.int_ return result -def get_mask_size_category(mask: npt.NDArray[np.bool_]) -> npt.NDArray[np.int_]: +def get_mask_size_category( + mask: npt.NDArray[np.bool_] | CompactMask, +) -> npt.NDArray[np.int_]: """ Get the size category of detection masks. Args: - mask: The mask array shaped (N, H, W). 
+ mask: The mask array shaped (N, H, W), or a + :class:`~supervision.detection.compact_mask.CompactMask`. Returns: The size category of each mask, matching @@ -146,10 +150,14 @@ def get_mask_size_category(mask: npt.NDArray[np.bool_]) -> npt.NDArray[np.int_]: ``` """ - if len(mask.shape) != 3: - raise ValueError("Masks must be shaped (N, H, W)") - - areas = np.sum(mask, axis=(1, 2)) + from supervision.detection.compact_mask import CompactMask + + if isinstance(mask, CompactMask): + areas = mask.area + else: + if len(mask.shape) != 3: + raise ValueError("Masks must be shaped (N, H, W)") + areas = np.sum(mask, axis=(1, 2)) result = np.full(areas.shape, ObjectSizeCategory.ANY.value) SM, LG = SIZE_THRESHOLDS diff --git a/src/supervision/validators/__init__.py b/src/supervision/validators/__init__.py index 1ab5449d11..75e200e72b 100644 --- a/src/supervision/validators/__init__.py +++ b/src/supervision/validators/__init__.py @@ -27,6 +27,14 @@ def validate_mask(mask: Any, n: int) -> None: if mask is None: return + # Fast path: CompactMask only needs a length check. 
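For reference, the size bucketing in `get_mask_size_category` compares pixel areas against `SIZE_THRESHOLDS = (32**2, 96**2)`; with a `CompactMask`, the areas come straight from `.area`. A standalone sketch of the bucketing — the 1/2/3 encoding and the `<=` boundary handling here are illustrative assumptions, not the library's `ObjectSizeCategory` values:

```python
import numpy as np

SMALL_MAX, MEDIUM_MAX = 32**2, 96**2       # COCO-style area thresholds

def size_category(areas: np.ndarray) -> np.ndarray:
    """1 = small, 2 = medium, 3 = large (illustrative encoding)."""
    result = np.full(areas.shape, 3)
    result[areas <= MEDIUM_MAX] = 2
    result[areas <= SMALL_MAX] = 1
    return result

areas = np.array([100, 5000, 20000])       # small, medium, large
```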
+ from supervision.detection.compact_mask import CompactMask + + if isinstance(mask, CompactMask): + if len(mask) != n: + raise ValueError(f"mask must contain {n} masks, but got {len(mask)}") + return + expected_shape = f"({n}, H, W)" actual_shape = str(getattr(mask, "shape", None)) actual_dtype = getattr(mask, "dtype", None) diff --git a/tests/detection/test_compact_mask.py b/tests/detection/test_compact_mask.py new file mode 100644 index 0000000000..cb4e96730c --- /dev/null +++ b/tests/detection/test_compact_mask.py @@ -0,0 +1,936 @@ +"""Unit tests for CompactMask and its private RLE helpers.""" + +from __future__ import annotations + +from contextlib import ExitStack as DoesNotRaise + +import numpy as np +import pytest + +from supervision.detection.compact_mask import ( + CompactMask, + _rle_area, + _rle_decode, + _rle_encode, +) +from supervision.detection.utils.converters import mask_to_xyxy +from supervision.detection.utils.masks import ( + calculate_masks_centroids, + contains_holes, + contains_multiple_segments, + move_masks, +) + + +def _make_cm(masks: np.ndarray, image_shape: tuple[int, int]) -> CompactMask: + """Build a CompactMask whose crops equal the full bounding-box extents.""" + num_masks = len(masks) + img_h, img_w = image_shape + xyxy = np.tile(np.array([0, 0, img_w, img_h], dtype=np.float32), (num_masks, 1)) + return CompactMask.from_dense(masks, xyxy, image_shape=image_shape) + + +class TestRleHelpers: + """Tests for _rle_encode, _rle_decode, and _rle_area. + + Verifies that the private RLE encoding round-trips correctly for a range + of mask shapes (all-False, all-True, diagonal, L-shape, checkerboard, + single-pixel, and empty), and that _rle_area matches np.sum on the + original boolean array. 
+ """ + + @pytest.mark.parametrize( + ("mask_2d", "description"), + [ + (np.zeros((5, 5), dtype=bool), "all-False"), + (np.ones((5, 5), dtype=bool), "all-True"), + (np.eye(4, dtype=bool), "diagonal"), + ( + np.array([[True, True, False], [True, False, False]], dtype=bool), + "L-shape", + ), + ( + np.indices((4, 4)).sum(axis=0) % 2 == 0, + "checkerboard", + ), + (np.zeros((1, 1), dtype=bool), "single-pixel-False"), + (np.ones((1, 1), dtype=bool), "single-pixel-True"), + (np.zeros((0, 0), dtype=bool), "empty"), + ], + ) + def test_encode_decode_round_trip( + self, mask_2d: np.ndarray, description: str + ) -> None: + if mask_2d.size == 0: + rle = _rle_encode(mask_2d) + assert _rle_area(rle) == 0 + return + + rle = _rle_encode(mask_2d) + assert rle.dtype == np.int32, "RLE must be int32" + reconstructed = _rle_decode(rle, mask_2d.shape[0], mask_2d.shape[1]) + np.testing.assert_array_equal( + reconstructed, mask_2d, err_msg=f"Round-trip failed for: {description}" + ) + + @pytest.mark.parametrize( + "mask_2d", + [ + np.zeros((6, 6), dtype=bool), + np.ones((6, 6), dtype=bool), + np.eye(6, dtype=bool), + np.array([[True, False, True], [False, True, False]], dtype=bool), + ], + ) + def test_area_matches_numpy_sum(self, mask_2d: np.ndarray) -> None: + rle = _rle_encode(mask_2d) + assert _rle_area(rle) == int(np.sum(mask_2d)) + + +class TestFromDenseToDense: + """Tests for CompactMask.from_dense and to_dense. + + Verifies that the from_dense → to_dense round-trip is lossless when the + bounding boxes span the full image (no True pixels fall outside the crop). + Covers N=0 (empty), N=1 (single mask), and N=5 (several random masks). 
+ """ + + @pytest.mark.parametrize( + ("num_masks", "image_shape"), + [ + (0, (50, 50)), + (1, (50, 50)), + (5, (50, 50)), + ], + ) + def test_round_trip(self, num_masks: int, image_shape: tuple[int, int]) -> None: + rng = np.random.default_rng(42) + img_h, img_w = image_shape + masks = rng.integers(0, 2, size=(num_masks, img_h, img_w)).astype(bool) + cm = _make_cm(masks, image_shape) + np.testing.assert_array_equal(cm.to_dense(), masks) + + def test_round_trip_with_mask_to_xyxy(self) -> None: + """Round-trip must be lossless with inclusive xyxy from mask_to_xyxy.""" + img_h, img_w = 12, 14 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 3:7, 4:9] = True # non-full-image object + + xyxy = mask_to_xyxy(masks).astype(np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + np.testing.assert_array_equal(cm.to_dense(), masks) + + +class TestGetItem: + """Tests for CompactMask.__getitem__. + + Covers four indexing modes: + - Integer index → dense (H, W) np.ndarray with correct shape and dtype. + - List of indices → new CompactMask with the selected detections. + - Slice → new CompactMask with the sliced detections. + - Boolean ndarray → new CompactMask filtered by the boolean selector. 
+ """ + + def test_int_returns_2d_dense(self) -> None: + img_h, img_w = 30, 40 + rng = np.random.default_rng(0) + masks = rng.integers(0, 2, size=(3, img_h, img_w)).astype(bool) + cm = _make_cm(masks, (img_h, img_w)) + + result = cm[1] + assert isinstance(result, np.ndarray) + assert result.shape == (img_h, img_w) + assert result.dtype == bool + np.testing.assert_array_equal(result, masks[1]) + + def test_list_returns_compact_mask(self) -> None: + img_h, img_w = 20, 20 + masks = np.zeros((4, img_h, img_w), dtype=bool) + for mask_idx in range(4): + masks[ + mask_idx, + mask_idx * 2 : mask_idx * 2 + 2, + mask_idx * 2 : mask_idx * 2 + 2, + ] = True + cm = _make_cm(masks, (img_h, img_w)) + + subset = cm[[0, 2]] + assert isinstance(subset, CompactMask) + assert len(subset) == 2 + np.testing.assert_array_equal(subset[0], masks[0]) + np.testing.assert_array_equal(subset[1], masks[2]) + + def test_slice_returns_compact_mask(self) -> None: + img_h, img_w = 20, 20 + masks = np.zeros((5, img_h, img_w), dtype=bool) + cm = _make_cm(masks, (img_h, img_w)) + + subset = cm[1:4] + assert isinstance(subset, CompactMask) + assert len(subset) == 3 + + def test_bool_ndarray(self) -> None: + img_h, img_w = 15, 15 + rng = np.random.default_rng(7) + masks = rng.integers(0, 2, size=(4, img_h, img_w)).astype(bool) + cm = _make_cm(masks, (img_h, img_w)) + + selector = np.array([True, False, True, False]) + subset = cm[selector] + assert isinstance(subset, CompactMask) + assert len(subset) == 2 + np.testing.assert_array_equal(subset[0], masks[0]) + np.testing.assert_array_equal(subset[1], masks[2]) + + def test_bool_list(self) -> None: + """Python list[bool] should behave like boolean masking.""" + img_h, img_w = 15, 15 + rng = np.random.default_rng(8) + masks = rng.integers(0, 2, size=(4, img_h, img_w)).astype(bool) + cm = _make_cm(masks, (img_h, img_w)) + + subset = cm[[True, False, True, False]] + assert isinstance(subset, CompactMask) + assert len(subset) == 2 + 
np.testing.assert_array_equal(subset[0], masks[0]) + np.testing.assert_array_equal(subset[1], masks[2]) + + +class TestProperties: + """Tests for len, shape, dtype, and area properties. + + Verifies that the shape tuple follows the (N, H, W) dense convention, + dtype is always bool, and area returns per-mask True-pixel counts that + match np.sum on the corresponding dense masks. + """ + + def test_len(self) -> None: + masks = np.zeros((3, 10, 10), dtype=bool) + cm = _make_cm(masks, (10, 10)) + assert len(cm) == 3 + + def test_shape(self) -> None: + masks = np.zeros((3, 10, 10), dtype=bool) + cm = _make_cm(masks, (10, 10)) + assert cm.shape == (3, 10, 10) + + def test_shape_empty(self) -> None: + cm = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (480, 640), + ) + assert cm.shape == (0, 480, 640) + + def test_dtype(self) -> None: + cm = _make_cm(np.zeros((1, 5, 5), dtype=bool), (5, 5)) + assert cm.dtype == np.dtype(bool) + + def test_area_matches_dense(self) -> None: + img_h, img_w = 20, 20 + rng = np.random.default_rng(3) + masks = rng.integers(0, 2, size=(4, img_h, img_w)).astype(bool) + cm = _make_cm(masks, (img_h, img_w)) + + expected = np.array([mask.sum() for mask in masks]) + np.testing.assert_array_equal(cm.area, expected) + + def test_area_empty(self) -> None: + cm = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (10, 10), + ) + assert cm.area.shape == (0,) + + +class TestCrop: + """Tests for CompactMask.crop. + + Verifies that crop(index) returns an array shaped (crop_h, crop_w) + containing only the pixels within the bounding box, without allocating + the full (H, W) image. 
+ """ + + def test_returns_crop_shape(self) -> None: + img_h, img_w = 50, 60 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 10:30, 5:25] = True # 20 x 20 region + xyxy = np.array([[5, 10, 24, 29]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + crop = cm.crop(0) + assert crop.shape == (20, 20) + assert crop.all() # the entire crop should be True + + +class TestArrayProtocol: + """Tests for the __array__ protocol. + + Verifies that np.asarray(cm) materialises the full (N, H, W) dense array + and that optional dtype casting (e.g. to uint8) is correctly applied. + """ + + def test_array_protocol(self) -> None: + img_h, img_w = 10, 10 + rng = np.random.default_rng(9) + masks = rng.integers(0, 2, size=(2, img_h, img_w)).astype(bool) + cm = _make_cm(masks, (img_h, img_w)) + + arr = np.asarray(cm) + assert arr.shape == (2, img_h, img_w) + np.testing.assert_array_equal(arr, masks) + + def test_dtype_cast(self) -> None: + masks = np.ones((1, 5, 5), dtype=bool) + cm = _make_cm(masks, (5, 5)) + arr = np.asarray(cm, dtype=np.uint8) + assert arr.dtype == np.uint8 + assert arr.sum() == 25 + + +class TestMerge: + """Tests for CompactMask.merge. + + Verifies that multiple CompactMask instances with the same image_shape + can be concatenated into a single CompactMask, that merging with an empty + instance works, that an empty input list raises ValueError, and that + mismatched image shapes raise ValueError. 
+ """ + + def test_merge(self) -> None: + img_h, img_w = 20, 20 + masks1 = np.zeros((2, img_h, img_w), dtype=bool) + masks2 = np.zeros((3, img_h, img_w), dtype=bool) + cm1 = _make_cm(masks1, (img_h, img_w)) + cm2 = _make_cm(masks2, (img_h, img_w)) + + merged = CompactMask.merge([cm1, cm2]) + assert len(merged) == 5 + assert merged.shape == (5, img_h, img_w) + np.testing.assert_array_equal( + merged.to_dense(), np.concatenate([masks1, masks2], axis=0) + ) + + def test_merge_with_empty(self) -> None: + img_h, img_w = 10, 10 + empty_cm = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (img_h, img_w), + ) + masks = np.zeros((2, img_h, img_w), dtype=bool) + cm = _make_cm(masks, (img_h, img_w)) + + merged = CompactMask.merge([empty_cm, cm]) + assert len(merged) == 2 + + def test_merge_empty_list_raises(self) -> None: + with pytest.raises(ValueError, match="empty list"): + CompactMask.merge([]) + + def test_merge_mismatched_image_shape_raises(self) -> None: + cm1 = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (10, 10), + ) + cm2 = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (20, 20), + ) + with pytest.raises(ValueError, match="image shapes"): + CompactMask.merge([cm1, cm2]) + + +class TestEquality: + """Tests for CompactMask.__eq__. + + Verifies element-wise equality between two CompactMask instances and + between a CompactMask and an equivalent dense (N, H, W) boolean array. 
+ """ + + def test_eq_identical(self) -> None: + masks = np.zeros((2, 10, 10), dtype=bool) + masks[0, 2:5, 2:5] = True + cm1 = _make_cm(masks, (10, 10)) + cm2 = _make_cm(masks, (10, 10)) + assert cm1 == cm2 + + def test_eq_different(self) -> None: + masks_a = np.zeros((2, 10, 10), dtype=bool) + masks_a[0, 2:5, 2:5] = True + masks_b = np.zeros((2, 10, 10), dtype=bool) + masks_b[1, 6:9, 6:9] = True + cm1 = _make_cm(masks_a, (10, 10)) + cm2 = _make_cm(masks_b, (10, 10)) + assert not (cm1 == cm2) + + def test_eq_with_dense_array(self) -> None: + masks = np.zeros((1, 8, 8), dtype=bool) + masks[0, 1:4, 1:4] = True + cm = _make_cm(masks, (8, 8)) + assert cm == masks + + +class TestEdgeCases: + """Tests for boundary conditions and unusual inputs. + + Covers: zero-area bounding box (x1 == x2), masks that reach the image + edge, xyxy values beyond image dimensions (clamped silently), empty + CompactMask (N=0), sum axis compatibility with area, and with_offset for + use by InferenceSlicer. + """ + + def test_zero_area_mask_clipped_to_1x1(self) -> None: + """An invalid bounding box should not crash from_dense.""" + masks = np.zeros((1, 10, 10), dtype=bool) + xyxy = np.array([[6, 5, 5, 8]], dtype=np.float32) + with DoesNotRaise(): + cm = CompactMask.from_dense(masks, xyxy, image_shape=(10, 10)) + assert len(cm) == 1 + + def test_mask_at_image_boundary(self) -> None: + img_h, img_w = 20, 20 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 15:20, 15:20] = True + xyxy = np.array([[15, 15, 19, 19]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + np.testing.assert_array_equal(cm.to_dense(), masks) + + def test_xyxy_beyond_image_clipped(self) -> None: + """xyxy values beyond the image boundary should be clipped silently.""" + img_h, img_w = 10, 10 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 5:10, 5:10] = True + xyxy = np.array([[5, 5, 999, 999]], dtype=np.float32) + with DoesNotRaise(): + cm = 
CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + np.testing.assert_array_equal(cm.to_dense(), masks) + + def test_empty_compact_mask_to_dense(self) -> None: + cm = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (50, 60), + ) + dense = cm.to_dense() + assert dense.shape == (0, 50, 60) + assert dense.dtype == bool + + def test_sum_axis_1_2_equals_area(self) -> None: + rng = np.random.default_rng(11) + masks = rng.integers(0, 2, size=(4, 15, 15)).astype(bool) + cm = _make_cm(masks, (15, 15)) + np.testing.assert_array_equal(cm.sum(axis=(1, 2)), cm.area) + + def test_with_offset(self) -> None: + img_h, img_w = 20, 20 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 5:10, 5:10] = True + xyxy = np.array([[5, 5, 9, 9]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + cm2 = cm.with_offset(100, 200, new_image_shape=(400, 400)) + assert cm2.offsets[0].tolist() == [105, 205] + assert cm2._image_shape == (400, 400) + np.testing.assert_array_equal(cm2.crop(0), cm.crop(0)) + + def test_with_offset_clips_partial_overlap_like_move_masks(self) -> None: + """with_offset must clip partial out-of-frame translations like move_masks.""" + img_h, img_w = 10, 10 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 2:6, 3:8] = True + xyxy = np.array([[3, 2, 7, 5]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + dx, dy = -4, 3 + cm_shifted = cm.with_offset(dx=dx, dy=dy, new_image_shape=(img_h, img_w)) + expected = move_masks( + masks=masks, + offset=np.array([dx, dy], dtype=np.int32), + resolution_wh=(img_w, img_h), + ) + + np.testing.assert_array_equal(cm_shifted.to_dense(), expected) + + def test_with_offset_clips_full_outside_like_move_masks(self) -> None: + """Masks shifted fully outside should remain valid and decode to all-False.""" + img_h, img_w = 10, 10 + masks = np.zeros((1, img_h, img_w), dtype=bool) + 
masks[0, 2:6, 2:6] = True + xyxy = np.array([[2, 2, 5, 5]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + dx, dy = 100, 100 + cm_shifted = cm.with_offset(dx=dx, dy=dy, new_image_shape=(img_h, img_w)) + expected = move_masks( + masks=masks, + offset=np.array([dx, dy], dtype=np.int32), + resolution_wh=(img_w, img_h), + ) + + np.testing.assert_array_equal(cm_shifted.to_dense(), expected) + + def test_repack_tightens_loose_bbox(self) -> None: + """repack() shrinks the crop to the minimal True-pixel rectangle.""" + img_h, img_w = 20, 20 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 5:10, 6:12] = True # True block at (5,6)-(9,11) + + # Deliberately loose bbox covers full image. + xyxy = np.array([[0, 0, img_w - 1, img_h - 1]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + # Before repack: crop is the full 20x20 image. + assert cm._crop_shapes[0].tolist() == [20, 20] + + repacked = cm.repack() + + # After repack: crop is exactly the True block. + assert repacked.offsets[0].tolist() == [6, 5] # (x1, y1) + assert repacked._crop_shapes[0].tolist() == [5, 6] # (h, w) + # Pixel content must be identical to the original. 
+ np.testing.assert_array_equal(repacked.to_dense(), masks) + + def test_repack_preserves_all_false_mask(self) -> None: + """repack() normalises an all-False mask to a 1x1 crop.""" + img_h, img_w = 10, 10 + masks = np.zeros((2, img_h, img_w), dtype=bool) + masks[1, 3:6, 3:6] = True # only mask 1 is non-empty + + xyxy = np.array([[0, 0, 9, 9], [0, 0, 9, 9]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + repacked = cm.repack() + + assert repacked._crop_shapes[0].tolist() == [1, 1] # normalised + assert repacked._crop_shapes[1].tolist() == [3, 3] # tight True block + np.testing.assert_array_equal(repacked.to_dense(), masks) + + def test_repack_empty_collection(self) -> None: + """repack() on an empty CompactMask returns another empty CompactMask.""" + cm = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (10, 10), + ) + repacked = cm.repack() + assert len(repacked) == 0 + assert repacked._image_shape == (10, 10) + + def test_repack_already_tight(self) -> None: + """repack() is a no-op when bboxes are already tight.""" + img_h, img_w = 15, 15 + masks = np.zeros((1, img_h, img_w), dtype=bool) + masks[0, 4:9, 3:8] = True + + # Tight bbox. + xyxy = np.array([[3, 4, 7, 8]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + repacked = cm.repack() + + np.testing.assert_array_equal(repacked.offsets, cm.offsets) + np.testing.assert_array_equal(repacked._crop_shapes, cm._crop_shapes) + np.testing.assert_array_equal(repacked.to_dense(), masks) + + +class TestCalculateMasksCentroidsCompact: + """Verify calculate_masks_centroids gives identical results for CompactMask. + + The function has a dedicated CompactMask branch that computes centroids + per-crop. Results must match the dense path to within integer rounding. 
+ """ + + def test_centroids_compact_matches_dense(self) -> None: + """Centroid coordinates must be numerically identical for dense and compact.""" + rng = np.random.default_rng(42) + img_h, img_w = 30, 30 + masks = rng.integers(0, 2, size=(5, img_h, img_w)).astype(bool) + # Ensure each mask has at least one True pixel. + for mask_idx in range(5): + masks[mask_idx, mask_idx * 5, mask_idx * 5] = True + + cm = _make_cm(masks, (img_h, img_w)) + + centroids_dense = calculate_masks_centroids(masks) + centroids_compact = calculate_masks_centroids(cm) + + np.testing.assert_array_equal(centroids_compact, centroids_dense) + + def test_centroids_empty_mask(self) -> None: + """All-zero masks should return centroid (0, 0) — same as dense.""" + img_h, img_w = 10, 10 + masks = np.zeros((3, img_h, img_w), dtype=bool) + cm = _make_cm(masks, (img_h, img_w)) + + centroids_dense = calculate_masks_centroids(masks) + centroids_compact = calculate_masks_centroids(cm) + + np.testing.assert_array_equal(centroids_compact, centroids_dense) + + def test_centroids_empty_mask_with_tight_bbox(self) -> None: + """All-zero tight crops must still return centroid (0, 0).""" + img_h, img_w = 10, 10 + masks = np.zeros((1, img_h, img_w), dtype=bool) + xyxy = np.array([[3, 4, 7, 8]], dtype=np.float32) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + centroids_dense = calculate_masks_centroids(masks) + centroids_compact = calculate_masks_centroids(cm) + + np.testing.assert_array_equal(centroids_compact, centroids_dense) + + def test_centroids_zero_masks_returns_empty(self) -> None: + """Empty CompactMask (0 objects) must return shape (0, 2).""" + empty_cm = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (10, 10), + ) + result = calculate_masks_centroids(empty_cm) + assert result.shape == (0, 2) + + +class TestContainsHolesCompact: + """Verify contains_holes result is unchanged after CompactMask roundtrip. 
+ + contains_holes works on a 2D boolean mask. Encoding then decoding via + CompactMask must preserve pixel topology so that the function returns + the same result as on the original array. + """ + + @pytest.mark.parametrize( + ("mask_2d", "expected"), + [ + # simple foreground blob — no holes + ( + np.array( + [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]], + dtype=bool, + ), + False, + ), + # ring shape — has one hole + ( + np.array( + [[1, 1, 1, 0], [1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 0, 0]], + dtype=bool, + ), + True, + ), + # all-False — no holes + (np.zeros((6, 6), dtype=bool), False), + # all-True — no holes + (np.ones((6, 6), dtype=bool), False), + ], + ) + def test_contains_holes_compact_roundtrip( + self, mask_2d: np.ndarray, expected: bool + ) -> None: + """contains_holes must agree after CompactMask encode→decode.""" + img_h, img_w = mask_2d.shape + masks = mask_2d[np.newaxis] # (1, H, W) + cm = _make_cm(masks, (img_h, img_w)) + + decoded = cm.to_dense()[0] + assert contains_holes(decoded) == expected + assert contains_holes(decoded) == contains_holes(mask_2d) + + +class TestContainsMultipleSegmentsCompact: + """Verify contains_multiple_segments result survives CompactMask roundtrip. + + Encoding and decoding must preserve connected-component topology so + that the multi-segment predicate returns the same value. 
+ """ + + @pytest.mark.parametrize( + ("mask_2d", "connectivity", "expected"), + [ + # single contiguous blob — not multi-segment + ( + np.array( + [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]], + dtype=bool, + ), + 4, + False, + ), + # two separate blobs — multi-segment + ( + np.array( + [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], + dtype=bool, + ), + 4, + True, + ), + # diagonal touch — single segment under 8-connectivity + ( + np.array( + [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 1, 1], [0, 0, 1, 1]], + dtype=bool, + ), + 8, + False, + ), + # all-False — not multi-segment + (np.zeros((6, 6), dtype=bool), 4, False), + ], + ) + def test_contains_multiple_segments_compact_roundtrip( + self, mask_2d: np.ndarray, connectivity: int, expected: bool + ) -> None: + """contains_multiple_segments must agree after CompactMask encode→decode.""" + img_h, img_w = mask_2d.shape + masks = mask_2d[np.newaxis] # (1, H, W) + cm = _make_cm(masks, (img_h, img_w)) + + decoded = cm.to_dense()[0] + result = contains_multiple_segments(decoded, connectivity=connectivity) + assert result == expected + assert result == contains_multiple_segments(mask_2d, connectivity=connectivity) + + +# --------------------------------------------------------------------------- +# Random scenario helpers +# --------------------------------------------------------------------------- + +# Varying (N, image_h, image_w) combinations for random tests. +_RANDOM_CONFIGS = [ + (1, 50, 50), + (5, 50, 50), + (5, 200, 300), + (20, 100, 150), + (20, 200, 300), + (50, 50, 50), + (5, 1080, 1920), + (1, 1080, 1920), + (20, 480, 640), + (50, 100, 100), +] + + +def _random_masks_and_xyxy( + rng: np.random.Generator, + num_masks: int, + img_h: int, + img_w: int, + fill_prob: float = 0.3, +) -> tuple[np.ndarray, np.ndarray]: + """Generate *num_masks* random boolean masks with matching tight xyxy boxes. 
+ + Each mask is built by filling a random sub-rectangle with Bernoulli noise at + ``fill_prob``, then computing tight bounding boxes via ``mask_to_xyxy``. + This guarantees every mask has at least one True pixel (for non-degenerate + bounding boxes). + """ + masks = np.zeros((num_masks, img_h, img_w), dtype=bool) + for mask_idx in range(num_masks): + y1 = rng.integers(0, img_h) + y2 = rng.integers(y1, img_h) + x1 = rng.integers(0, img_w) + x2 = rng.integers(x1, img_w) + region = rng.random((y2 - y1 + 1, x2 - x1 + 1)) < fill_prob + # Ensure at least one True pixel. + if not region.any(): + region[0, 0] = True + masks[mask_idx, y1 : y2 + 1, x1 : x2 + 1] = region + + xyxy = mask_to_xyxy(masks).astype(np.float32) + return masks, xyxy + + +class TestCompactMaskRoundtripRandom: + """from_dense -> to_dense pixel equality across 10 random seeds. + + Uses tight bounding boxes so the round-trip must be lossless (all True + pixels lie strictly within the crop). + """ + + @pytest.mark.parametrize("seed", list(range(10))) + def test_parity_seed(self, seed: int) -> None: + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + np.testing.assert_array_equal( + cm.to_dense(), + masks, + err_msg=( + f"Round-trip failed for seed={seed}, " + f"N={num_masks}, shape=({img_h},{img_w})" + ), + ) + + @pytest.mark.parametrize("seed", list(range(10))) + def test_shape_and_len(self, seed: int) -> None: + """len() and .shape must agree with the dense array.""" + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + assert len(cm) == num_masks + assert cm.shape == (num_masks, img_h, img_w) + + @pytest.mark.parametrize("seed", list(range(10))) + def 
test_individual_mask_access(self, seed: int) -> None: + """cm[i] must equal masks[i] for every index.""" + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + for mask_idx in range(num_masks): + np.testing.assert_array_equal( + cm[mask_idx], + masks[mask_idx], + err_msg=f"cm[{mask_idx}] mismatch for seed={seed}", + ) + + +class TestCompactMaskAreaRandom: + """area from CompactMask equals dense .sum(axis=(1,2)) across 10 seeds.""" + + @pytest.mark.parametrize("seed", list(range(10))) + def test_parity_seed(self, seed: int) -> None: + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + expected_area = masks.sum(axis=(1, 2)) + np.testing.assert_array_equal( + cm.area, + expected_area, + err_msg=( + f"Area mismatch for seed={seed}, N={num_masks}, shape=({img_h},{img_w})" + ), + ) + + @pytest.mark.parametrize("seed", list(range(10))) + def test_sum_axis_matches_area(self, seed: int) -> None: + """cm.sum(axis=(1,2)) must equal cm.area (the fast path).""" + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + np.testing.assert_array_equal(cm.sum(axis=(1, 2)), cm.area) + + +class TestCompactMaskFilterRandom: + """Boolean filter on CompactMask matches dense fancy indexing across 10 seeds.""" + + @pytest.mark.parametrize("seed", list(range(10))) + def test_parity_seed(self, seed: int) -> None: + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = 
CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + selector = rng.random(num_masks) > 0.5 + # Guarantee at least one True in the selector so we test non-empty subsets. + if not selector.any(): + selector[0] = True + + subset_cm = cm[selector] + subset_dense = masks[selector] + + assert isinstance(subset_cm, CompactMask) + assert len(subset_cm) == int(selector.sum()) + np.testing.assert_array_equal( + subset_cm.to_dense(), + subset_dense, + err_msg=f"Boolean filter mismatch for seed={seed}", + ) + + @pytest.mark.parametrize("seed", list(range(10))) + def test_list_index(self, seed: int) -> None: + """Integer list indexing must match dense fancy indexing.""" + rng = np.random.default_rng(seed) + num_masks, img_h, img_w = _RANDOM_CONFIGS[seed] + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + num_selected = min(num_masks, max(1, rng.integers(1, num_masks + 1))) + indices = sorted( + rng.choice(num_masks, size=num_selected, replace=False).tolist() + ) + + subset_cm = cm[indices] + subset_dense = masks[indices] + np.testing.assert_array_equal( + subset_cm.to_dense(), + subset_dense, + err_msg=f"List index mismatch for seed={seed}, indices={indices}", + ) + + +class TestCompactMaskWithOffsetRandom: + """with_offset roundtrip matches move_masks across 10 random seeds.""" + + @pytest.mark.parametrize("seed", list(range(10))) + def test_parity_seed(self, seed: int) -> None: + rng = np.random.default_rng(seed) + # Use smaller images to keep move_masks fast. + num_masks = rng.integers(1, 10) + img_h, img_w = int(rng.integers(30, 80)), int(rng.integers(30, 80)) + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + # Random offset that may push some masks partially or fully off-frame. 
+ dx = int(rng.integers(-img_w, img_w)) + dy = int(rng.integers(-img_h, img_h)) + + cm_shifted = cm.with_offset(dx=dx, dy=dy, new_image_shape=(img_h, img_w)) + expected = move_masks( + masks=masks, + offset=np.array([dx, dy], dtype=np.int32), + resolution_wh=(img_w, img_h), + ) + + np.testing.assert_array_equal( + cm_shifted.to_dense(), + expected, + err_msg=( + f"with_offset mismatch for seed={seed}, " + f"dx={dx}, dy={dy}, shape=({img_h},{img_w})" + ), + ) + + @pytest.mark.parametrize("seed", list(range(10))) + def test_offset_into_larger_canvas(self, seed: int) -> None: + """Offset into a larger destination image must preserve pixels.""" + rng = np.random.default_rng(seed + 100) + num_masks = rng.integers(1, 8) + img_h, img_w = int(rng.integers(20, 50)), int(rng.integers(20, 50)) + masks, xyxy = _random_masks_and_xyxy(rng, num_masks, img_h, img_w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(img_h, img_w)) + + new_h, new_w = img_h * 2, img_w * 2 + dx = int(rng.integers(0, img_w)) + dy = int(rng.integers(0, img_h)) + + cm_shifted = cm.with_offset(dx=dx, dy=dy, new_image_shape=(new_h, new_w)) + dense_shifted = cm_shifted.to_dense() + + assert dense_shifted.shape == (num_masks, new_h, new_w) + # Manually place each original mask into the larger canvas. 
+ expected = np.zeros((num_masks, new_h, new_w), dtype=bool) + for mask_idx in range(num_masks): + expected[mask_idx, dy : dy + img_h, dx : dx + img_w] |= masks[mask_idx] + + np.testing.assert_array_equal( + dense_shifted, + expected, + err_msg=f"Larger canvas offset mismatch for seed={seed}", + ) diff --git a/tests/detection/test_compact_mask_integration.py b/tests/detection/test_compact_mask_integration.py new file mode 100644 index 0000000000..210ec8182c --- /dev/null +++ b/tests/detection/test_compact_mask_integration.py @@ -0,0 +1,274 @@ +"""Integration tests: CompactMask <-> Detections, annotators, merge.""" + +from __future__ import annotations + +from contextlib import ExitStack as DoesNotRaise + +import numpy as np +import pytest + +import supervision as sv +from supervision.detection.compact_mask import CompactMask +from supervision.detection.core import Detections + + +def _full_xyxy(n: int, h: int, w: int) -> np.ndarray: + """N boxes covering the whole image (ensures crop == full mask).""" + return np.tile(np.array([0, 0, w, h], dtype=np.float32), (n, 1)) + + +def _make_compact_detections( + n: int, h: int = 40, w: int = 40 +) -> tuple[Detections, np.ndarray]: + """Detections with a CompactMask backed by full-image bounding boxes. + + Using full-image xyxy means all True pixels are within the crop region, + so from_dense -> to_dense is lossless. + """ + rng = np.random.default_rng(42) + masks = rng.integers(0, 2, size=(n, h, w)).astype(bool) + xyxy = _full_xyxy(n, h, w) + cm = CompactMask.from_dense(masks, xyxy, image_shape=(h, w)) + det = Detections( + xyxy=xyxy, + mask=cm, + confidence=np.ones(n, dtype=np.float32) * 0.9, + class_id=np.arange(n), + ) + return det, masks + + +class TestConstruction: + """Tests for building Detections with a CompactMask. + + Verifies that a CompactMask is accepted as a valid mask argument and that + the validator raises ValueError when the mask length does not match the + number of bounding boxes. 
+ """ + + def test_detections_construction_with_compact_mask(self) -> None: + with DoesNotRaise(): + det, _ = _make_compact_detections(3) + assert isinstance(det.mask, CompactMask) + assert len(det) == 3 + + def test_detections_compact_mask_validation_mismatch(self) -> None: + n, h, w = 3, 20, 20 + xyxy = _full_xyxy(n, h, w) + masks_wrong_n = np.zeros((n + 1, h, w), dtype=bool) + cm = CompactMask.from_dense(masks_wrong_n, _full_xyxy(n + 1, h, w), (h, w)) + with pytest.raises(ValueError, match="mask must contain"): + Detections(xyxy=xyxy, mask=cm) + + +class TestFiltering: + """Tests for Detections.__getitem__ with a CompactMask. + + Verifies that integer, slice, and boolean-array indexing all preserve the + CompactMask type and return the correct subset of masks. + """ + + def test_int_wraps_to_compact_mask(self) -> None: + det, _ = _make_compact_detections(3) + # Detections converts int to [int] internally -> subset has 1 element + subset = det[1] + assert isinstance(subset.mask, CompactMask) + assert len(subset) == 1 + + def test_slice_preserves_compact_mask(self) -> None: + det, masks = _make_compact_detections(4) + subset = det[1:3] + assert isinstance(subset.mask, CompactMask) + assert len(subset) == 2 + np.testing.assert_array_equal(subset.mask.to_dense(), masks[1:3]) + + def test_bool_array_preserves_compact_mask(self) -> None: + det, masks = _make_compact_detections(4) + selector = np.array([True, False, True, False]) + subset = det[selector] + assert isinstance(subset.mask, CompactMask) + assert len(subset) == 2 + np.testing.assert_array_equal(subset.mask.to_dense(), masks[[0, 2]]) + + +class TestIteration: + """Tests for iterating over Detections with a CompactMask. + + Verifies that each iteration step yields a 2-D boolean (H, W) array + identical to the corresponding dense mask, so downstream code that + iterates over detections needs no changes. 
+ """ + + def test_iter_yields_2d_dense(self) -> None: + h, w = 20, 20 + det, masks = _make_compact_detections(3, h, w) + for i, (_, mask_2d, *_) in enumerate(det): + assert mask_2d is not None + assert isinstance(mask_2d, np.ndarray) + assert mask_2d.shape == (h, w) + assert mask_2d.dtype == bool + np.testing.assert_array_equal(mask_2d, masks[i]) + + +class TestEquality: + """Tests for Detections.__eq__ mixing CompactMask and dense arrays. + + Verifies that a Detections object backed by a CompactMask compares equal + to an otherwise identical Detections object backed by a dense ndarray. + """ + + def test_compact_vs_dense(self) -> None: + h, w = 20, 20 + det_compact, masks = _make_compact_detections(2, h, w) + xyxy = det_compact.xyxy.copy() + det_dense = Detections( + xyxy=xyxy, + mask=masks, + confidence=np.ones(2, dtype=np.float32) * 0.9, + class_id=np.arange(2), + ) + assert det_compact == det_dense + + +class TestArea: + """Tests for the Detections.area property with a CompactMask. + + Verifies that the fast CompactMask path in Detections.area returns the + same per-detection pixel counts as summing the equivalent dense array. + """ + + def test_compact_matches_dense(self) -> None: + det_compact, masks = _make_compact_detections(3) + expected_area = np.array([m.sum() for m in masks]) + np.testing.assert_array_equal(det_compact.area, expected_area) + + +class TestMerge: + """Tests for merging Detections objects that contain CompactMask instances. + + Covers three scenarios: + - All-compact merge: result is a CompactMask. + - Mixed compact + dense: result falls back to a dense ndarray. + - Inner pair merge (merge_inner_detection_object_pair): used during NMS-like + operations, each input must contain exactly one detection. 
+ """ + + def test_all_compact(self) -> None: + h, w = 30, 30 + det1, masks1 = _make_compact_detections(2, h, w) + + rng = np.random.default_rng(7) + masks2 = rng.integers(0, 2, size=(3, h, w)).astype(bool) + xyxy2 = _full_xyxy(3, h, w) + cm2 = CompactMask.from_dense(masks2, xyxy2, (h, w)) + det2 = Detections( + xyxy=xyxy2, + mask=cm2, + confidence=np.ones(3, dtype=np.float32) * 0.8, + class_id=np.arange(3), + ) + + merged = Detections.merge([det1, det2]) + assert isinstance(merged.mask, CompactMask) + assert len(merged) == 5 + expected = np.concatenate([masks1, masks2], axis=0) + np.testing.assert_array_equal(merged.mask.to_dense(), expected) + + def test_mixed_compact_and_dense(self) -> None: + """Merging a CompactMask with a dense ndarray falls back to dense.""" + h, w = 20, 20 + det_compact, _ = _make_compact_detections(2, h, w) + masks_dense = np.zeros((1, h, w), dtype=bool) + xyxy_dense = _full_xyxy(1, h, w) + det_dense = Detections( + xyxy=xyxy_dense, + mask=masks_dense, + confidence=np.array([0.5], dtype=np.float32), + class_id=np.array([0]), + ) + + merged = Detections.merge([det_compact, det_dense]) + assert isinstance(merged.mask, np.ndarray) + assert merged.mask.shape == (3, h, w) + + def test_inner_pair_with_compact(self) -> None: + from supervision.detection.core import merge_inner_detection_object_pair + + h, w = 20, 20 + masks_a = np.zeros((1, h, w), dtype=bool) + masks_a[0, 0:5, 0:5] = True + xyxy_a = _full_xyxy(1, h, w) + cm_a = CompactMask.from_dense(masks_a, xyxy_a, (h, w)) + det_a = Detections( + xyxy=xyxy_a, + mask=cm_a, + confidence=np.array([0.9], dtype=np.float32), + class_id=np.array([1]), + ) + + masks_b = np.zeros((1, h, w), dtype=bool) + masks_b[0, 5:10, 5:10] = True + xyxy_b = _full_xyxy(1, h, w) + cm_b = CompactMask.from_dense(masks_b, xyxy_b, (h, w)) + det_b = Detections( + xyxy=xyxy_b, + mask=cm_b, + confidence=np.array([0.7], dtype=np.float32), + class_id=np.array([1]), + ) + + with DoesNotRaise(): + result = 
merge_inner_detection_object_pair(det_a, det_b) + assert len(result) == 1 + + +class TestAnnotators: + """Tests for annotators that consume CompactMask via Detections. + + Verifies that MaskAnnotator and PolygonAnnotator produce pixel-identical + output when given Detections backed by a CompactMask versus the equivalent + dense ndarray, confirming that the annotators are transparent to the mask + representation. + """ + + def test_mask_annotator(self) -> None: + h, w = 40, 40 + det_compact, masks = _make_compact_detections(2, h, w) + det_dense = Detections( + xyxy=det_compact.xyxy.copy(), + mask=masks, + confidence=det_compact.confidence.copy(), + class_id=det_compact.class_id.copy(), + ) + + image = np.zeros((h, w, 3), dtype=np.uint8) + annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX) + + annotated_compact = annotator.annotate(image.copy(), det_compact) + annotated_dense = annotator.annotate(image.copy(), det_dense) + + np.testing.assert_array_equal( + annotated_compact, + annotated_dense, + err_msg="MaskAnnotator output differs between CompactMask and dense mask", + ) + + def test_polygon_annotator(self) -> None: + h, w = 40, 40 + # Use solid rectangular masks for stable polygon results. 
+ masks = np.zeros((2, h, w), dtype=bool) + masks[0, 5:15, 5:15] = True + masks[1, 20:30, 20:30] = True + xyxy = _full_xyxy(2, h, w) + cm = CompactMask.from_dense(masks, xyxy, (h, w)) + + det_compact = Detections(xyxy=xyxy, mask=cm, class_id=np.array([0, 1])) + det_dense = Detections(xyxy=xyxy, mask=masks, class_id=np.array([0, 1])) + + image = np.zeros((h, w, 3), dtype=np.uint8) + annotator = sv.PolygonAnnotator(color_lookup=sv.ColorLookup.INDEX) + + annotated_compact = annotator.annotate(image.copy(), det_compact) + annotated_dense = annotator.annotate(image.copy(), det_dense) + + np.testing.assert_array_equal(annotated_compact, annotated_dense) diff --git a/tests/detection/test_compact_mask_iou.py b/tests/detection/test_compact_mask_iou.py new file mode 100644 index 0000000000..dc4aed7ee9 --- /dev/null +++ b/tests/detection/test_compact_mask_iou.py @@ -0,0 +1,500 @@ +"""Correctness and integration tests for CompactMask IoU and NMS. + +These tests verify that: +- compact_mask_iou_batch gives numerically identical results to the + dense mask_iou_batch (raster IoU) for all overlap patterns. +- mask_iou_batch dispatches correctly when given CompactMask inputs. +- mask_non_max_suppression and mask_non_max_merge work with CompactMask + and produce the same keep-set as when given equivalent dense arrays. 
+""" + +from __future__ import annotations + +import numpy as np +import pytest + +from supervision.detection.compact_mask import CompactMask +from supervision.detection.utils.iou_and_nms import ( + OverlapMetric, + compact_mask_iou_batch, + mask_iou_batch, + mask_non_max_merge, + mask_non_max_suppression, +) + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _cm_from_masks(masks: np.ndarray, image_shape: tuple[int, int]) -> CompactMask: + """Build a CompactMask using full-image bounding boxes (lossless).""" + num_masks = len(masks) + img_h, img_w = image_shape + xyxy = np.tile( + np.array([0, 0, img_w - 1, img_h - 1], dtype=np.float32), (num_masks, 1) + ) + return CompactMask.from_dense(masks, xyxy, image_shape=image_shape) + + +def _cm_tight(masks: np.ndarray, image_shape: tuple[int, int]) -> CompactMask: + """Build a CompactMask using tight per-mask bounding boxes.""" + from supervision.detection.utils.converters import mask_to_xyxy + + xyxy = mask_to_xyxy(masks).astype(np.float32) + return CompactMask.from_dense(masks, xyxy, image_shape=image_shape) + + +def _dense_iou( + masks_a: np.ndarray, + masks_b: np.ndarray, + metric: OverlapMetric = OverlapMetric.IOU, +) -> np.ndarray: + """Reference pairwise IoU using the existing dense implementation.""" + return mask_iou_batch(masks_a, masks_b, overlap_metric=metric) + + +class TestCompactMaskIouBatch: + """Verify that compact_mask_iou_batch matches dense raster IoU exactly. + + Every test builds a pair of CompactMask collections from known boolean + arrays, runs compact_mask_iou_batch, and compares the result to the dense + reference computed by mask_iou_batch on the raw numpy arrays. 
+ """ + + def test_no_overlap_gives_zero(self) -> None: + """Non-overlapping masks should always produce IoU = 0.""" + img_h, img_w = 20, 20 + masks_a = np.zeros((1, img_h, img_w), dtype=bool) + masks_a[0, 0:5, 0:5] = True # top-left + + masks_b = np.zeros((1, img_h, img_w), dtype=bool) + masks_b[0, 10:15, 10:15] = True # bottom-right + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + result = compact_mask_iou_batch(cm_a, cm_b) + assert result.shape == (1, 1) + assert result[0, 0] == pytest.approx(0.0) + + def test_identical_masks_give_one(self) -> None: + """IoU of a mask with itself must be 1.0.""" + img_h, img_w = 20, 20 + masks = np.zeros((2, img_h, img_w), dtype=bool) + masks[0, 2:8, 2:8] = True + masks[1, 10:18, 10:18] = True + + cm = _cm_from_masks(masks, (img_h, img_w)) + result = compact_mask_iou_batch(cm, cm) + + assert result.shape == (2, 2) + np.testing.assert_allclose(np.diag(result), [1.0, 1.0], atol=1e-9) + + def test_matches_dense_random(self) -> None: + """compact_mask_iou_batch must be numerically identical to dense IoU.""" + rng = np.random.default_rng(0) + img_h, img_w = 30, 30 + masks_a = rng.integers(0, 2, size=(5, img_h, img_w)).astype(bool) + masks_b = rng.integers(0, 2, size=(4, img_h, img_w)).astype(bool) + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + compact_result = compact_mask_iou_batch(cm_a, cm_b) + dense_result = _dense_iou(masks_a, masks_b) + + assert compact_result.shape == (5, 4) + np.testing.assert_allclose(compact_result, dense_result, atol=1e-9) + + def test_matches_dense_with_tight_bboxes(self) -> None: + """Using tight bounding boxes (mask_to_xyxy) must still be accurate.""" + rng = np.random.default_rng(1) + img_h, img_w = 40, 40 + masks_a = rng.integers(0, 2, size=(4, img_h, img_w)).astype(bool) + masks_b = rng.integers(0, 2, size=(3, img_h, img_w)).astype(bool) + + cm_a = _cm_tight(masks_a, (img_h, img_w)) + cm_b = 
_cm_tight(masks_b, (img_h, img_w)) + + compact_result = compact_mask_iou_batch(cm_a, cm_b) + dense_result = _dense_iou(masks_a, masks_b) + + np.testing.assert_allclose(compact_result, dense_result, atol=1e-9) + + def test_partial_overlap(self) -> None: + """Partially overlapping masks: IoU should match the analytic value.""" + img_h, img_w = 10, 10 + # Mask A: columns 0-4 (5 wide), Mask B: columns 3-7 (5 wide). + # Overlap: columns 3-4 (2 wide) x full height (10 rows) = 20 px. + masks_a = np.zeros((1, img_h, img_w), dtype=bool) + masks_a[0, :, 0:5] = True # area = 50 + + masks_b = np.zeros((1, img_h, img_w), dtype=bool) + masks_b[0, :, 3:8] = True # area = 50 + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + result = compact_mask_iou_batch(cm_a, cm_b) + # inter=20, union=50+50-20=80 → IoU=0.25 + assert result[0, 0] == pytest.approx(0.25, abs=1e-9) + np.testing.assert_allclose(result, _dense_iou(masks_a, masks_b), atol=1e-9) + + def test_ios_metric(self) -> None: + """IOS = intersection / min(area_a, area_b) must match dense reference.""" + rng = np.random.default_rng(2) + img_h, img_w = 25, 25 + masks_a = rng.integers(0, 2, size=(3, img_h, img_w)).astype(bool) + masks_b = rng.integers(0, 2, size=(3, img_h, img_w)).astype(bool) + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + compact_result = compact_mask_iou_batch(cm_a, cm_b, OverlapMetric.IOS) + dense_result = _dense_iou(masks_a, masks_b, OverlapMetric.IOS) + + np.testing.assert_allclose(compact_result, dense_result, atol=1e-9) + + def test_all_false_masks(self) -> None: + """Zero-area masks should produce IoU = 0, not NaN.""" + img_h, img_w = 10, 10 + masks_a = np.zeros((2, img_h, img_w), dtype=bool) + masks_b = np.zeros((2, img_h, img_w), dtype=bool) + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + result = compact_mask_iou_batch(cm_a, cm_b) + assert 
not np.any(np.isnan(result)) + np.testing.assert_array_equal(result, 0.0) + + def test_empty_inputs(self) -> None: + """Empty CompactMask collections should return a zero-shaped matrix.""" + img_h, img_w = 10, 10 + empty = CompactMask( + [], + np.empty((0, 2), dtype=np.int32), + np.empty((0, 2), dtype=np.int32), + (img_h, img_w), + ) + masks = np.zeros((3, img_h, img_w), dtype=bool) + cm = _cm_from_masks(masks, (img_h, img_w)) + + result_a = compact_mask_iou_batch(empty, cm) + assert result_a.shape == (0, 3) + + result_b = compact_mask_iou_batch(cm, empty) + assert result_b.shape == (3, 0) + + def test_n_by_n_pairwise(self) -> None: + """N x N pairwise IoU: diagonal must be 1.0 for non-zero-area masks.""" + img_h, img_w = 50, 50 + rng = np.random.default_rng(3) + masks = rng.integers(0, 2, size=(8, img_h, img_w)).astype(bool) + # Ensure no all-false mask (diagonal would be undefined). + for mask_idx in range(8): + masks[mask_idx, mask_idx * 5, mask_idx * 5] = True + + cm = _cm_from_masks(masks, (img_h, img_w)) + result = compact_mask_iou_batch(cm, cm) + + assert result.shape == (8, 8) + np.testing.assert_allclose(np.diag(result), 1.0, atol=1e-9) + np.testing.assert_allclose(result, _dense_iou(masks, masks), atol=1e-9) + + +class TestMaskIouBatchDispatch: + """Verify mask_iou_batch dispatches correctly for CompactMask inputs. + + When both arguments are CompactMask, the function must route to the + efficient RLE implementation and produce identical results to the dense + path. When one argument is dense and the other is CompactMask, the + CompactMask must be materialised transparently before computation. 
+ """ + + def test_both_compact_dispatches_to_rle(self) -> None: + img_h, img_w = 20, 20 + rng = np.random.default_rng(10) + masks_a = rng.integers(0, 2, size=(3, img_h, img_w)).astype(bool) + masks_b = rng.integers(0, 2, size=(2, img_h, img_w)).astype(bool) + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + result_compact = mask_iou_batch(cm_a, cm_b) + result_dense = mask_iou_batch(masks_a, masks_b) + + np.testing.assert_allclose(result_compact, result_dense, atol=1e-9) + + def test_mixed_compact_and_dense(self) -> None: + """One CompactMask + one dense array must still work correctly.""" + img_h, img_w = 20, 20 + rng = np.random.default_rng(11) + masks_a = rng.integers(0, 2, size=(3, img_h, img_w)).astype(bool) + masks_b = rng.integers(0, 2, size=(2, img_h, img_w)).astype(bool) + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + + result = mask_iou_batch(cm_a, masks_b) + expected = mask_iou_batch(masks_a, masks_b) + np.testing.assert_allclose(result, expected, atol=1e-9) + + +class TestNmsWithCompactMask: + """Verify mask NMS produces identical keep-sets for CompactMask and dense inputs. + + Both paths now use exact full-resolution IoU — no resize approximation. + Tests use images larger than 640 px to ensure the old resize-to-640 path + would have introduced lossy approximation (catching the regression). + """ + + def test_nms_compact_matches_dense(self) -> None: + """NMS keep-set is identical for CompactMask and the equivalent dense array.""" + # Use > 640 px so the old resize-to-640 path would have been lossy. 
+        img_h, img_w = 720, 720
+        masks = np.zeros((3, img_h, img_w), dtype=bool)
+        masks[0, 0:360, 0:360] = True  # top-left
+        masks[1, 0:324, 0:324] = True  # heavily overlaps mask 0
+        masks[2, 360:720, 360:720] = True  # bottom-right, no overlap
+
+        scores = np.array([0.9, 0.8, 0.7])
+        predictions = np.column_stack(
+            [np.zeros((3, 4)), scores]  # dummy xyxy, real scores
+        )
+
+        cm = _cm_from_masks(masks, (img_h, img_w))
+
+        keep_dense = mask_non_max_suppression(predictions, masks, iou_threshold=0.3)
+        keep_compact = mask_non_max_suppression(predictions, cm, iou_threshold=0.3)
+
+        np.testing.assert_array_equal(keep_compact, keep_dense)
+
+    def test_nms_compact_matches_dense_borderline(self) -> None:
+        """Borderline IoU pair (≈ threshold) must agree — catches the resize bug.
+
+        With resize-to-640, sub-pixel rounding on a pair whose true IoU is very
+        close to the threshold flips the keep/suppress decision. Both paths now
+        compute exact pixel-level IoU so results are identical.
+        """
+        img_h, img_w = 1080, 1920
+        masks = np.zeros((2, img_h, img_w), dtype=bool)
+        # Mask 0: 200x200 square; mask 1: shifted 37 px diagonally.
+        # inter = 163*163 = 26569, union = 80000 - 26569 = 53431 → IoU ≈ 0.497.
+        masks[0, 100:300, 100:300] = True
+        masks[1, 137:337, 137:337] = True
+
+        scores = np.array([0.9, 0.8])
+        predictions = np.column_stack([np.zeros((2, 4)), scores])
+        cm = _cm_from_masks(masks, (img_h, img_w))
+
+        keep_dense = mask_non_max_suppression(predictions, masks, iou_threshold=0.5)
+        keep_compact = mask_non_max_suppression(predictions, cm, iou_threshold=0.5)
+
+        np.testing.assert_array_equal(keep_compact, keep_dense)
+
+    def test_nms_compact_no_suppression(self) -> None:
+        """Non-overlapping masks: all should be kept."""
+        img_h, img_w = 20, 20
+        masks = np.zeros((3, img_h, img_w), dtype=bool)
+        masks[0, 0:5, 0:5] = True
+        masks[1, 7:12, 7:12] = True
+        masks[2, 14:19, 14:19] = True
+
+        scores = np.array([0.9, 0.8, 0.7])
+        predictions = np.column_stack([np.zeros((3, 4)), scores])
+        cm = _cm_from_masks(masks, (img_h, img_w))
+
+        keep = mask_non_max_suppression(predictions, cm, iou_threshold=0.5)
+        assert keep.all(), "All non-overlapping masks should be kept"
+
+    def test_nms_compact_full_suppression(self) -> None:
+        """Identical masks: only the highest-confidence one should survive."""
+        img_h, img_w = 20, 20
+        mask = np.zeros((1, img_h, img_w), dtype=bool)
+        mask[0, 5:15, 5:15] = True
+
+        masks = np.repeat(mask, 3, axis=0)
+        scores = np.array([0.9, 0.8, 0.7])
+        predictions = np.column_stack([np.zeros((3, 4)), scores])
+        cm = _cm_from_masks(masks, (img_h, img_w))
+
+        keep = mask_non_max_suppression(predictions, cm, iou_threshold=0.5)
+        assert keep.sum() == 1
+        assert keep[0], "Highest-confidence mask should survive"
+
+
+class TestNmmWithCompactMask:
+    """Verify mask_non_max_merge produces the same groups for CompactMask and dense.
+
+    NMM materialises CompactMask to a downscaled dense array internally (the
+    same representation the dense path uses), so results must be numerically
+    identical to the dense path.
+ """ + + def test_nmm_compact_matches_dense(self) -> None: + """Merge groups must match between CompactMask and dense inputs.""" + img_h, img_w = 40, 40 + masks = np.zeros((3, img_h, img_w), dtype=bool) + masks[0, 0:20, 0:20] = True # top-left + masks[1, 0:18, 0:18] = True # heavily overlaps mask 0 + masks[2, 20:40, 20:40] = True # bottom-right, no overlap + + scores = np.array([0.9, 0.8, 0.7]) + predictions = np.column_stack([np.zeros((3, 4)), scores]) + cm = _cm_from_masks(masks, (img_h, img_w)) + + groups_dense = mask_non_max_merge(predictions, masks, iou_threshold=0.3) + groups_compact = mask_non_max_merge(predictions, cm, iou_threshold=0.3) + + def normalise(groups: list[list[int]]) -> list[list[int]]: + return sorted(sorted(group) for group in groups) + + assert normalise(groups_compact) == normalise(groups_dense) + + def test_nmm_no_merge(self) -> None: + """Non-overlapping masks: every mask should be its own group.""" + img_h, img_w = 20, 20 + masks = np.zeros((3, img_h, img_w), dtype=bool) + masks[0, 0:5, 0:5] = True + masks[1, 7:12, 7:12] = True + masks[2, 14:19, 14:19] = True + + scores = np.array([0.9, 0.8, 0.7]) + predictions = np.column_stack([np.zeros((3, 4)), scores]) + cm = _cm_from_masks(masks, (img_h, img_w)) + + groups = mask_non_max_merge(predictions, cm, iou_threshold=0.5) + assert len(groups) == 3, "Each non-overlapping mask gets its own group" + assert all(len(group) == 1 for group in groups) + + def test_nmm_full_merge(self) -> None: + """Identical masks: all predictions should merge into one group.""" + img_h, img_w = 20, 20 + single = np.zeros((1, img_h, img_w), dtype=bool) + single[0, 5:15, 5:15] = True + masks = np.repeat(single, 3, axis=0) + + scores = np.array([0.9, 0.8, 0.7]) + predictions = np.column_stack([np.zeros((3, 4)), scores]) + cm = _cm_from_masks(masks, (img_h, img_w)) + + groups = mask_non_max_merge(predictions, cm, iou_threshold=0.5) + assert len(groups) == 1, "Identical masks must collapse to one group" + assert 
len(groups[0]) == 3 + + +# --------------------------------------------------------------------------- +# Random scenario helpers +# --------------------------------------------------------------------------- + +# Small (N, h, w) configs to keep IoU tests fast. +_IOU_RANDOM_CONFIGS = [ + (5, 30, 30), + (8, 40, 40), + (10, 25, 25), + (6, 50, 50), + (12, 30, 40), + (5, 60, 60), + (15, 20, 20), + (7, 35, 35), + (10, 40, 50), + (8, 45, 45), +] + + +def _random_masks( + rng: np.random.Generator, + num_masks: int, + img_h: int, + img_w: int, + fill_prob: float = 0.25, +) -> np.ndarray: + """Generate *num_masks* random boolean masks with at least one True pixel each.""" + masks = np.zeros((num_masks, img_h, img_w), dtype=bool) + for mask_idx in range(num_masks): + y1 = rng.integers(0, img_h) + y2 = rng.integers(y1, img_h) + x1 = rng.integers(0, img_w) + x2 = rng.integers(x1, img_w) + region = rng.random((y2 - y1 + 1, x2 - x1 + 1)) < fill_prob + if not region.any(): + region[0, 0] = True + masks[mask_idx, y1 : y2 + 1, x1 : x2 + 1] = region + return masks + + +class TestCompactMaskIouRandom: + """compact_mask_iou_batch matches dense mask_iou_batch across 10 random seeds. + + Uses small mask counts (5-15) and image sizes (20x20 to 60x60) to keep + individual test runs under 1 second. 
+ """ + + @pytest.mark.parametrize("seed", list(range(10))) + def test_parity_seed(self, seed: int) -> None: + rng = np.random.default_rng(seed) + num_masks_a, img_h, img_w = _IOU_RANDOM_CONFIGS[seed] + num_masks_b = max(3, num_masks_a - 2) + + masks_a = _random_masks(rng, num_masks_a, img_h, img_w) + masks_b = _random_masks(rng, num_masks_b, img_h, img_w) + + cm_a = _cm_from_masks(masks_a, (img_h, img_w)) + cm_b = _cm_from_masks(masks_b, (img_h, img_w)) + + compact_result = compact_mask_iou_batch(cm_a, cm_b) + dense_result = _dense_iou(masks_a, masks_b) + + assert compact_result.shape == (num_masks_a, num_masks_b), ( + f"Shape mismatch: {compact_result.shape} vs ({num_masks_a}, {num_masks_b})" + ) + np.testing.assert_allclose( + compact_result, + dense_result, + atol=1e-9, + err_msg=f"IoU mismatch: seed={seed}, N_a={num_masks_a}, N_b={num_masks_b}", + ) + + @pytest.mark.parametrize("seed", list(range(10))) + def test_self_iou_diagonal(self, seed: int) -> None: + """Self-IoU diagonal must be 1.0 for masks with at least one True pixel.""" + rng = np.random.default_rng(seed + 50) + num_masks, img_h, img_w = _IOU_RANDOM_CONFIGS[seed] + masks = _random_masks(rng, num_masks, img_h, img_w) + + cm = _cm_from_masks(masks, (img_h, img_w)) + result = compact_mask_iou_batch(cm, cm) + + np.testing.assert_allclose( + np.diag(result), + 1.0, + atol=1e-9, + err_msg=f"Diagonal not 1.0 for seed={seed}", + ) + + @pytest.mark.parametrize("seed", list(range(10))) + def test_tight_bbox_parity(self, seed: int) -> None: + """Tight bounding boxes (mask_to_xyxy) must still produce identical IoU.""" + from supervision.detection.utils.converters import mask_to_xyxy + + rng = np.random.default_rng(seed + 200) + num_masks, img_h, img_w = _IOU_RANDOM_CONFIGS[seed] + num_masks_b = max(3, num_masks - 2) + + masks_a = _random_masks(rng, num_masks, img_h, img_w) + masks_b = _random_masks(rng, num_masks_b, img_h, img_w) + + xyxy_a = mask_to_xyxy(masks_a).astype(np.float32) + xyxy_b = 
mask_to_xyxy(masks_b).astype(np.float32) + + cm_a = CompactMask.from_dense(masks_a, xyxy_a, image_shape=(img_h, img_w)) + cm_b = CompactMask.from_dense(masks_b, xyxy_b, image_shape=(img_h, img_w)) + + compact_result = compact_mask_iou_batch(cm_a, cm_b) + dense_result = _dense_iou(masks_a, masks_b) + + np.testing.assert_allclose( + compact_result, + dense_result, + atol=1e-9, + err_msg=f"Tight bbox IoU mismatch for seed={seed}", + ) diff --git a/tests/detection/test_inference_slicer_compact.py b/tests/detection/test_inference_slicer_compact.py new file mode 100644 index 0000000000..4a4a3e5f3a --- /dev/null +++ b/tests/detection/test_inference_slicer_compact.py @@ -0,0 +1,162 @@ +"""Integration tests for InferenceSlicer with compact_masks=True. + +Verifies that with compact_masks=True: +- Masks stay as CompactMask throughout the pipeline (no dense materialisation). +- NMS is computed via RLE IoU (no resize, no dense (N,H,W) alloc). +- Final detections are pixel-identical to the compact_masks=False path. +""" + +from __future__ import annotations + +import numpy as np + +import supervision as sv +from supervision.detection.compact_mask import CompactMask +from supervision.detection.core import Detections + + +def _fake_seg_callback(tile: np.ndarray) -> Detections: + """Return two non-overlapping segmentation detections for any tile.""" + h, w = tile.shape[:2] + masks = np.zeros((2, h, w), dtype=bool) + masks[0, : h // 3, : w // 3] = True + masks[1, h // 2 :, w // 2 :] = True + xyxy = np.array([[0, 0, w // 3, h // 3], [w // 2, h // 2, w, h]], dtype=np.float32) + return Detections( + xyxy=xyxy, + mask=masks, + confidence=np.array([0.9, 0.8], dtype=np.float32), + class_id=np.array([0, 1]), + ) + + +class TestInferenceSlicerCompactMasks: + """Tests that compact_masks=True keeps masks in RLE form end-to-end. 
+
+    The pipeline inside InferenceSlicer goes:
+        callback → CompactMask.from_dense (tile coords)
+        → with_offset (full-image coords)
+        → CompactMask.merge (all tiles)
+        → mask_non_max_suppression → compact_mask_iou_batch (RLE IoU)
+
+    None of those steps materialise a full (N, H, W) dense array.
+    """
+
+    def test_compact_masks_flag_converts_dense_to_compact(self) -> None:
+        """Masks returned from callback are CompactMask after _run_callback."""
+        image = np.zeros((200, 200, 3), dtype=np.uint8)
+        slicer = sv.InferenceSlicer(
+            callback=_fake_seg_callback,
+            slice_wh=200,
+            overlap_wh=0,
+            overlap_filter=sv.OverlapFilter.NONE,
+            compact_masks=True,
+        )
+        result = slicer(image)
+        assert isinstance(result.mask, CompactMask), (
+            f"compact_masks=True must produce a CompactMask, got {type(result.mask)}"
+        )
+
+    def test_compact_masks_false_keeps_dense(self) -> None:
+        """Default (compact_masks=False) keeps dense ndarray masks."""
+        image = np.zeros((200, 200, 3), dtype=np.uint8)
+        slicer = sv.InferenceSlicer(
+            callback=_fake_seg_callback,
+            slice_wh=200,
+            overlap_wh=0,
+            overlap_filter=sv.OverlapFilter.NONE,
+            compact_masks=False,
+        )
+        result = slicer(image)
+        assert isinstance(result.mask, np.ndarray)
+        assert not isinstance(result.mask, CompactMask)
+
+    def test_compact_and_dense_pipelines_give_same_masks(self) -> None:
+        """compact_masks=True and False must produce pixel-identical final masks."""
+        image = np.zeros((300, 300, 3), dtype=np.uint8)
+
+        slicer_dense = sv.InferenceSlicer(
+            callback=_fake_seg_callback,
+            slice_wh=150,
+            overlap_wh=0,
+            overlap_filter=sv.OverlapFilter.NON_MAX_SUPPRESSION,
+            iou_threshold=0.3,
+            compact_masks=False,
+        )
+        slicer_compact = sv.InferenceSlicer(
+            callback=_fake_seg_callback,
+            slice_wh=150,
+            overlap_wh=0,
+            overlap_filter=sv.OverlapFilter.NON_MAX_SUPPRESSION,
+            iou_threshold=0.3,
+            compact_masks=True,
+        )
+
+        det_dense = slicer_dense(image)
+        det_compact = slicer_compact(image)
+
+        assert len(det_dense) == len(det_compact)
+
+        dense_masks = det_dense.mask
+        compact_masks_arr = np.asarray(det_compact.mask)
+
+        # Sort both by xyxy to align order (NMS order may differ).
+        def _sort_key(d: Detections) -> np.ndarray:
+            return d.xyxy[:, 0] * 10000 + d.xyxy[:, 1]
+
+        order_d = np.argsort(_sort_key(det_dense))
+        order_c = np.argsort(_sort_key(det_compact))
+
+        np.testing.assert_array_equal(
+            dense_masks[order_d],
+            compact_masks_arr[order_c],
+            err_msg="compact_masks pipeline produced different mask pixels than dense",
+        )
+
+    def test_nms_with_overlapping_tiles_uses_rle_iou(self) -> None:
+        """With overlapping tiles, NMS must suppress duplicates using RLE IoU."""
+        image = np.zeros((300, 300, 3), dtype=np.uint8)
+
+        call_count = 0
+
+        def counting_callback(tile: np.ndarray) -> Detections:
+            nonlocal call_count
+            call_count += 1
+            return _fake_seg_callback(tile)
+
+        slicer = sv.InferenceSlicer(
+            callback=counting_callback,
+            slice_wh=200,
+            overlap_wh=100,  # heavy overlap → many duplicate detections
+            overlap_filter=sv.OverlapFilter.NON_MAX_SUPPRESSION,
+            iou_threshold=0.3,
+            compact_masks=True,
+        )
+        result = slicer(image)
+
+        assert call_count > 1, "Should have run on multiple tiles"
+        assert isinstance(result.mask, CompactMask), (
+            "Result mask must remain CompactMask after cross-tile NMS"
+        )
+
+    def test_no_mask_callback_unaffected(self) -> None:
+        """compact_masks=True must not crash when callback returns no masks."""
+
+        def box_only_callback(tile: np.ndarray) -> Detections:
+            h, w = tile.shape[:2]
+            return Detections(
+                xyxy=np.array([[0, 0, w // 2, h // 2]], dtype=np.float32),
+                confidence=np.array([0.9]),
+                class_id=np.array([0]),
+            )
+
+        image = np.zeros((200, 200, 3), dtype=np.uint8)
+        slicer = sv.InferenceSlicer(
+            callback=box_only_callback,
+            slice_wh=200,
+            overlap_wh=0,
+            overlap_filter=sv.OverlapFilter.NONE,
+            compact_masks=True,
+        )
+        result = slicer(image)
+        assert result.mask is None
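A note on the parity invariant the IoU tests above pin down: IoU computed from run-length encodings must equal IoU computed from dense arrays exactly, because both count the same integer set of pixels; only the final float division is shared. A standalone sketch of the idea (illustrative helpers only; `rle_encode_rows` and `rle_iou` are not supervision APIs):

```python
import numpy as np


def rle_encode_rows(mask: np.ndarray) -> dict[int, list[tuple[int, int]]]:
    """Encode a 2-D boolean mask as {row: [(start_col, run_length), ...]}."""
    runs: dict[int, list[tuple[int, int]]] = {}
    for r, row in enumerate(mask.astype(np.int8)):
        # Pad with zeros so diff marks every run boundary, including edges.
        edges = np.diff(np.concatenate(([0], row, [0])))
        starts = np.flatnonzero(edges == 1)
        ends = np.flatnonzero(edges == -1)
        if starts.size:
            runs[r] = [(int(s), int(e - s)) for s, e in zip(starts, ends)]
    return runs


def rle_iou(runs_a: dict, runs_b: dict) -> float:
    """IoU of two run encodings; never materialises a dense array.

    Zero-area inputs return 0.0 (matching the all-false-mask convention
    in the tests above) rather than NaN.
    """
    def area(runs: dict) -> int:
        return sum(length for row in runs.values() for _, length in row)

    inter = 0
    for r in runs_a.keys() & runs_b.keys():
        for sa, la in runs_a[r]:
            for sb, lb in runs_b[r]:
                # 1-D interval overlap of the two runs in this row.
                inter += max(0, min(sa + la, sb + lb) - max(sa, sb))
    union = area(runs_a) + area(runs_b) - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    a = np.zeros((40, 40), dtype=bool)
    a[0:20, 0:20] = True
    b = np.zeros((40, 40), dtype=bool)
    b[10:30, 10:30] = True
    # inter = 10x10 = 100, union = 400 + 400 - 100 = 700
    print(rle_iou(rle_encode_rows(a), rle_encode_rows(b)))
```

Because intersection and areas are integer pixel counts in both representations, the two paths can agree to within `atol=1e-9`, which is what the parity tests assert.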
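The NMS tests compare keep-sets rather than raw IoU values; a keep-set comes from greedy suppression over the pairwise IoU matrix, so any IoU perturbation near the threshold can flip it. A minimal sketch of that greedy rule (hypothetical helper, not supervision's `mask_non_max_suppression`):

```python
import numpy as np


def greedy_nms_keep(
    ious: np.ndarray, scores: np.ndarray, iou_threshold: float
) -> np.ndarray:
    """Greedy NMS over a precomputed pairwise IoU matrix.

    Returns a boolean keep mask aligned with the input order.
    Illustrative only: real implementations fuse IoU computation
    and suppression instead of taking a full matrix.
    """
    order = np.argsort(-scores)  # visit detections best-first
    keep = np.ones(len(scores), dtype=bool)
    for pos, i in enumerate(order):
        if not keep[i]:
            continue  # already suppressed by a better detection
        for j in order[pos + 1 :]:
            if keep[j] and ious[i, j] > iou_threshold:
                keep[j] = False
    return keep
```

The borderline scenario the tests guard is visible here: a pair whose IoU sits just below the threshold survives intact, while a lossy resize that nudges that IoU above the threshold suppresses one of the two, changing the keep-set.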