Conversation
dattri/algorithm/tracin.py (outdated)

```python
class DataloaderGroup:
```

`DataloaderGroup` should inherit from `torch.utils.data.DataLoader`.
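A minimal sketch of what that inheritance could look like. The constructor signature here is an assumption based on the snippet above, not dattri's actual implementation:

```python
# Hypothetical sketch: DataloaderGroup as a torch DataLoader subclass, so
# batch_size and sampler come from the parent instead of ad-hoc members.
import torch
from torch.utils.data import DataLoader, TensorDataset


class DataloaderGroup(DataLoader):
    """Wraps a test dataloader so the whole loader is scored as one group."""

    def __init__(self, original_test_dataloader: DataLoader) -> None:
        # Reuse the wrapped loader's dataset; batch_size=1 mirrors the
        # original snippet's hard-coded value.
        super().__init__(original_test_dataloader.dataset, batch_size=1)
        self.original_test_dataloader = original_test_dataloader


loader = DataLoader(TensorDataset(torch.randn(6, 2)), batch_size=3)
group = DataloaderGroup(loader)
print(isinstance(group, DataLoader))  # True
print(len(group))                     # 6 (one sample per batch)
```

With this change the `batch_size` and `sampler` members flagged below come from the parent class for free.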
dattri/algorithm/tracin.py (outdated)

```python
    """Initialize the DataloaderGroup.

    Args:
        original_test_dataloader (DataLoader): The underlying PyTorch dataloader.
```

Clarify the docstring: the underlying PyTorch dataloader is for individual test data samples.
dattri/algorithm/tracin.py (outdated)

```python
        original_test_dataloader (DataLoader): The underlying PyTorch dataloader.
        """
        self.original_test_dataloader = original_test_dataloader
        self.batch_size = 1
```

If `batch_size` is not used, then we can delete this member.
dattri/algorithm/tracin.py (outdated)

```python
        """
        self.original_test_dataloader = original_test_dataloader
        self.batch_size = 1
        self.sampler = [0]
```

After inheriting from the DataLoader class, we don't need this member.
dattri/algorithm/tracin.py (outdated)

```python
            else:
                temp = sub_batch.to(self.device)

            sub_grad = torch.nan_to_num(self.grad_target_func(parameters, temp))
```

We may want a slightly different user-defined loss/target function (shown only in the example) and avoid the changes in `attributor.attribute()`.
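One way to realize that suggestion, sketched under the assumption that the user-defined target function consumes the whole test dataloader and returns one scalar. The name `group_target_func` and the model/loss used here are illustrative, not dattri's actual code:

```python
# Hypothetical sketch: push the "group" semantics into a user-defined target
# function that consumes the whole test dataloader and returns one scalar,
# so attributor.attribute() itself needs no changes.
import torch
from torch.func import functional_call, grad
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Linear(2, 1, bias=False)


def group_target_func(params, dataloader):
    # Sum per-sample squared errors over the full loader -> one scalar,
    # whose gradient is the test-side gradient for the whole group.
    total = torch.zeros(())
    for x, y in dataloader:
        pred = functional_call(model, params, (x,))
        total = total + ((pred - y) ** 2).sum()
    return total


params = {k: v.detach() for k, v in model.named_parameters()}
ds = TensorDataset(torch.randn(8, 2), torch.randn(8, 1))
loader = DataLoader(ds, batch_size=4)
g = grad(group_target_func)(params, loader)
print(g["weight"].shape)  # torch.Size([1, 2])
```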
```python
    def get_param(self, *args, **kwargs): return dict(model.named_parameters()), None
    def get_grad_loss_func(self, *args, **kwargs): return func
    def get_grad_target_func(self, *args, **kwargs):
        return func_group
```

For the examples, we need to create a task using the AttributionTask API, but with a different target function (compared to the other examples).
TheaperDeng left a comment:

Please change accordingly.
dattri/algorithm/tracin.py (outdated)

```python
        if hasattr(test_dataloader, "original_test_dataloader"):
            _check_shuffle(test_dataloader.original_test_dataloader)
        else:
            _check_shuffle(test_dataloader)
```
```python
    input_dim, n_train, n_test = 2, 10, 5

    model = nn.Linear(input_dim, 1, bias=False)
    model.weight.data.fill_(1.0)
```
dattri/task.py (outdated)

```python
        group_target_func (Callable): Optional. When attributing to a group (e.g. a
            DataLoader passed via DataloaderGroup), this scalar function is used
            instead of the per-sample target. Signature (params_dict, loader) -> scalar.
            The gradient of this w.r.t. params is the test-side gradient for the group.
```

Let's make `group_target_func` a bool option, defaulting to False. And please change the docstring of `target_func` accordingly, i.e., "when group_target_func=True, it should take the parameters ...".
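The requested change could look roughly like this. This is an illustrative stub, not the real dattri `AttributionTask` class:

```python
# Illustrative stub of the requested signature change: group_target_func
# becomes a bool defaulting to False, and target_func's docstring covers
# the group case. Not the real dattri AttributionTask.
class AttributionTask:
    def __init__(self, model, checkpoints, target_func, group_target_func=False):
        """Create an attribution task.

        Args:
            target_func (Callable): The target function. When
                group_target_func=True, it should take the parameters and a
                dataloader and return a single scalar for the whole group.
            group_target_func (bool): Optional, default False. If True,
                target_func is treated as a group-level target.
        """
        self.model = model
        self.checkpoints = checkpoints
        self.target_func = target_func
        self.group_target_func = group_target_func


task = AttributionTask(model=None, checkpoints=None, target_func=lambda p, d: 0.0)
print(task.group_target_func)  # False
```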
dattri/task.py (outdated)

```python
        g = grad(flat_group_target)(parameters)
        return g.unsqueeze(0)

    return base_grad_target(parameters, data)
```

Please still return `self.grad_target_func` when `group_target_func=False`, and only return a separately wrapped function when `group_target_func=True`.
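A sketch of that dispatch, using a minimal stand-in class (`TaskStub` and its attribute names are assumptions, not dattri's real implementation):

```python
# Hypothetical sketch of the requested dispatch: the per-sample path returns
# the existing self.grad_target_func unchanged; only group_target_func=True
# gets a separately wrapped function. TaskStub stands in for AttributionTask.
import torch
from torch.func import grad


class TaskStub:
    def __init__(self, target_func, group_target_func=False):
        self.target_func = target_func
        self.group_target_func = group_target_func
        self.grad_target_func = grad(target_func)  # existing per-sample path

    def get_grad_target_func(self):
        if not self.group_target_func:
            return self.grad_target_func  # unchanged when the flag is False

        def grouped(parameters, data):
            # Gradient of the scalar group target, unsqueezed so downstream
            # code sees a single (1, num_params) test-side gradient row.
            return grad(self.target_func)(parameters, data).unsqueeze(0)

        return grouped


f = lambda params, data: (params * data).sum()
task = TaskStub(f, group_target_func=True)
g = task.get_grad_target_func()(torch.ones(3), torch.arange(3.0))
print(g.shape)  # torch.Size([1, 3])

task2 = TaskStub(f)  # flag defaults to False
print(task2.get_grad_target_func() is task2.grad_target_func)  # True
```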
```python
        normalized_grad=False,
        device=args.device,
    )
    attributor.projector_kwargs = None
```

Why do we need this line?
```python
        model=model,
        checkpoints=model.state_dict(),
        target_func=f,
        group_target_func=group_target_func,
```

This should instead be:

```python
        ...
        target_func=group_target_func,
        group_target_func=True,
        ...
```
```python
    print(f"Calculated Scores (first 10):\n{scores.flatten()[:10]}")
    print(f"Calculated Scores Temp sum over test (first 10):\n{scores_temp.sum(dim=1)[:10]}")
    diff = (scores.flatten() - scores_temp.sum(dim=1)).abs()
    print(f"Max |group - sum(per-test)|: {diff.max().item():.6f}")
```
What's the output of this script? Could you paste it here?

Output:

```
Test Dataloader Group (AttributionTask + group_target_func=True) — MNIST + MLP.
Score Shape: torch.Size([10000, 1])
Calculated Scores (first 10):
tensor([-2.2991e+00,  1.0665e-04, -1.4294e-01,  9.3012e-05,  1.6025e-01,
        -2.3018e-02,  8.6976e-06,  1.5331e-07, -1.1255e-02,  8.6521e-07])
Calculated Scores Temp sum over test (first 10):
tensor([-2.2992e+00,  1.0665e-04, -1.4294e-01,  9.3012e-05,  1.6025e-01,
        -2.3017e-02,  8.6975e-06,  1.5331e-07, -1.1255e-02,  8.6522e-07])
Max |group - sum(per-test)|: 0.005127
```
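The near-agreement in this paste is expected: with a sum-of-losses group target, the group gradient is the sum of the per-sample test gradients, so the group score column should equal `scores_temp.sum(dim=1)` up to floating-point error. A small self-contained check of that identity, using synthetic stand-in gradients rather than real TracIn outputs:

```python
# Synthetic check: dotting train gradients against the summed test gradient
# equals summing the per-test dot products (linearity of the inner product).
import torch

torch.manual_seed(0)
test_grads = torch.randn(4, 3)    # stand-in per-test-sample gradients
train_grads = torch.randn(10, 3)  # stand-in per-train-sample gradients

scores_group = train_grads @ test_grads.sum(dim=0)  # (10,) group scores
scores_temp = train_grads @ test_grads.T            # (10, 4) per-test scores
print(torch.allclose(scores_group, scores_temp.sum(dim=1), atol=1e-5))  # True
```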
Please also fix the lint error.
Description

Add TracIn test_dataloader_group to support memory-efficient computation.
Add example/test of test_dataloader_group.

1. Motivation and Context

Add an alternative way to get the score matrix with less memory used.

2. Summary of the change

- Add TestDataloaderGroup class to tracin.py
- Add test to test_tracin.py
- Add usage example
- Update GitHub example workflow

3. What tests have been added/updated for the change?