Conversation
dattri/algorithm/tracin.py (outdated)

```python
class DataloaderGroup:
```

`DataloaderGroup` should inherit from `torch.utils.data.DataLoader`.
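A minimal sketch of what that inheritance could look like. The constructor signature here is an assumption based on the snippet above, not dattri's actual implementation:

```python
# Hypothetical sketch: DataloaderGroup as a torch DataLoader subclass, so
# batch_size and sampler come from the parent instead of ad-hoc members.
import torch
from torch.utils.data import DataLoader, TensorDataset


class DataloaderGroup(DataLoader):
    """Wraps a test dataloader so the whole loader is scored as one group."""

    def __init__(self, original_test_dataloader: DataLoader) -> None:
        # Reuse the wrapped loader's dataset; batch_size=1 mirrors the
        # original snippet's hard-coded value.
        super().__init__(original_test_dataloader.dataset, batch_size=1)
        self.original_test_dataloader = original_test_dataloader


loader = DataLoader(TensorDataset(torch.randn(6, 2)), batch_size=3)
group = DataloaderGroup(loader)
print(isinstance(group, DataLoader))  # True
print(len(group))                     # 6 (one sample per batch)
```

With this change the `batch_size` and `sampler` members flagged below come from the parent class for free.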
dattri/algorithm/tracin.py (outdated)

```python
    """Initialize the DataloaderGroup.

    Args:
        original_test_dataloader (DataLoader): The underlying PyTorch dataloader.
```

Clarify the docstring: the underlying PyTorch dataloader is for individual test data samples.
dattri/algorithm/tracin.py (outdated)

```python
        original_test_dataloader (DataLoader): The underlying PyTorch dataloader.
        """
        self.original_test_dataloader = original_test_dataloader
        self.batch_size = 1
```

If `batch_size` is not used, then we can delete this member.
dattri/algorithm/tracin.py (outdated)

```python
        """
        self.original_test_dataloader = original_test_dataloader
        self.batch_size = 1
        self.sampler = [0]
```

After inheriting from the DataLoader class, we don't need this member.
dattri/algorithm/tracin.py (outdated)

```python
            else:
                temp = sub_batch.to(self.device)

            sub_grad = torch.nan_to_num(self.grad_target_func(parameters, temp))
```

We may want a slightly different user-defined loss/target function (shown only in the example) and avoid the changes in `attributor.attribute()`.
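One way to realize that suggestion, sketched under the assumption that the user-defined target function consumes the whole test dataloader and returns one scalar. The name `group_target_func` and the model/loss used here are illustrative, not dattri's actual code:

```python
# Hypothetical sketch: push the "group" semantics into a user-defined target
# function that consumes the whole test dataloader and returns one scalar,
# so attributor.attribute() itself needs no changes.
import torch
from torch.func import functional_call, grad
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Linear(2, 1, bias=False)


def group_target_func(params, dataloader):
    # Sum per-sample squared errors over the full loader -> one scalar,
    # whose gradient is the test-side gradient for the whole group.
    total = torch.zeros(())
    for x, y in dataloader:
        pred = functional_call(model, params, (x,))
        total = total + ((pred - y) ** 2).sum()
    return total


params = {k: v.detach() for k, v in model.named_parameters()}
ds = TensorDataset(torch.randn(8, 2), torch.randn(8, 1))
loader = DataLoader(ds, batch_size=4)
g = grad(group_target_func)(params, loader)
print(g["weight"].shape)  # torch.Size([1, 2])
```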
```python
    def get_param(self, *args, **kwargs): return dict(model.named_parameters()), None
    def get_grad_loss_func(self, *args, **kwargs): return func
    def get_grad_target_func(self, *args, **kwargs):
        return func_group
```

For the examples, we need to create a task using the AttributionTask API, but with a different target function (compared to the other examples).
TheaperDeng left a comment:

Please change accordingly.
dattri/algorithm/tracin.py (outdated)

```python
        if hasattr(test_dataloader, "original_test_dataloader"):
            _check_shuffle(test_dataloader.original_test_dataloader)
        else:
            _check_shuffle(test_dataloader)
```
```python
    input_dim, n_train, n_test = 2, 10, 5

    model = nn.Linear(input_dim, 1, bias=False)
    model.weight.data.fill_(1.0)
```
dattri/task.py (outdated)

```python
        group_target_func (Callable): Optional. When attributing to a group (e.g. a
            DataLoader passed via DataloaderGroup), this scalar function is used
            instead of the per-sample target. Signature (params_dict, loader) -> scalar.
            The gradient of this w.r.t. params is the test-side gradient for the group.
```

Let's make `group_target_func` a bool option, defaulting to False. And please change the docstring of `target_func` accordingly, i.e., "when group_target_func=True, it should take the parameters ...".
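The requested change could look roughly like this. This is an illustrative stub, not the real dattri `AttributionTask` class:

```python
# Illustrative stub of the requested signature change: group_target_func
# becomes a bool defaulting to False, and target_func's docstring covers
# the group case. Not the real dattri AttributionTask.
class AttributionTask:
    def __init__(self, model, checkpoints, target_func, group_target_func=False):
        """Create an attribution task.

        Args:
            target_func (Callable): The target function. When
                group_target_func=True, it should take the parameters and a
                dataloader and return a single scalar for the whole group.
            group_target_func (bool): Optional, default False. If True,
                target_func is treated as a group-level target.
        """
        self.model = model
        self.checkpoints = checkpoints
        self.target_func = target_func
        self.group_target_func = group_target_func


task = AttributionTask(model=None, checkpoints=None, target_func=lambda p, d: 0.0)
print(task.group_target_func)  # False
```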
dattri/task.py (outdated)

```python
        g = grad(flat_group_target)(parameters)
        return g.unsqueeze(0)

    return base_grad_target(parameters, data)
```

Please still return `self.grad_target_func` when `group_target_func=False`, and only return a separately wrapped function when `group_target_func=True`.
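A sketch of that dispatch, using a minimal stand-in class (`TaskStub` and its attribute names are assumptions, not dattri's real implementation):

```python
# Hypothetical sketch of the requested dispatch: the per-sample path returns
# the existing self.grad_target_func unchanged; only group_target_func=True
# gets a separately wrapped function. TaskStub stands in for AttributionTask.
import torch
from torch.func import grad


class TaskStub:
    def __init__(self, target_func, group_target_func=False):
        self.target_func = target_func
        self.group_target_func = group_target_func
        self.grad_target_func = grad(target_func)  # existing per-sample path

    def get_grad_target_func(self):
        if not self.group_target_func:
            return self.grad_target_func  # unchanged when the flag is False

        def grouped(parameters, data):
            # Gradient of the scalar group target, unsqueezed so downstream
            # code sees a single (1, num_params) test-side gradient row.
            return grad(self.target_func)(parameters, data).unsqueeze(0)

        return grouped


f = lambda params, data: (params * data).sum()
task = TaskStub(f, group_target_func=True)
g = task.get_grad_target_func()(torch.ones(3), torch.arange(3.0))
print(g.shape)  # torch.Size([1, 3])

task2 = TaskStub(f)  # flag defaults to False
print(task2.get_grad_target_func() is task2.grad_target_func)  # True
```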
```python
        normalized_grad=False,
        device=args.device,
    )
    attributor.projector_kwargs = None
```

Why do we need this line?
```python
        model=model,
        checkpoints=model.state_dict(),
        target_func=f,
        group_target_func=group_target_func,
```

This should instead be:

```python
        ...
        target_func=group_target_func,
        group_target_func=True,
        ...
```
```python
    print(f"Calculated Scores (first 10):\n{scores.flatten()[:10]}")
    print(f"Calculated Scores Temp sum over test (first 10):\n{scores_temp.sum(dim=1)[:10]}")
    diff = (scores.flatten() - scores_temp.sum(dim=1)).abs()
    print(f"Max |group - sum(per-test)|: {diff.max().item():.6f}")
```
What's the output of this script? Could you paste it here?

Output:

```
Test Dataloader Group (AttributionTask + group_target_func=True) — MNIST + MLP.
Score Shape: torch.Size([10000, 1])
Calculated Scores (first 10):
tensor([-2.2991e+00,  1.0665e-04, -1.4294e-01,  9.3012e-05,  1.6025e-01,
        -2.3018e-02,  8.6976e-06,  1.5331e-07, -1.1255e-02,  8.6521e-07])
Calculated Scores Temp sum over test (first 10):
tensor([-2.2992e+00,  1.0665e-04, -1.4294e-01,  9.3012e-05,  1.6025e-01,
        -2.3017e-02,  8.6975e-06,  1.5331e-07, -1.1255e-02,  8.6522e-07])
Max |group - sum(per-test)|: 0.005127
```
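The near-agreement in this paste is expected: with a sum-of-losses group target, the group gradient is the sum of the per-sample test gradients, so the group score column should equal `scores_temp.sum(dim=1)` up to floating-point error. A small self-contained check of that identity, using synthetic stand-in gradients rather than real TracIn outputs:

```python
# Synthetic check: dotting train gradients against the summed test gradient
# equals summing the per-test dot products (linearity of the inner product).
import torch

torch.manual_seed(0)
test_grads = torch.randn(4, 3)    # stand-in per-test-sample gradients
train_grads = torch.randn(10, 3)  # stand-in per-train-sample gradients

scores_group = train_grads @ test_grads.sum(dim=0)  # (10,) group scores
scores_temp = train_grads @ test_grads.T            # (10, 4) per-test scores
print(torch.allclose(scores_group, scores_temp.sum(dim=1), atol=1e-5))  # True
```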
Please also fix the lint error.
Description

Add TracIn test_dataloader_group to support memory-efficient computation.
Add example/test of test_dataloader_group.

1. Motivation and Context

Add an alternative way to get the score matrix with less memory used.

2. Summary of the change

- Add TestDataloaderGroup class to tracin.py
- Add test to test_tracin.py
- Add usage example
- Update GitHub example workflow

3. What tests have been added/updated for the change?