Add get_training_input to storage_utils #438

kmontemayor2-sc · 2026-01-16T18:58:45Z

Scope of work done

Add server-side util so we can remotely fetch the training input.

Since this is already a fairly big PR, I'm not adding the client-side equivalent in this one :P

Again this is server-side code, and it's really meant to be called by users.

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

mkolodner-sc

Thanks Kyle! Did a pass here and left some comments/questions

mkolodner-sc · 2026-01-20T22:57:39Z

python/gigl/distributed/graph_store/storage_utils.py

+
+
+def get_training_input(
+    split: Union[Literal["train", "val", "test"], str],


Is this sufficient? What if we want "all" nodes (i.e. dataset.node_ids)?

That wouldn't be "training input", would it? We can add another function to do that in the future (get_all_nodes?)

It could be for the random negative loader for link prediction training, which would need some dataset.node_ids or equivalent.

We can add another function to do that in the future (get_all_nodes?)

We wouldn't need a whole different function; we could just specify some 'all' split, and if it's that, we use _dataset.node_ids.
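For reference, the 'all' split suggestion could look roughly like the sketch below. Everything in it is hypothetical: `ToyDataset`, its per-split fields, and `select_node_ids` are stand-ins made up for illustration, plain lists replace tensors, and none of it is the actual storage_utils code.

```python
from dataclasses import dataclass, field
from typing import Literal, Union


@dataclass
class ToyDataset:
    """Hypothetical stand-in for the dataset; only the fields this sketch needs."""

    node_ids: list[int]
    train_node_ids: list[int] = field(default_factory=list)
    val_node_ids: list[int] = field(default_factory=list)
    test_node_ids: list[int] = field(default_factory=list)


def select_node_ids(
    dataset: ToyDataset,
    split: Union[Literal["train", "val", "test", "all"], str],
) -> list[int]:
    """Return the node IDs for a split; 'all' falls back to every node ID."""
    if split == "all":
        return dataset.node_ids
    per_split = {
        "train": dataset.train_node_ids,
        "val": dataset.val_node_ids,
        "test": dataset.test_node_ids,
    }
    if split not in per_split:
        raise ValueError(f"Unknown split: {split!r}")
    return per_split[split]
```

The trade-off is that one entry point covers both cases, but the 'all' branch can only return node IDs, not labels.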

Hmmmm, I do think that the tuple[Tensor, Tensor, Tensor | None] is important for the ABLP.
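To make the return-shape concern concrete, here is a minimal sketch of a three-element result with an optional third element. It assumes the tuple holds anchor node IDs, positive labels, and optional negative labels, which this thread does not spell out; `build_ablp_input` and `describe` are hypothetical names, and lists stand in for tensors.

```python
from typing import Optional

# Assumed layout (not confirmed in this thread):
# (anchor node IDs, positive labels per anchor, optional negative labels per anchor)
AblpInput = tuple[list[int], list[list[int]], Optional[list[list[int]]]]


def build_ablp_input(
    anchors: list[int],
    positives: list[list[int]],
    negatives: Optional[list[list[int]]] = None,
) -> AblpInput:
    """Bundle anchors with their labels, checking that the lengths line up."""
    if len(positives) != len(anchors):
        raise ValueError("need one positive-label list per anchor")
    if negatives is not None and len(negatives) != len(anchors):
        raise ValueError("need one negative-label list per anchor")
    return anchors, positives, negatives


def describe(ablp_input: AblpInput) -> str:
    """Show how a caller has to handle the optional third element."""
    anchors, _positives, negatives = ablp_input
    if negatives is None:
        return f"{len(anchors)} anchors, no negative labels"
    return f"{len(anchors)} anchors, with negative labels"
```

A plain list of all node IDs carries no labels, so it would not fit this shape without padding or a separate return type.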

I guess I could rename this to get_ablp_input or something? Would that ameliorate your concerns?

Discussed offline: renaming is fine for this function. In a follow-up we will refactor the get_node_ids_on_rank utility so that it can be used to split, which makes it extensible to the SNC use case and lets it be called from this function to reduce the duplication. Can we add a TODO here in the meantime?
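As a rough illustration of the follow-up, a shared per-rank helper might look like the sketch below. The real get_node_ids_on_rank is not shown in this thread, so `shard_for_rank` is a hypothetical name, the contiguous split is an assumed strategy, and lists stand in for tensors.

```python
def shard_for_rank(ids: list[int], rank: int, world_size: int) -> list[int]:
    """Return the contiguous slice of `ids` owned by `rank`.

    Hypothetical helper: the first `len(ids) % world_size` ranks each
    get one extra element, so shard sizes differ by at most one.
    """
    if world_size <= 0 or not 0 <= rank < world_size:
        raise ValueError(f"invalid rank {rank} for world_size {world_size}")
    base, extra = divmod(len(ids), world_size)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return ids[start:end]


# TODO: once the shared utility supports splits, call it here instead of
# duplicating the per-rank slicing logic.
```

With a helper like this, train/val/test IDs and the full node ID list could go through the same slicing path.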

python/gigl/distributed/graph_store/storage_utils.py

python/tests/test_assets/distributed/utils.py

python/tests/unit/distributed/graph_store/storage_utils_test.py

mkolodner-sc · 2026-01-21T18:10:01Z

python/gigl/distributed/graph_store/storage_utils.py

python/tests/unit/distributed/graph_store/storage_utils_test.py

mkolodner-sc

Thanks Kyle for the work!

kmonte added 2 commits January 16, 2026 18:55

Add get_training_input to storage_utils

b5038e5

fix

3b62990

mkolodner-sc reviewed Jan 20, 2026


kmonte added 3 commits January 21, 2026 00:41

Merge branch 'main' into kmonte/storage-util-training-input

b392b77

address comments

c9659fc

address comments

7028fa3

mkolodner-sc reviewed Jan 21, 2026


kmonte added 2 commits January 21, 2026 19:09

Merge branch 'main' into kmonte/storage-util-training-input

c2860e4

address comments

6659a82

mkolodner-sc approved these changes Jan 21, 2026


kmonte added 2 commits January 21, 2026 19:59

address comments

494b844

Merge branch 'main' into kmonte/storage-util-training-input

559f2ed


