Honor --num_workers in the eval dataloader #60

Open
TonyChen06 wants to merge 1 commit into ELM-Research:main from TonyChen06:perf/eval-loader-workers

@TonyChen06


First piece of splitting up #12 into simpler PRs, rebuilt against current main.

What

The eval DataLoader ignored --num_workers, so all per-item work (ECG loading, normalization, matplotlib rendering for the rgb representation, tokenization) ran on the main process between generate() calls. This passes num_workers and persistent_workers through, exactly like the train loader. Default behavior (--num_workers 0) is unchanged.
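For readers without the diff open, here is a minimal sketch of the kind of pass-through described above. It is not the code from this PR: `eval_loader_kwargs` and its `batch_size` parameter are hypothetical names, and the only library behavior relied on is that `torch.utils.data.DataLoader` accepts `num_workers` and `persistent_workers` and rejects `persistent_workers=True` when `num_workers` is 0.

```python
def eval_loader_kwargs(num_workers: int, batch_size: int = 1) -> dict:
    """Build keyword arguments for the eval DataLoader (hypothetical helper).

    The returned dict is meant to be splatted into
    torch.utils.data.DataLoader(dataset, **kwargs).
    """
    if num_workers < 0:
        raise ValueError("num_workers must be >= 0")
    return {
        "batch_size": batch_size,
        # Eval order should stay deterministic, so no shuffling.
        "shuffle": False,
        # 0 keeps all per-item work on the main process (the old behavior).
        "num_workers": num_workers,
        # DataLoader raises ValueError for persistent_workers=True with
        # num_workers=0, so only keep workers alive when some exist.
        "persistent_workers": num_workers > 0,
    }
```

Assuming the flag is parsed into `args.num_workers` and the dataset is called `eval_dataset` (both illustrative), the call site would look like `DataLoader(eval_dataset, **eval_loader_kwargs(args.num_workers))`. With `--num_workers 0` the resulting arguments match a loader that never set either option.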

Verification

  • Items identical between num_workers=0 and num_workers=2 (tensor-equal elm_input_ids and CLIP pixel tensors over 8 batches, rgb representation).
  • Loading 8 rgb eval batches: 13.5 s → 6.2 s with 2 workers.
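The identity check in the first bullet could be reproduced with something like the sketch below. It assumes each batch is a dict keyed by field name; `first_mismatch` is a hypothetical helper, not part of the repository, and the `equal` argument is where a tensor comparison such as `torch.equal` would be supplied.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Mapping, Optional, Sequence, Tuple


def first_mismatch(
    loader_a: Iterable[Mapping[str, Any]],
    loader_b: Iterable[Mapping[str, Any]],
    keys: Sequence[str],
    n_batches: int = 8,
    equal: Callable[[Any, Any], bool] = lambda a, b: a == b,
) -> Optional[Tuple[int, str]]:
    """Return (batch_index, key) of the first differing field, else None.

    Only the first n_batches batches are compared; if one loader yields
    fewer batches, the comparison stops at the shorter one.
    """
    pairs = zip(islice(loader_a, n_batches), islice(loader_b, n_batches))
    for index, (batch_a, batch_b) in enumerate(pairs):
        for key in keys:
            if not equal(batch_a[key], batch_b[key]):
                return (index, key)
    return None
```

For tensor batches one would call it with `equal=torch.equal` and the fields of interest, for example `elm_input_ids` plus whatever key holds the CLIP pixel tensors (that key name is not given in this PR). A return value of `None` means the two loaders agreed on every compared field.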

The eval DataLoader ignored --num_workers, so all per-item work
(ECG loading, normalization, image rendering for the rgb
representation, tokenization) ran on the main process between
generate() calls. Pass num_workers and persistent_workers through,
matching the train loader.

Items are unchanged (verified identical between num_workers=0 and 2);
loading 8 rgb eval batches drops from 13.5s to 6.2s with 2 workers.
The default (--num_workers 0) behaves exactly as before.