fix(synthetic): align _SyntheticTextExamplesIterable with HuggingFace…#645
arturofredes wants to merge 1 commit into vllm-project:main
… datasets API

- Add n_shards property (alias of num_shards). HuggingFace IterableDataset expects n_shards; without it, loading synthetic data raises NotImplementedError.
- Change shard_data_sources(worker_id, num_workers) to match the base _BaseExamplesIterable API; the previous signature (num_shards, index, contiguous) caused 'unexpected keyword argument worker_id' when using multiple DataLoader workers.

Fixes synthetic data loader when used with datasets.IterableDataset.

Signed-off-by: Arturo Fredes <arturofredesc@gmail.com>
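A minimal sketch of the two changes described in this commit, for readers who want to see their shape. It does not import datasets and is not the GuideLLM implementation: the class name SyntheticShardsSketch, the shard_ids attribute, and the round-robin split are all assumptions made for illustration. Only the n_shards / num_shards names and the (worker_id, num_workers) signature come from the description above.

```python
class SyntheticShardsSketch:
    """Hypothetical stand-in for an examples iterable (not the real class)."""

    def __init__(self, shard_ids):
        # shard_ids is an illustrative attribute, assumed for this sketch
        self.shard_ids = list(shard_ids)

    @property
    def num_shards(self) -> int:
        return len(self.shard_ids)

    @property
    def n_shards(self) -> int:
        # Alias of num_shards, as the commit message describes
        return self.num_shards

    def shard_data_sources(self, worker_id: int, num_workers: int):
        # Signature named in the commit message. The round-robin
        # assignment below is an assumption, chosen only for brevity.
        return SyntheticShardsSketch(self.shard_ids[worker_id::num_workers])


it = SyntheticShardsSketch(range(5))
worker_shards = [
    it.shard_data_sources(worker_id=w, num_workers=2).shard_ids
    for w in range(2)
]
```

With this signature, a caller that passes the other keyword set (num_shards, index, contiguous) fails with a TypeError, which mirrors the kind of mismatch the commit message reports in the opposite direction.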
Can you tell us what version of the datasets package you're using? The latest, 4.8.2, aligns with the code in the current GuideLLM. The change you're proposing seems to revert a […]. GuideLLM currently depends on the […]. Is there another constraint requiring you to use an old version of datasets?