feat: add show_progress option to dataset loaders and savers by satishkc7 · Pull Request #2181 · roboflow/supervision

satishkc7 · 2026-03-24T20:32:13Z

Summary

Closes #183

Adds an optional show_progress parameter to all time-consuming dataset operations so users can see loading/saving progress via a tqdm progress bar.

DetectionDataset.from_coco(show_progress=True)
DetectionDataset.from_pascal_voc(show_progress=True)
DetectionDataset.from_yolo(show_progress=True)
DetectionDataset.as_coco(show_progress=True)
DetectionDataset.as_yolo(show_progress=True)
DetectionDataset.as_pascal_voc(show_progress=True)

Details

Defaults to False, so the change is fully backward compatible and no existing code breaks
Uses tqdm.auto (already a project dependency) so progress bars work correctly in both terminal and Jupyter notebook environments
The parameter is propagated from the public DetectionDataset methods down to the internal format loader/saver functions
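The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `load_annotations` and its placeholder annotation dict are hypothetical, and the `ImportError` fallback exists only so the sketch runs where tqdm is not installed. The one real API relied on is tqdm's `disable` argument, which turns the wrapper into a plain pass-through iterator.

```python
from typing import Iterable

try:
    from tqdm.auto import tqdm  # picks a terminal or notebook bar automatically
except ImportError:  # assumption: fallback only so this sketch runs without tqdm
    def tqdm(iterable, **kwargs):
        return iterable


def load_annotations(image_names: Iterable[str], show_progress: bool = False) -> dict:
    """Hypothetical loader: maps each image name to a placeholder annotation."""
    annotations = {}
    # With disable=True tqdm prints nothing, so the default
    # (show_progress=False) behaves like the loop did before the flag existed.
    for name in tqdm(
        image_names, desc="Loading annotations", disable=not show_progress
    ):
        annotations[name] = {"boxes": []}
    return annotations


quiet = load_annotations(["a.jpg", "b.jpg"])                      # no bar
shown = load_annotations(["a.jpg", "b.jpg"], show_progress=True)  # bar on stderr
```

Both calls return the same dictionary; only the side effect on stderr differs.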

Test plan

Load a COCO dataset with show_progress=True and confirm the bar renders
Load a YOLO dataset with show_progress=True and confirm the bar renders
Load a Pascal VOC dataset with show_progress=True and confirm the bar renders
Confirm existing tests pass with no changes (default show_progress=False)

CLAassistant · 2026-03-24T20:32:20Z

All committers have signed the CLA.

Add optional tqdm progress bars to all time-consuming dataset operations. Addresses roboflow#183.

- load_coco_annotations / DetectionDataset.from_coco
- load_pascal_voc_annotations / DetectionDataset.from_pascal_voc
- load_yolo_annotations / DetectionDataset.from_yolo
- save_dataset_images / DetectionDataset.as_coco / as_yolo / as_pascal_voc

The show_progress parameter defaults to False for full backward compatibility. Uses tqdm.auto so progress bars work in both terminal and Jupyter notebook environments.

codecov · 2026-03-26T13:12:40Z

Codecov Report

❌ Patch coverage is 50.84746% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 76%. Comparing base (d94db74) to head (c9201d0).
⚠️ Report is 1 commit behind head on develop.

❌ Your patch check has failed because the patch coverage (51%) is below the target coverage (95%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (76%) is below the target coverage (95%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##           develop   #2181   +/-   ##
=======================================
- Coverage       76%     76%   -0%     
=======================================
  Files           62      62           
  Lines         7547    7561   +14     
=======================================
+ Hits          5714    5722    +8     
- Misses        1833    1839    +6

Copilot

Pull request overview

Adds an optional show_progress: bool = False flag to dataset load/export APIs so long-running dataset operations can display a tqdm progress bar (via tqdm.auto) without breaking existing callers.

Changes:

Add show_progress to DetectionDataset.from_* and DetectionDataset.as_* public methods and propagate it into format-specific loaders.
Wrap COCO/YOLO/Pascal VOC annotation loading loops with tqdm progress bars (disabled by default).
Add show_progress support to save_dataset_images (image export) with a tqdm progress bar.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

File	Description
src/supervision/dataset/utils.py	Adds `show_progress` to `save_dataset_images` and wraps image export in `tqdm`.
src/supervision/dataset/formats/yolo.py	Adds `show_progress` to YOLO loader and wraps per-image annotation processing in `tqdm`.
src/supervision/dataset/formats/pascal_voc.py	Adds `show_progress` to Pascal VOC loader and wraps per-image annotation processing in `tqdm`.
src/supervision/dataset/formats/coco.py	Adds `show_progress` to COCO loader and wraps per-image annotation processing in `tqdm`.
src/supervision/dataset/core.py	Exposes `show_progress` on `DetectionDataset.from_`/`as_` and forwards it into image saving + loaders.

Comments suppressed due to low confidence (3)

src/supervision/dataset/core.py:567

show_progress is only applied to save_dataset_images(...) here; exporting annotations (and even data.yaml) can be the slow part for large datasets, and currently has no progress indication. Consider propagating show_progress into save_yolo_annotations/save_data_yaml (and adding tqdm there), or rename/clarify the parameter/docs so users don’t expect progress when only saving annotations.

        if images_directory_path is not None:
            save_dataset_images(
                dataset=self,
                images_directory_path=images_directory_path,
                show_progress=show_progress,
            )
        if annotations_directory_path is not None:
            save_yolo_annotations(
                dataset=self,
                annotations_directory_path=annotations_directory_path,
                min_image_area_percentage=min_image_area_percentage,
                max_image_area_percentage=max_image_area_percentage,
                approximation_percentage=approximation_percentage,
            )
        if data_yaml_path is not None:
            save_data_yaml(data_yaml_path=data_yaml_path, classes=self.classes)

src/supervision/dataset/core.py:680

show_progress is only wired to image saving; save_coco_annotations(...) can be time-consuming on large datasets but gets no progress indication. Consider adding show_progress support to save_coco_annotations (and forwarding it from here) so as_coco(show_progress=True) provides feedback even when only saving annotations.

        if images_directory_path is not None:
            save_dataset_images(
                dataset=self,
                images_directory_path=images_directory_path,
                show_progress=show_progress,
            )
        if annotations_path is not None:
            save_coco_annotations(
                dataset=self,
                annotation_path=annotations_path,
                min_image_area_percentage=min_image_area_percentage,
                max_image_area_percentage=max_image_area_percentage,
                approximation_percentage=approximation_percentage,
            )

src/supervision/dataset/core.py:388

as_pascal_voc(..., show_progress=...) currently only shows progress for image saving; when exporting only annotations (or when annotations dominate runtime), users won’t see any progress. Consider wrapping the annotations export loop in a tqdm controlled by show_progress (or adjusting the parameter/docs to reflect that it only applies to image saving).

        if images_directory_path:
            save_dataset_images(
                dataset=self,
                images_directory_path=images_directory_path,
                show_progress=show_progress,
            )
        if annotations_directory_path:
            Path(annotations_directory_path).mkdir(parents=True, exist_ok=True)
            for image_path, image, annotations in self:
                annotation_name = Path(image_path).stem
                annotations_path = os.path.join(
                    annotations_directory_path, f"{annotation_name}.xml"
                )
                image_name = Path(image_path).name
                pascal_voc_xml = detections_to_pascal_voc(
                    detections=annotations,
                    classes=self.classes,
                    filename=image_name,
                    image_shape=image.shape,
                    min_image_area_percentage=min_image_area_percentage,
                    max_image_area_percentage=max_image_area_percentage,
                    approximation_percentage=approximation_percentage,
                )

                with open(annotations_path, "w") as f:
                    f.write(pascal_voc_xml)
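The three suppressed comments make the same point: the flag should reach the annotation export step as well as the image export step. A rough sketch of that forwarding is below. All function names here (`export_dataset`, `save_images`, `save_annotations`) are hypothetical stand-ins rather than supervision's actual helpers, and the tqdm fallback is only there so the sketch runs without tqdm installed.

```python
import tempfile
from pathlib import Path
from typing import Optional

try:
    from tqdm.auto import tqdm
except ImportError:  # assumption: fallback only so this sketch runs without tqdm
    def tqdm(iterable, **kwargs):
        return iterable


def save_images(images: dict, directory: str, show_progress: bool = False) -> None:
    """Hypothetical image saver: writes raw bytes, one file per image."""
    Path(directory).mkdir(parents=True, exist_ok=True)
    for name, data in tqdm(
        images.items(), desc="Saving images", disable=not show_progress
    ):
        (Path(directory) / name).write_bytes(data)


def save_annotations(
    annotations: dict, directory: str, show_progress: bool = False
) -> None:
    """Hypothetical annotation saver: writes one XML file per image."""
    Path(directory).mkdir(parents=True, exist_ok=True)
    for name, xml in tqdm(
        annotations.items(), desc="Saving annotations", disable=not show_progress
    ):
        (Path(directory) / f"{Path(name).stem}.xml").write_text(xml)


def export_dataset(
    images: dict,
    annotations: dict,
    images_dir: Optional[str] = None,
    annotations_dir: Optional[str] = None,
    show_progress: bool = False,
) -> None:
    # The same flag is forwarded to both steps, so an annotations-only
    # export still reports progress.
    if images_dir is not None:
        save_images(images, images_dir, show_progress=show_progress)
    if annotations_dir is not None:
        save_annotations(annotations, annotations_dir, show_progress=show_progress)


with tempfile.TemporaryDirectory() as tmp:
    export_dataset(
        {}, {"a.jpg": "<annotation/>"}, annotations_dir=tmp, show_progress=True
    )
    written = sorted(p.name for p in Path(tmp).iterdir())
```

With this shape, an annotations-only export still shows a bar, which is the case the comments flag as missing.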

Copilot · 2026-03-26T13:17:17Z

src/supervision/dataset/formats/yolo.py

+                    f"Images must be 'RGB' or 'grayscale', \
                but {image_path} mode is '{image.mode}'."


The ValueError message here is split across two source lines with a backslash continuation inside the string literal. Python drops the backslash and the newline, but the next line's leading indentation stays in the message, which makes it noisy and hard to read. Please format this as a single-line string, or use implicit concatenation of adjacent string literals instead of a backslash continuation.

Suggested change

-                    f"Images must be 'RGB' or 'grayscale', \
-                but {image_path} mode is '{image.mode}'."
+                    f"Images must be 'RGB' or 'grayscale', but {image_path} mode is '{image.mode}'."
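To see why the continued string is noisy, the short sketch below compares a backslash-continued literal with implicit concatenation of adjacent literals. The values of `image_path` and `mode` are made up for illustration.

```python
image_path = "img.png"  # made-up values for illustration
mode = "CMYK"

# Backslash continuation inside the literal: the backslash and newline are
# dropped, but the next line's leading spaces become part of the string.
continued = f"Images must be 'RGB' or 'grayscale', \
        but {image_path} mode is '{mode}'."

# Adjacent literals are joined at compile time with nothing in between.
joined = (
    f"Images must be 'RGB' or 'grayscale', "
    f"but {image_path} mode is '{mode}'."
)

print(repr(continued))
print(repr(joined))
```

The first `repr` shows a run of spaces before "but"; the second shows the clean single-space message.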

Copilot · 2026-03-26T13:17:17Z

src/supervision/dataset/formats/yolo.py

+            image = Image.open(image_path)
+            lines = read_txt_file(file_path=annotation_path, skip_empty=True)
+            w, h = image.size
+            resolution_wh = (w, h)
+            if image.mode not in ("RGB", "L"):
+                raise ValueError(
+                    f"Images must be 'RGB' or 'grayscale', \
                but {image_path} mode is '{image.mode}'."
+                )


Image.open(image_path) keeps the underlying file handle open until the image is closed. In a large dataset loop this can exhaust file descriptors; use a context manager (with Image.open(...) as img:) or explicitly close the image after reading size/mode.

Suggested change

-            image = Image.open(image_path)
-            lines = read_txt_file(file_path=annotation_path, skip_empty=True)
-            w, h = image.size
-            resolution_wh = (w, h)
-            if image.mode not in ("RGB", "L"):
-                raise ValueError(
-                    f"Images must be 'RGB' or 'grayscale', \
-                but {image_path} mode is '{image.mode}'."
-                )
+            lines = read_txt_file(file_path=annotation_path, skip_empty=True)
+            with Image.open(image_path) as image:
+                w, h = image.size
+                resolution_wh = (w, h)
+                if image.mode not in ("RGB", "L"):
+                    raise ValueError(
+                        f"Images must be 'RGB' or 'grayscale', \
+                    but {image_path} mode is '{image.mode}'."
+                    )
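As a self-contained illustration of the context-manager form, the sketch below reads only the size and mode of an image and lets the `with` block close the file. It assumes Pillow is installed. `read_size_and_mode` is a hypothetical helper, not a function from the YOLO loader, and the sample image is generated on the fly.

```python
import os
import tempfile

from PIL import Image


def read_size_and_mode(image_path: str) -> tuple:
    """Hypothetical helper: return ((width, height), mode) for an image file."""
    # The with-block closes the underlying file on exit, including when the
    # ValueError below is raised, so file handles do not pile up in a loop.
    with Image.open(image_path) as image:
        if image.mode not in ("RGB", "L"):
            raise ValueError(
                f"Images must be 'RGB' or 'grayscale', "
                f"but {image_path} mode is '{image.mode}'."
            )
        return image.size, image.mode


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.png")
    Image.new("RGB", (4, 3)).save(path)  # tiny generated test image
    size, mode = read_size_and_mode(path)
```

Size and mode come from the image header, so no pixel data has to be decoded before the file is closed.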

Copilot · 2026-03-26T13:17:17Z

src/supervision/dataset/core.py

    @classmethod
    def from_coco(
        cls,
        images_directory_path: str,
        annotations_path: str,
        force_masks: bool = False,
+        show_progress: bool = False,
    ) -> DetectionDataset:


New show_progress code paths (both loaders and exporters) aren’t covered by tests. Please add unit/integration tests that call from_coco/from_yolo/from_pascal_voc and as_coco/as_yolo/as_pascal_voc with show_progress=True (ideally by monkeypatching tqdm.auto.tqdm to a no-op) to ensure the option doesn’t change outputs or raise exceptions.

- Add show_progress param to save_yolo_annotations and save_coco_annotations
- Wrap Pascal VOC annotation export loop in as_pascal_voc with tqdm
- Pass show_progress through as_yolo and as_coco to their annotation savers
- Add test_show_progress.py covering all loaders and savers with both show_progress=True and show_progress=False

satishkc7 · 2026-03-27T16:04:46Z

The Codecov report (51% patch coverage) is from the first commit and is now stale.

Since then, two follow-up commits were pushed:

Propagated show_progress into all annotation savers: save_yolo_annotations, save_coco_annotations, and the Pascal VOC annotation loop in as_pascal_voc
Added tests/dataset/test_show_progress.py with 20 tests covering every loader and saver (both show_progress=True and show_progress=False, plus output consistency checks for each)

All 20 tests pass. Could you re-trigger CI so Codecov picks up the updated coverage? Thanks!
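For readers who want a feel for what an output-consistency check of this kind could look like, here is a rough sketch. It is not taken from tests/dataset/test_show_progress.py: `load_label_counts` is a hypothetical stand-in for a real loader, and the tqdm fallback is only there so the sketch runs without tqdm installed.

```python
import io
from contextlib import redirect_stderr

try:
    from tqdm.auto import tqdm
except ImportError:  # assumption: fallback only so this sketch runs without tqdm
    def tqdm(iterable, **kwargs):
        return iterable


def load_label_counts(label_files: dict, show_progress: bool = False) -> dict:
    """Hypothetical loader: counts non-empty label lines per file."""
    counts = {}
    for name, text in tqdm(
        label_files.items(), desc="Loading labels", disable=not show_progress
    ):
        counts[name] = sum(1 for line in text.splitlines() if line.strip())
    return counts


def test_show_progress_does_not_change_output() -> None:
    label_files = {"a.txt": "0 0.5 0.5 0.1 0.1\n", "b.txt": "", "c.txt": "1 0 0 1 1\n\n"}
    # Swallow the bar so the test output stays clean.
    with redirect_stderr(io.StringIO()):
        with_bar = load_label_counts(label_files, show_progress=True)
    without_bar = load_label_counts(label_files, show_progress=False)
    assert with_bar == without_bar == {"a.txt": 1, "b.txt": 0, "c.txt": 1}


test_show_progress_does_not_change_output()
```

Redirecting stderr is one simple way to keep the bar out of test logs; monkeypatching tqdm in the module under test, as the review comment suggests, is another.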

satishkc7 requested a review from SkalskiP as a code owner March 24, 2026 20:32

satishkc7 force-pushed the feat/progress-bars-dataset-ops branch from b9abdbd to c9201d0 Compare March 24, 2026 21:16

Borda requested a review from Copilot March 26, 2026 13:11

Copilot started reviewing on behalf of Borda March 26, 2026 13:12

Copilot AI reviewed Mar 26, 2026

Borda self-assigned this Mar 26, 2026

satishkc7 and others added 3 commits March 26, 2026 09:35

fix(pre_commit): 🎨 auto format pre-commit hooks

e1b3220

fix: close Image.open with context manager and clean up ValueError me…

a9c024b

…ssage in YOLO loader

satishkc7 wants to merge 4 commits into roboflow:develop from satishkc7:feat/progress-bars-dataset-ops

4 participants
