Skip to content

Optimize exstruct.engine lazy imports for issue #109#113

Merged
harumiWeb merged 1 commit intomainfrom
feat/cli-optimize
Mar 21, 2026
Merged

Optimize exstruct.engine lazy imports for issue #109#113
harumiWeb merged 1 commit intomainfrom
feat/cli-optimize

Conversation

@harumiWeb
Copy link
Copy Markdown
Owner

@harumiWeb harumiWeb commented Mar 21, 2026

Summary

  • defer heavy runtime imports in exstruct.engine via lazy compatibility wrappers
  • add regression coverage for lightweight exstruct.engine and ExStructEngine imports
  • document the engine import boundary in the architecture overview

Testing

  • uv run pytest tests/cli/test_cli_lazy_imports.py tests/engine/test_engine.py tests/engine/test_engine_alpha_col.py tests/backends/test_auto_page_breaks.py tests/core/test_mode_output.py -m "not com" -q
  • uv run task precommit-run
  • uv run task test-unit

Fixes #109


Open with Devin

Summary by CodeRabbit

Documentation

  • Updated architecture documentation clarifying module import constraints and lazy-loading guarantees for core components.

Tests

  • Added verification tests for module import-time behavior to ensure dependencies load only when needed.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 21, 2026

📝 Walkthrough

Walkthrough

The changes implement lazy-loading wrapper functions in exstruct.engine to defer imports of heavy extraction/rendering dependencies until invoked. Nine public functions now wrap their implementations with on-demand imports. Documentation is updated to reflect this import constraint, and tests verify that importing the engine module alone avoids loading heavy transitive dependencies.

Changes

Cohort / File(s) Summary
Documentation Updates
dev-docs/architecture/overview.md
Updated architecture documentation to explicitly reference exstruct.engine in side-effect-free import constraints and emphasize deferring heavy extraction/edit implementation imports until command execution.
Lazy Import Wrappers
src/exstruct/engine.py
Replaced direct top-level imports of extraction, serialization, and rendering functions with lightweight wrapper functions that lazily import implementations on demand. Added 9 new public wrapper functions: set_table_detection_params, extract_workbook, convert_workbook_keys_to_alpha, serialize_workbook, save_sheets, save_print_area_views, save_auto_page_break_views, export_pdf, and export_sheet_images. Updated table detection config access to use lazy-loaded module reference.
Lazy Import Tests
tests/cli/test_cli_lazy_imports.py
Added two new test functions verifying that importing exstruct.engine directly and accessing ExStructEngine from the package root do not eagerly load heavy submodules (exstruct.core.cells, exstruct.core.integrate, exstruct.io, exstruct.render) or third-party dependencies (numpy, pandas, openpyxl, xlwings, PIL).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through imports light,
Deferring loads till runtime's right,
Wrappers dance with lazy grace,
Heavy libs stay out of place—
Fast begins the journey's flight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 73.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly describes the main change: optimizing exstruct.engine lazy imports to resolve issue #109.
Description check ✅ Passed The description addresses the scope (lazy imports, regression tests, documentation), links issue #109, and provides testing commands, but uses a template mismatched for this PR.
Linked Issues check ✅ Passed All key acceptance criteria from issue #109 are met: lazy imports reduce import overhead, public API compatibility is preserved through wrapper functions, CLI startup cost is reduced, and type-checking remains usable.
Out of Scope Changes check ✅ Passed All changes align with issue #109 objectives: lazy wrapper functions in engine.py, import-time optimization documentation, and regression tests for lightweight imports.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/cli-optimize

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can disable poems in the walkthrough.

Disable the reviews.poem setting to disable the poems in the walkthrough.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the exstruct.engine import path to stay lightweight by deferring extraction/IO/render-related imports until runtime, aligning with the project’s “side-effect-free startup” goals and issue #109.

Changes:

  • Replaced eager imports in src/exstruct/engine.py with lazy wrapper functions for extraction, serialization, and rendering entry points.
  • Added regression tests ensuring import exstruct.engine and from exstruct import ExStructEngine do not eagerly load heavy dependencies/modules.
  • Documented exstruct.engine as part of the lightweight import boundary in the architecture overview.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
tests/cli/test_cli_lazy_imports.py Adds probes asserting exstruct.engine / ExStructEngine imports don’t pull in core/io/render or heavy deps.
src/exstruct/engine.py Introduces lazy proxy wrappers and removes module-level heavy imports to keep import-time lightweight.
dev-docs/architecture/overview.md Updates architecture guidance to include exstruct.engine in the lightweight/side-effect-free import boundary.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Open in Devin Review

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
dev-docs/architecture/overview.md (1)

101-105: Add engine.py to the top-level module map.

This paragraph now treats exstruct.engine as a first-class boundary, but the structure tree above still omits engine.py, so the overview is slightly out of sync for readers scanning the file layout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dev-docs/architecture/overview.md` around lines 101 - 105, The module map in
the overview is missing engine.py; update the top-level module list to include
engine.py as exstruct.engine so the documented structure matches the paragraph
that treats exstruct.engine as a first-class boundary; ensure the entry is
listed alongside exstruct.__init__ and exstruct.edit.__init__ so readers
scanning the file layout see engine.py referenced consistently with the text.
src/exstruct/engine.py (1)

23-217: Preserve the richer public docs on these wrappers.

The signatures stay compatible, but help() / generated API docs for exstruct.engine.extract_workbook, serialize_workbook, and the other wrapper entrypoints now collapse to proxy one-liners. Consider copying over the underlying public docs so the lazy indirection does not reduce discoverability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/exstruct/engine.py` around lines 23 - 217, The wrapper functions (e.g.,
extract_workbook, serialize_workbook, save_sheets, save_print_area_views,
save_auto_page_break_views, export_pdf, export_sheet_images,
convert_workbook_keys_to_alpha, set_table_detection_params) currently have
one-line docstrings that hide the richer docs from their underlying
implementations; update each wrapper to copy the implementation's __doc__ so
help() and generated docs show the original API details—do this by performing
the same lazy import used in the function (e.g., import extract_workbook_impl
from .core.integrate inside extract_workbook) and then assign the wrapper's
__doc__ = extract_workbook_impl.__doc__ (or use functools.wraps on the wrapper
where appropriate) so the public docstrings are preserved without changing the
lazy import behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@dev-docs/architecture/overview.md`:
- Around line 101-105: The module map in the overview is missing engine.py;
update the top-level module list to include engine.py as exstruct.engine so the
documented structure matches the paragraph that treats exstruct.engine as a
first-class boundary; ensure the entry is listed alongside exstruct.__init__ and
exstruct.edit.__init__ so readers scanning the file layout see engine.py
referenced consistently with the text.

In `@src/exstruct/engine.py`:
- Around line 23-217: The wrapper functions (e.g., extract_workbook,
serialize_workbook, save_sheets, save_print_area_views,
save_auto_page_break_views, export_pdf, export_sheet_images,
convert_workbook_keys_to_alpha, set_table_detection_params) currently have
one-line docstrings that hide the richer docs from their underlying
implementations; update each wrapper to copy the implementation's __doc__ so
help() and generated docs show the original API details—do this by performing
the same lazy import used in the function (e.g., import extract_workbook_impl
from .core.integrate inside extract_workbook) and then assign the wrapper's
__doc__ = extract_workbook_impl.__doc__ (or use functools.wraps on the wrapper
where appropriate) so the public docstrings are preserved without changing the
lazy import behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 65335363-9dac-4366-80b3-d26b70aa091a

📥 Commits

Reviewing files that changed from the base of the PR and between c0c8c52 and f0051ea.

📒 Files selected for processing (3)
  • dev-docs/architecture/overview.md
  • src/exstruct/engine.py
  • tests/cli/test_cli_lazy_imports.py

@harumiWeb harumiWeb merged commit b50447c into main Mar 21, 2026
16 checks passed
@harumiWeb harumiWeb deleted the feat/cli-optimize branch March 21, 2026 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Package import optimization: lighten exstruct.__init__ and introduce lazy exports

2 participants