Optimize exstruct.engine lazy imports for issue #109#113
Conversation
📝 WalkthroughWalkthroughThe changes implement lazy-loading wrapper functions in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can disable poems in the walkthrough.Disable the |
There was a problem hiding this comment.
Pull request overview
This PR optimizes the exstruct.engine import path to stay lightweight by deferring extraction/IO/render-related imports until runtime, aligning with the project’s “side-effect-free startup” goals and issue #109.
Changes:
- Replaced eager imports in
src/exstruct/engine.pywith lazy wrapper functions for extraction, serialization, and rendering entry points. - Added regression tests ensuring
import exstruct.engineandfrom exstruct import ExStructEnginedo not eagerly load heavy dependencies/modules. - Documented
exstruct.engineas part of the lightweight import boundary in the architecture overview.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/cli/test_cli_lazy_imports.py | Adds probes asserting exstruct.engine / ExStructEngine imports don’t pull in core/io/render or heavy deps. |
| src/exstruct/engine.py | Introduces lazy proxy wrappers and removes module-level heavy imports to keep import-time lightweight. |
| dev-docs/architecture/overview.md | Updates architecture guidance to include exstruct.engine in the lightweight/side-effect-free import boundary. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
dev-docs/architecture/overview.md (1)
101-105: Addengine.pyto the top-level module map.This paragraph now treats
exstruct.engineas a first-class boundary, but the structure tree above still omitsengine.py, so the overview is slightly out of sync for readers scanning the file layout.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev-docs/architecture/overview.md` around lines 101 - 105, The module map in the overview is missing engine.py; update the top-level module list to include engine.py as exstruct.engine so the documented structure matches the paragraph that treats exstruct.engine as a first-class boundary; ensure the entry is listed alongside exstruct.__init__ and exstruct.edit.__init__ so readers scanning the file layout see engine.py referenced consistently with the text.src/exstruct/engine.py (1)
23-217: Preserve the richer public docs on these wrappers.The signatures stay compatible, but
help()/ generated API docs forexstruct.engine.extract_workbook,serialize_workbook, and the other wrapper entrypoints now collapse to proxy one-liners. Consider copying over the underlying public docs so the lazy indirection does not reduce discoverability.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/exstruct/engine.py` around lines 23 - 217, The wrapper functions (e.g., extract_workbook, serialize_workbook, save_sheets, save_print_area_views, save_auto_page_break_views, export_pdf, export_sheet_images, convert_workbook_keys_to_alpha, set_table_detection_params) currently have one-line docstrings that hide the richer docs from their underlying implementations; update each wrapper to copy the implementation's __doc__ so help() and generated docs show the original API details—do this by performing the same lazy import used in the function (e.g., import extract_workbook_impl from .core.integrate inside extract_workbook) and then assign the wrapper's __doc__ = extract_workbook_impl.__doc__ (or use functools.wraps on the wrapper where appropriate) so the public docstrings are preserved without changing the lazy import behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@dev-docs/architecture/overview.md`:
- Around line 101-105: The module map in the overview is missing engine.py;
update the top-level module list to include engine.py as exstruct.engine so the
documented structure matches the paragraph that treats exstruct.engine as a
first-class boundary; ensure the entry is listed alongside exstruct.__init__ and
exstruct.edit.__init__ so readers scanning the file layout see engine.py
referenced consistently with the text.
In `@src/exstruct/engine.py`:
- Around line 23-217: The wrapper functions (e.g., extract_workbook,
serialize_workbook, save_sheets, save_print_area_views,
save_auto_page_break_views, export_pdf, export_sheet_images,
convert_workbook_keys_to_alpha, set_table_detection_params) currently have
one-line docstrings that hide the richer docs from their underlying
implementations; update each wrapper to copy the implementation's __doc__ so
help() and generated docs show the original API details—do this by performing
the same lazy import used in the function (e.g., import extract_workbook_impl
from .core.integrate inside extract_workbook) and then assign the wrapper's
__doc__ = extract_workbook_impl.__doc__ (or use functools.wraps on the wrapper
where appropriate) so the public docstrings are preserved without changing the
lazy import behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 65335363-9dac-4366-80b3-d26b70aa091a
📒 Files selected for processing (3)
dev-docs/architecture/overview.mdsrc/exstruct/engine.pytests/cli/test_cli_lazy_imports.py
Summary
exstruct.enginevia lazy compatibility wrappersexstruct.engineandExStructEngineimportsTesting
uv run pytest tests/cli/test_cli_lazy_imports.py tests/engine/test_engine.py tests/engine/test_engine_alpha_col.py tests/backends/test_auto_page_breaks.py tests/core/test_mode_output.py -m "not com" -quv run task precommit-runuv run task test-unitFixes #109
Summary by CodeRabbit
Documentation
Tests