feat: replace eager submodule imports with lazy loading #5110

Open
dgandhi62 wants to merge 18 commits into main from python-import-planning

@dgandhi62 dgandhi62 commented May 12, 2026

Problem

import aws_cdk is slow in Python CDK apps because the generated __init__.py eagerly imports all ~300 child submodules via from . import <submodule>. This fixed cost is paid on every cdk synth, cdk deploy, Lambda cold start, and IDE analysis pass, regardless of how many services the app actually uses.

Solution

Replace the eager import block with PEP 562 lazy loading using module-level __getattr__ and __dir__.

Before (generated code)

# Loading modules to ensure their types are registered with the jsii runtime library
from . import aws_s3
from . import aws_ec2
from . import aws_lambda
# ... 300+ more

After (generated code)

import importlib as _importlib

_SUBMODULES = {
    "aws_ec2",
    "aws_lambda",
    "aws_s3",
    # ... sorted
}

def __getattr__(name: str) -> object:
    if name in _SUBMODULES:
        mod = _importlib.import_module(f".{name}", __name__)
        globals()[name] = mod
        return mod
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

def __dir__() -> "list[str]":
    return [*__all__, *_SUBMODULES]
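To see the deferred import in action, here is a minimal sketch that writes a throwaway package to a temp directory and uses the same `__getattr__`/`__dir__` pattern. The package name `lazy_demo_pkg` and its `alpha`/`beta` submodules are hypothetical and exist only for this illustration; they are not part of the generated code.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package name, used only for this demonstration.
PKG = "lazy_demo_pkg"

# Same shape as the generated __init__.py shown above, with two toy submodules.
INIT_SOURCE = textwrap.dedent('''
    import importlib as _importlib

    __all__ = ["alpha", "beta"]
    _SUBMODULES = {"alpha", "beta"}

    def __getattr__(name: str) -> object:
        if name in _SUBMODULES:
            mod = _importlib.import_module(f".{name}", __name__)
            globals()[name] = mod  # cache so __getattr__ is not hit again
            return mod
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

    def __dir__() -> "list[str]":
        return [*__all__, *_SUBMODULES]
''')

# Write the package to a temporary directory and make it importable.
root = Path(tempfile.mkdtemp())
pkg_dir = root / PKG
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(INIT_SOURCE)
(pkg_dir / "alpha.py").write_text("VALUE = 'alpha loaded'\n")
(pkg_dir / "beta.py").write_text("VALUE = 'beta loaded'\n")
sys.path.insert(0, str(root))

pkg = importlib.import_module(PKG)

# Importing the package alone does not import any submodule.
loaded_before = f"{PKG}.alpha" in sys.modules

# First attribute access goes through __getattr__ and imports only `alpha`.
alpha_value = pkg.alpha.VALUE
loaded_after = f"{PKG}.alpha" in sys.modules
beta_still_unloaded = f"{PKG}.beta" not in sys.modules
```

Only the submodule that is actually touched ends up in `sys.modules`; `beta` stays unloaded until something asks for it.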

Issues Encountered and Resolved after the Design Doc

Five issues were discovered during implementation. Each required a specific fix beyond the core lazy loading pattern:

Issue 1: pyright rejects list[str] return type on __dir__

Problem: The pyright test configures pythonVersion = "3.8". With that setting, list[str] is invalid as a runtime annotation (builtin list wasn't subscriptable until 3.9).

Fix: Quote the return type so it's a forward reference, not evaluated at runtime:

def __dir__() -> "list[str]":
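As a quick illustration (not taken from the PR), the snippet below shows why quoting helps: a quoted annotation is stored as a plain string, so nothing tries to subscript `list` when the function is defined. The function name `module_dir` is a stand-in for the generated `__dir__`.

```python
# Stand-in for the generated __dir__; the name is hypothetical.
def module_dir() -> "list[str]":
    return ["aws_s3"]

# The annotation is kept as a string and is never evaluated at definition time,
# so this definition also works on interpreters where list[str] would fail.
raw_annotation = module_dir.__annotations__["return"]
annotation_is_string = isinstance(raw_annotation, str)
```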

Issue 2: pyright flags submodule names in __all__ as undefined

Problem: Pyright performs static analysis. It can't see that __getattr__ will resolve submodule names at runtime, so it reports reportUnsupportedDunderAll for every submodule listed in __all__.

Fix: Emit a typing.TYPE_CHECKING guard with explicit re-exports:

if typing.TYPE_CHECKING:
    from . import aws_s3 as aws_s3
    from . import aws_lambda as aws_lambda

TYPE_CHECKING is True for static analyzers but False at runtime, so the guarded imports never execute and lazy loading is unaffected.
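A small sketch of that runtime behavior, assuming nothing beyond the standard library: the guarded import below names a module that deliberately does not exist (a hypothetical name), yet the script runs cleanly because the interpreter never enters the branch.

```python
import typing

if typing.TYPE_CHECKING:
    # Static analyzers follow this branch; the interpreter skips it.
    # The module name is hypothetical and intentionally not importable.
    import module_that_does_not_exist_for_demo as aws_s3

# At runtime the flag is False, so the import above was never attempted.
guard_ran_at_runtime = typing.TYPE_CHECKING
```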

Issue 3: publication.publish() breaks __getattr__ on the public module

Problem: publication.publish() replaces the module in sys.modules with a new ModuleType object that only copies names from __all__. It does NOT copy __getattr__ or __dir__. Since our lazy loading code is defined after publication.publish(), it lives on the original (now-private) module object.

Fix: After defining __getattr__ and __dir__, explicitly install them on the public module:

import sys as _sys
setattr(_sys.modules[__name__], "__getattr__", __getattr__)
setattr(_sys.modules[__name__], "__dir__", __dir__)

Defining them before publish() would not help (as far as I understand) because publish() only copies the names listed in __all__, so __getattr__ and __dir__ would still not be transferred to the public module.
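The effect can be reproduced without the publication library. In the sketch below, `fake_publish` is a hypothetical stand-in that mimics only the behavior described above (swap the `sys.modules` entry for a fresh `ModuleType` and copy the `__all__` names); it is not the library's real implementation, and `demo_published` is a made-up module name.

```python
import sys
import types


def fake_publish(name: str) -> None:
    """Hypothetical stand-in: replace the module with one exposing only __all__."""
    private = sys.modules[name]
    public = types.ModuleType(name)
    for exported in private.__all__:
        setattr(public, exported, getattr(private, exported))
    sys.modules[name] = public


# Build the "original" module by hand and publish it.
original = types.ModuleType("demo_published")
original.__all__ = ["CONSTANT"]
original.CONSTANT = 1
sys.modules["demo_published"] = original
fake_publish("demo_published")
public = sys.modules["demo_published"]


def _lazy_getattr(name: str) -> object:
    # Toy hook: resolves a single placeholder name instead of importing anything.
    if name == "aws_s3":
        return "lazy aws_s3 placeholder"
    raise AttributeError(name)


# Defined only on the original (now private) module: the public one cannot see it.
original.__getattr__ = _lazy_getattr
hook_visible_before = hasattr(public, "aws_s3")

# The fix: install the hook on whatever object sys.modules currently holds.
setattr(sys.modules["demo_published"], "__getattr__", _lazy_getattr)
hook_result_after = public.aws_s3
```

Module-level `__getattr__` is looked up in the module object's own namespace, which is why the hook has to be present on the public object rather than the one it was defined in.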

Issue 4: jsii runtime can't resolve types from unloaded submodules

Problem: With eager imports, all types were registered at import time. With lazy loading, if the jsii kernel returns a type from a submodule that hasn't been imported yet (e.g., a callback returns an object whose type lives in cdk16625.donotimport), the runtime raises Unknown type.

Fix: Added on-demand type resolution in _reference_map.py. When a type FQN isn't found in the registries, the runtime:

  1. Extracts the assembly name from the FQN
  2. Looks up the Python root module (registered during JSIIAssembly.load())
  3. Attempts to import the containing submodule (trying progressively shorter paths)
  4. Retries the type lookup after the import triggers registration
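The four steps above can be sketched roughly as follows. This is not the code in _reference_map.py: the `demo_registry` module, its `TYPES` and `ASSEMBLY_ROOTS` dictionaries, and `resolve_type` are all hypothetical names, and the sketch assumes that FQN segments map directly onto Python submodule names, which may not hold for every jsii assembly.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path
from typing import Optional

# Hypothetical registry module that generated code registers types with.
registry = types.ModuleType("demo_registry")
registry.TYPES = {}            # jsii FQN -> Python class
registry.ASSEMBLY_ROOTS = {}   # assembly name -> Python root module name
sys.modules["demo_registry"] = registry


def resolve_type(fqn: str) -> Optional[type]:
    """Look up a type, importing its containing submodule on a miss."""
    if fqn in registry.TYPES:
        return registry.TYPES[fqn]
    assembly, _, rest = fqn.partition(".")         # step 1: assembly name
    root = registry.ASSEMBLY_ROOTS.get(assembly)   # step 2: Python root module
    if root is None:
        return None
    parts = rest.split(".")[:-1]  # drop the trailing type name
    for end in range(len(parts), 0, -1):           # step 3: longest path first
        try:
            importlib.import_module(".".join([root, *parts[:end]]))
        except ImportError:
            continue  # no such module; try a shorter path
        if fqn in registry.TYPES:                  # step 4: retry the lookup
            return registry.TYPES[fqn]
    return None


# Build a throwaway package whose submodule self-registers a type on import.
tmp = Path(tempfile.mkdtemp())
(tmp / "demo_root" / "sub").mkdir(parents=True)
(tmp / "demo_root" / "__init__.py").write_text("")
(tmp / "demo_root" / "sub" / "__init__.py").write_text(
    "import demo_registry\n"
    "class Widget: pass\n"
    "demo_registry.TYPES['demo_asm.sub.Widget'] = Widget\n"
)
sys.path.insert(0, str(tmp))
registry.ASSEMBLY_ROOTS["demo_asm"] = "demo_root"

widget_cls = resolve_type("demo_asm.sub.Widget")  # triggers the import
missing = resolve_type("demo_asm.nope.Gadget")    # no such submodule
```

Catching `ImportError` broadly keeps the sketch short, but it would also hide genuine import failures inside a submodule, so a real implementation would likely want to be more selective.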

Issue 5: mypy rejects direct assignment to __getattr__ on a module

Problem: _sys.modules[__name__].__getattr__ = __getattr__ triggers mypy's Cannot assign to a method [method-assign] because mypy treats __getattr__ as a special method on ModuleType.

Fix: Use setattr() instead of direct assignment. Same runtime effect, bypasses mypy's check.

Testing

  • New unit tests: 12 tests in lazy-imports.test.ts covering all generated code patterns
  • Snapshot tests: All fixture package snapshots regenerated and passing
  • Static analysis: pyright and mypy pass on all generated code
  • Runtime: jsii-calc fixture exercises nested submodules (cdk16625.donotimport) to verify on-demand type resolution

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dgandhi62 dgandhi62 changed the title feat: Replace eager submodule imports with PEP 562 lazy loading feat: Replace eager submodule imports with PEP 562 lazy loading [DO NOT MERGE OR REVIEW YET] May 12, 2026
@mergify mergify Bot added the contribution/core This is a PR that came from AWS. label May 12, 2026
@dgandhi62 dgandhi62 force-pushed the python-import-planning branch 3 times, most recently from 46a9cc2 to fe1d535 Compare May 13, 2026 13:17
@dgandhi62 dgandhi62 force-pushed the python-import-planning branch from 7878cdb to 8d0c82c Compare May 20, 2026 17:43
@dgandhi62 dgandhi62 changed the title feat: Replace eager submodule imports with PEP 562 lazy loading [DO NOT MERGE OR REVIEW YET] feat: Replace eager submodule imports with PEP 562 lazy loading May 20, 2026
@dgandhi62 dgandhi62 changed the title feat: Replace eager submodule imports with PEP 562 lazy loading feat: replace eager submodule imports with PEP 562 lazy loading May 20, 2026
@dgandhi62 dgandhi62 changed the title feat: replace eager submodule imports with PEP 562 lazy loading feat: replace eager submodule imports with lazy loading May 20, 2026
@dgandhi62 dgandhi62 force-pushed the python-import-planning branch from 0cc9b13 to a6726f7 Compare May 21, 2026 18:53
dgandhi62 added 18 commits May 22, 2026 10:47
Replace eager "from . import <submodule>" statements in generated
__init__.py files with a PEP 562 lazy loading mechanism using
module-level __getattr__ and __dir__. This defers submodule imports
until first access, dramatically reducing initial import time for
large libraries like aws-cdk-lib.

Generated modules now emit:
- "import importlib as _importlib" (only when submodules exist)
- _SUBMODULES set with sorted submodule short names
- __getattr__ that lazily imports and caches submodules
- __dir__ that returns [*__all__, *_SUBMODULES]

Assembly-loading modules are unaffected (they never have child
submodules, enforced by existing assert in addPythonModule).

All access patterns remain backwards-compatible:
- import aws_cdk.aws_s3 (Python resolves directly)
- from aws_cdk import aws_s3 (triggers __getattr__)
- aws_cdk.aws_s3 (triggers __getattr__)
- from aws_cdk import * (triggers __getattr__ for each __all__ entry)

Add typing.TYPE_CHECKING guard with explicit submodule imports
so pyright can statically see names listed in __all__. Quote
the __dir__ return annotation to avoid reportIndexIssue when
pyright evaluates with pythonVersion < 3.9.

- Add typing.TYPE_CHECKING guard with explicit submodule imports so
  pyright can statically see names in __all__
- Quote __dir__ return annotation to avoid reportIndexIssue
- Install __getattr__/__dir__ on the public module post-publication
  so attribute access works through the publication barrier
- Add on-demand type resolution in _reference_map.py to import
  submodules when the jsii kernel returns unknown types
@dgandhi62 dgandhi62 force-pushed the python-import-planning branch from a6726f7 to ced5b7f Compare May 22, 2026 14:48
lives in an unloaded submodule), this function triggers the import so that
the type self-registers with the runtime.

The FQN format is: ``assembly_name.submodule.path.TypeName``
Collaborator

Unfortunately I don't think jsii FQNs map cleanly onto Python import paths like this.

Pretty sure that the author of a jsii module can map whatever submodule they want to whatever Python module path. Plus, there are types-in-types, so the last parts of the FQN might not be modules.

Collaborator

Here's an interesting question: CAN we deterministically find a Python type given a jsii FQN? (All the information should be in the assembly)

Because if we can, we can fully get rid of the registering-types-by-fqns-on-startup business that we have going on here!

Collaborator

These tests assert that the generated Python code looks just-so. Those are very brittle: as soon as we change anything about the implementation, these tests will break.

Instead, I'd rather test behavior: a Python library generated with this lazy lookup has the following behavior: XYZ (and in fact, probably just "it works the same as it did before" might be good enough 😉 )
