Skip to content

fix(targets): reuse SQLAlchemy inspector reflection cache#3665

Open
akashmalbari wants to merge 7 commits into
meltano:mainfrom
akashmalbari:fix-sqlconnector-inspector-cache
Open

fix(targets): reuse SQLAlchemy inspector reflection cache#3665
akashmalbari wants to merge 7 commits into
meltano:mainfrom
akashmalbari:fix-sqlconnector-inspector-cache

Conversation

@akashmalbari

@akashmalbari akashmalbari commented Jun 3, 2026

Copy link
Copy Markdown

Closes #3478.

Summary

This updates SQLConnector to reuse a cached SQLAlchemy Inspector instance for repeated reflection calls instead of creating a new sa.inspect(self._engine) for each call.

The affected methods now share the connector-level Inspector:

  • table_exists()
  • schema_exists()
  • get_table_columns()

This preserves SQLAlchemy's Inspector.info_cache across reflection calls, allowing dialect-level reflection caching optimizations to work as intended.

Cache invalidation

Because the Inspector now persists across calls, this also adds a reflection cache clearing helper and clears the cache after connector-managed DDL operations such as:

  • creating schemas
  • creating tables
  • adding columns
  • renaming columns
  • altering column types

This avoids stale reflection after the SDK itself changes the target schema.

Tests

Added regression coverage for:

  • reusing a single Inspector across repeated SQLConnector reflection calls
  • clearing the cached Inspector reflection cache
  • avoiding stale reflection state after connector-managed DDL

Validation

  • uv run pytest tests/sql/test_connector.py -q
  • uv run pytest tests/sql -q
  • uv sync --all-groups --all-extras --all-packages
  • uv run pytest tests -q
  • uvx nox -t typing
  • uvx pre-commit run --all-files

Summary by Sourcery

Cache and reuse a SQLAlchemy Inspector within SQLConnector to enable reflection caching and invalidate it after connector-managed schema changes.

Bug Fixes:

  • Ensure SQLConnector reuses a single SQLAlchemy Inspector instance for reflection so dialect-level reflection caches are preserved across calls.
  • Clear the Inspector reflection cache after connector-managed DDL operations (schema/table creation, column add/rename/type change, table recreation) to avoid stale metadata.

Documentation:

  • Add AGENTS.md with contributor-facing guidance on development setup, commands, code style, exceptions, deprecation policy, architecture, testing, and project structure.

Tests:

  • Add regression tests covering Inspector reuse across reflection methods, reflection cache clearing behavior, and cache invalidation after DDL operations.

@sourcery-ai

sourcery-ai Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Reviewer's Guide

SQLConnector now caches and reuses a single SQLAlchemy Inspector instance across reflection methods and introduces a reflection cache clearing helper that is invoked after connector-driven DDL operations to prevent stale schema metadata; tests are added/updated to verify inspector reuse, cache clearing, and correct behavior after schema changes, plus a new AGENTS.md contributor guide is added.

Sequence diagram for SQLConnector reflection reuse and cache invalidation

sequenceDiagram
    actor Client
    participant SQLConnector
    participant reflection_Inspector as Inspector

    Client->>SQLConnector: table_exists(full_table_name)
    SQLConnector->>SQLConnector: _inspector
    SQLConnector->>Inspector: has_table(table_name, schema_name)
    Inspector-->>SQLConnector: bool
    SQLConnector-->>Client: bool

    Client->>SQLConnector: get_table_columns(full_table_name)
    SQLConnector->>SQLConnector: _inspector
    SQLConnector->>Inspector: get_columns(table_name, schema_name)
    Inspector-->>SQLConnector: columns
    SQLConnector-->>Client: columns

    Client->>SQLConnector: create_empty_table(full_table_name, columns, primary_key)
    SQLConnector->>SQLConnector: create_empty_table(full_table_name, columns, primary_key)
    SQLConnector-->>Client: None
    Client->>SQLConnector: clear_reflection_cache()
    SQLConnector->>SQLConnector: clear_reflection_cache()
    alt Inspector has clear_cache
        SQLConnector->>Inspector: clear_cache()
    else uses info_cache
        SQLConnector->>Inspector: info_cache.clear()
    end
Loading

File-Level Changes

Change Details Files
Add connector-level caching for SQLAlchemy Inspector and update reflection helpers to use it.
  • Introduce a _cached_inspector attribute initialized to None on SQLConnector.
  • Add an _inspector property that lazily creates and caches sa.inspect(self._engine).
  • Refactor table_exists, schema_exists, and get_table_columns to use the cached inspector instead of calling sa.inspect(self._engine) each time.
singer_sdk/sql/connector.py
tests/sql/test_connector.py
Add a reflection cache invalidation helper and wire it into DDL operations to avoid stale metadata.
  • Implement clear_reflection_cache to clear Inspector reflection state using clear_cache() when available or by clearing info_cache as a fallback.
  • Call clear_reflection_cache after create_schema, create_empty_table, _create_empty_column, prepare_table drop-and-recreate, rename_column, and _adapt_column_type.
  • Add tests asserting that connector-managed DDL operations clear or preserve the reflection cache as appropriate and that negative reflection is invalidated when a table is created.
singer_sdk/sql/connector.py
tests/sql/test_connector.py
Extend test coverage for inspector reuse and reflection cache behavior.
  • Add tests verifying that reflection methods reuse a single cached inspector and that sa.inspect is only called once per connector.
  • Add tests for clear_reflection_cache behavior, ensuring it does not create an inspector unnecessarily and that it clears existing inspector info_cache while preserving the inspector object.
  • Add regression tests ensuring column listing reflects schema changes after DDL operations like creating tables/columns and renaming columns.
tests/sql/test_connector.py
Add AGENTS.md with high-level development, style, and architecture guidelines for contributors and tooling.
  • Create AGENTS.md describing development setup commands, common Nox/pytest/ruff/pre-commit workflows.
  • Document code style expectations, exception hierarchy rules, deprecation policy, and high-level SDK architecture/testing structure.
  • Clarify supported Python versions and primary development version.
AGENTS.md

Assessment against linked issues

Issue Objective Addressed Explanation
#3478 Reuse a single cached SQLAlchemy Inspector instance within SQLConnector (instead of creating a new sa.inspect(self._engine) per call) for reflection-related methods such as table_exists(), schema_exists(), and get_table_columns(), so that Inspector.info_cache is preserved and dialect-level reflection caching is effective.
#3478 Provide a mechanism to clear or invalidate the persisted Inspector reflection cache when the connector performs schema-altering DDL operations, to avoid stale reflection results after schema changes.

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@read-the-docs-community

read-the-docs-community Bot commented Jun 3, 2026

Copy link
Copy Markdown

@codecov

codecov Bot commented Jun 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.20%. Comparing base (e508d72) to head (c3a3b1a).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3665      +/-   ##
==========================================
+ Coverage   94.13%   94.20%   +0.06%     
==========================================
  Files          73       73              
  Lines        6208     6227      +19     
  Branches      763      767       +4     
==========================================
+ Hits         5844     5866      +22     
+ Misses        270      269       -1     
+ Partials       94       92       -2     
Flag Coverage Δ
core 83.10% <100.00%> (+0.13%) ⬆️
end-to-end 76.07% <91.30%> (+0.04%) ⬆️
optional-components 44.80% <21.73%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@codspeed-hq

codspeed-hq Bot commented Jun 3, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing akashmalbari:fix-sqlconnector-inspector-cache (c3a3b1a) with main (5ac1a60)1

Open in CodSpeed

Footnotes

  1. No successful run was found on main (e508d72) during the generation of this report, so 5ac1a60 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@edgarrmondragon

Copy link
Copy Markdown
Collaborator

Thanks for the PR @akashmalbari!

The problem of stale reflection makes me hesitant to accept this as-is. Developers can override many of the DDL methods here where you're calling self.clear_reflection_cache(), so we could easily be feeding them a stale cache and producing subtle bugs unless we go and change their code after we release this 🙃.

I'll try to think about a different approach, but in the meantime wdyt?

@edgarrmondragon edgarrmondragon left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akashmalbari

Copy link
Copy Markdown
Author

Thanks for the review, @edgarrmondragon.

I updated the approach so cache invalidation is no longer dependent on individual SQLConnector DDL helper methods calling clear_reflection_cache().

The connector now registers an engine-level after_cursor_execute listener and clears the cached inspector whenever DDL is executed through the connector engine (CREATE, ALTER, or DROP). This means subclass overrides that execute DDL through _engine / _connect() will still invalidate reflection cache even if they do not call clear_reflection_cache() directly.

I also added a regression test covering an overridden _create_empty_column() implementation that omits clear_reflection_cache() but still gets fresh reflection after DDL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: SQLAlchemy reflection cache is defeated by creating new Inspector instances per call

2 participants