feat: support filtering SQL tap schemas#3671

Open
Dexter2099 wants to merge 1 commit into
meltano:mainfrom
Dexter2099:codex/sql-filter-schemas

Conversation

@Dexter2099

@Dexter2099 Dexter2099 commented Jun 8, 2026

Summary

  • add a filter_schemas SQL tap config option for limiting schema discovery
  • pass configured schema filters into SQL connector discovery
  • preserve compatibility with custom connectors that override discover_catalog_entries without the new keyword

Closes #1930.

Testing

  • uv run pytest tests/sql/test_connector.py -k "filter_schemas" -q
  • uv run pytest tests/sql/test_connector.py -q
  • uv run pytest tests/sql -q
  • uv run pytest tests/packages/test_tap_sqlite.py -m packages -q
  • pre-commit run --files singer_sdk/helpers/capabilities.py singer_sdk/sql/connector.py singer_sdk/sql/tap.py tests/sql/test_connector.py

Summary by Sourcery

Add configurable schema filtering to SQL taps while preserving compatibility with existing connectors' discovery signatures.

New Features:

  • Introduce a SQL tap configuration option to filter discovered schemas via the filter_schemas setting.

Enhancements:

  • Pass configured schema filters from SQLTap into SQLConnector discovery and apply them when enumerating schemas.
  • Extend SQL tap configuration schema to document and validate the new filter_schemas option.
  • Ensure SQL tap discovery gracefully handles custom connectors that use a legacy discover_catalog_entries signature without filter_schemas.

Tests:

  • Add SQL connector and SQL tap tests covering schema filtering behavior and compatibility with legacy connector discovery implementations.

@sourcery-ai

sourcery-ai Bot commented Jun 8, 2026

Contributor

Reviewer's Guide

Adds a new filter_schemas configuration option for SQL taps, wires it through to SQL connector discovery, and introduces a compatibility layer so taps using legacy discover_catalog_entries signatures continue to work.

Sequence diagram for SQLTap catalog discovery with filter_schemas support

sequenceDiagram
    participant SQLTap
    participant Helper as _filter_discovery_kwargs
    participant Connector as SQLConnector

    SQLTap->>SQLTap: catalog_dict()
    SQLTap->>SQLTap: build discovery_kwargs
    SQLTap->>Helper: _filter_discovery_kwargs(connector, discovery_kwargs)
    Helper->>Connector: inspect.signature(discover_catalog_entries)
    alt [connector accepts_var_kwargs]
        Helper-->>SQLTap: discovery_kwargs (unmodified)
    else [connector has explicit parameters]
        Helper-->>SQLTap: filtered discovery_kwargs
    end
    SQLTap->>Connector: discover_catalog_entries(**filtered_kwargs)
    Connector-->>SQLTap: list[dict] streams
    SQLTap->>SQLTap: set _catalog_dict[streams]
    SQLTap-->>SQLTap: return _catalog_dict
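The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `_filter_discovery_kwargs` implementation, which is not shown on this page. The names `filter_discovery_kwargs`, `LegacyConnector`, and `KwargsConnector` are hypothetical stand-ins; only `inspect.signature` and the `discover_catalog_entries` method name come from the discussion above.

```python
import inspect
from typing import Any, Callable


def filter_discovery_kwargs(
    discover: Callable[..., Any],
    kwargs: dict[str, Any],
) -> dict[str, Any]:
    """Drop keyword arguments that `discover` cannot accept (hypothetical helper)."""
    params = list(inspect.signature(discover).parameters.values())
    # A **kwargs parameter means every keyword is acceptable, so pass all through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return dict(kwargs)
    # Otherwise keep only the names that can be passed by keyword.
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {p.name for p in params if p.kind in keyword_kinds}
    return {k: v for k, v in kwargs.items() if k in accepted}


class LegacyConnector:
    """Stand-in for a custom connector with the older signature (assumed shape)."""

    def discover_catalog_entries(self, *, exclude_schemas=(), reflect_indices=True):
        return [{"excluded": list(exclude_schemas)}]


class KwargsConnector:
    """Stand-in for a connector that accepts arbitrary keywords."""

    def discover_catalog_entries(self, **kwargs):
        return [kwargs]


discovery_kwargs = {
    "exclude_schemas": ["information_schema"],
    "filter_schemas": ["public"],
}
legacy_kwargs = filter_discovery_kwargs(
    LegacyConnector().discover_catalog_entries, discovery_kwargs
)
modern_kwargs = filter_discovery_kwargs(
    KwargsConnector().discover_catalog_entries, discovery_kwargs
)
```

With the legacy stand-in, `filter_schemas` is dropped before the call, so the older override keeps working. With the `**kwargs` stand-in, the dictionary is passed through unchanged.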

File-Level Changes

Change: Introduce filter_schemas support in SQLConnector discovery to limit schemas returned.
Details:
  • Extend discover_catalog_entries to accept a filter_schemas sequence alongside exclude_schemas and reflect_indices.
  • Compute included_schemas from filter_schemas and skip schemas not in this set when it is non-empty.
  • Keep existing behavior when filter_schemas is empty by discovering all schemas except those explicitly excluded.
Files: singer_sdk/sql/connector.py

Change: Expose filter_schemas as a standard SQL tap configuration option and pass it into discovery with backward-compatible kwargs filtering.
Details:
  • Define SQL_TAP_FILTER_SCHEMAS capability with a filter_schemas array property and descriptive metadata.
  • Update SQLTap.append_builtin_config to merge the new SQL_TAP_FILTER_SCHEMAS config schema into taps.
  • In SQLTap.catalog_dict, build discovery_kwargs including exclude_schemas and filter_schemas from tap config.
  • Introduce _filter_discovery_kwargs to prune unsupported kwargs unless the connector accepts **kwargs, then call discover_catalog_entries with the filtered kwargs.
Files: singer_sdk/helpers/capabilities.py, singer_sdk/sql/tap.py

Change: Add tests and dummy tap/connector implementations to validate filter_schemas behavior and legacy connector compatibility.
Details:
  • Create DummySQLStream and DummySQLTap classes wired to DummySQLConnector for testing tap behavior.
  • Add LegacyDiscoverySQLConnector/Stream/Tap variants that simulate a connector with an older discover_catalog_entries signature lacking filter_schemas.
  • Add test_discover_catalog_entries_filter_schemas to verify connector-level filtering for included schemas.
  • Add tests to ensure SQLTap passes configured filter_schemas into discovery and remains compatible with legacy connector signatures.
Files: tests/sql/test_connector.py
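The connector-level filtering described above can be sketched in isolation. This is an illustrative sketch only: `select_schemas` is a hypothetical function, not part of the SDK, and the choice to let `exclude_schemas` take precedence over `filter_schemas` is an assumption, since the page does not say how the two options interact.

```python
from collections.abc import Iterable, Sequence


def select_schemas(
    schema_names: Iterable[str],
    *,
    exclude_schemas: Sequence[str] = (),
    filter_schemas: Sequence[str] = (),
) -> list[str]:
    """Return the schemas that discovery would visit (hypothetical helper)."""
    included_schemas = set(filter_schemas)
    excluded = set(exclude_schemas)
    selected = []
    for name in schema_names:
        # Assumption: an explicit exclusion always wins.
        if name in excluded:
            continue
        # An empty filter means "no restriction", matching the existing behavior.
        if included_schemas and name not in included_schemas:
            continue
        selected.append(name)
    return selected


all_schemas = ["public", "analytics", "information_schema"]
unfiltered = select_schemas(all_schemas, exclude_schemas=["information_schema"])
filtered = select_schemas(
    all_schemas,
    exclude_schemas=["information_schema"],
    filter_schemas=["analytics"],
)
```

Here `unfiltered` keeps every schema except the excluded one, while `filtered` keeps only the schema named in `filter_schemas`.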

Assessment against linked issues

Issue | Objective | Addressed | Explanation
#1930 | Add a SQL tap configuration option to filter which database schemas are included during discovery, and wire it through to the SQL connector so only the specified schemas are queried. | |
#1930 | Update SQL tap and connector behavior and tests to support schema filtering while preserving compatibility with existing/custom connectors that may use an older discovery method signature. | |

@sourcery-ai sourcery-ai Bot left a comment

Contributor

Hey - I've found 1 issue and left some high-level feedback:

  • The _filter_discovery_kwargs helper calls inspect.signature on every discovery; consider caching the accepted keyword names per connector class (e.g., via functools.lru_cache keyed by type(connector)) to avoid repeated reflection overhead during large discoveries.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `_filter_discovery_kwargs` helper calls `inspect.signature` on every discovery; consider caching the accepted keyword names per connector class (e.g., via `functools.lru_cache` keyed by `type(connector)`) to avoid repeated reflection overhead during large discoveries.

## Individual Comments

### Comment 1
<location path="singer_sdk/sql/tap.py" line_range="126-128" />
<code_context>

         connector = self.tap_connector

+        discovery_kwargs = {
+            "exclude_schemas": self.exclude_schemas,
+            "filter_schemas": self.config.get("filter_schemas", []),
+        }
         self._catalog_dict = {
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against `filter_schemas` being explicitly set to null/None in config

If the config explicitly sets `"filter_schemas": null`, `self.config.get("filter_schemas", [])` will return `None`. This is passed to `SQLConnector.discover_catalog_entries`, which does `set(filter_schemas)` and raises a `TypeError`.

To avoid this, normalize the value here, e.g.:

```python
filter_schemas = self.config.get("filter_schemas") or []
```

so `null` is treated as an empty list and discovery doesn’t fail.
</issue_to_address>
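The failure mode and the suggested fix can be reproduced without the SDK. This is a minimal sketch: `normalize_filter_schemas` is a hypothetical helper name, and the plain dictionary stands in for the tap config.

```python
from typing import Any


def normalize_filter_schemas(config: dict[str, Any]) -> list[str]:
    """Treat a missing or null filter_schemas setting as an empty list."""
    return list(config.get("filter_schemas") or [])


# What a config containing "filter_schemas": null parses to.
config = {"filter_schemas": None}

# The default only applies when the key is absent, so None comes back here.
raw_value = config.get("filter_schemas", [])

try:
    set(raw_value)
    set_failed = False
except TypeError:
    # set(None) raises TypeError because None is not iterable.
    set_failed = True

normalized = normalize_filter_schemas(config)
```

After normalization, `set(normalized)` is an empty set, which the filtering logic can treat as "no restriction".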

Comment thread singer_sdk/sql/tap.py Outdated
@codecov

codecov Bot commented Jun 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.14%. Comparing base (703866f) to head (56a7857).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3671      +/-   ##
==========================================
+ Coverage   94.12%   94.14%   +0.02%     
==========================================
  Files          73       73              
  Lines        6198     6220      +22     
  Branches      762      766       +4     
==========================================
+ Hits         5834     5856      +22     
  Misses        270      270              
  Partials       94       94              
Flag | Coverage Δ
core | 83.36% <100.00%> (+0.41%) ⬆️
end-to-end | 75.99% <69.56%> (-0.03%) ⬇️
optional-components | 44.74% <26.08%> (-0.08%) ⬇️

@read-the-docs-community

read-the-docs-community Bot commented Jun 8, 2026

Documentation build overview

📚 Meltano SDK | 🛠️ Build #33032862 | 📁 Comparing 56a7857 against latest (703866f)

  🔍 Preview build  

1 file changed
± classes/singer_sdk.sql.SQLConnector.html

@codspeed-hq

codspeed-hq Bot commented Jun 8, 2026

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing Dexter2099:codex/sql-filter-schemas (56a7857) with main (703866f)

@Dexter2099 Dexter2099 force-pushed the codex/sql-filter-schemas branch from c83a4b7 to c5468f4 Compare June 8, 2026 04:09
@Dexter2099 Dexter2099 force-pushed the codex/sql-filter-schemas branch from c5468f4 to 56a7857 Compare June 8, 2026 04:14
@Dexter2099

Author

Addressed Sourcery feedback in 56a7857: filter_schemas now normalizes null/None to an empty list, discovery signature inspection is cached per connector class, and tests cover both the null config case and connectors accepting **kwargs.
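For readers who have not seen the commit, per-class caching of signature inspection might look roughly like the sketch below. It is an assumption about the general approach, not the code in 56a7857. `accepted_discovery_kwargs`, `filter_discovery_kwargs`, and `LegacyConnector` are hypothetical names; `functools.lru_cache` and `inspect.signature` are the standard-library calls mentioned in the review.

```python
import inspect
from functools import lru_cache
from typing import Any, Optional


@lru_cache(maxsize=None)
def accepted_discovery_kwargs(connector_cls: type) -> Optional[frozenset[str]]:
    """Return accepted keyword names, or None if the method takes **kwargs."""
    signature = inspect.signature(connector_cls.discover_catalog_entries)
    params = list(signature.parameters.values())
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return None
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # The method is inspected on the class, so `self` appears and is skipped.
    return frozenset(
        p.name for p in params if p.kind in keyword_kinds and p.name != "self"
    )


def filter_discovery_kwargs(connector: Any, kwargs: dict[str, Any]) -> dict[str, Any]:
    """Prune unsupported keywords using the cached per-class result."""
    accepted = accepted_discovery_kwargs(type(connector))
    if accepted is None:
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in accepted}


class LegacyConnector:
    """Stand-in for a connector with the older signature (assumed shape)."""

    def discover_catalog_entries(self, *, exclude_schemas=(), reflect_indices=True):
        return []


connector = LegacyConnector()
kwargs = {"exclude_schemas": [], "filter_schemas": ["public"]}
first = filter_discovery_kwargs(connector, kwargs)   # inspects the signature
second = filter_discovery_kwargs(connector, kwargs)  # served from the cache
```

Keying the cache on the class rather than the instance means the signature is inspected once per connector type, no matter how many connector objects are created.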

Development

Successfully merging this pull request may close these issues.

feat: SQL Taps - Filter Schemas

1 participant