feat: support filtering SQL tap schemas#3671
Conversation
Reviewer's GuideAdds a new filter_schemas configuration option for SQL taps, wires it through to SQL connector discovery, and introduces a compatibility layer so taps using legacy discover_catalog_entries signatures continue to work. Sequence diagram for SQLTap catalog discovery with filter_schemas supportsequenceDiagram
participant SQLTap
participant Helper as _filter_discovery_kwargs
participant Connector as SQLConnector
SQLTap->>SQLTap: catalog_dict()
SQLTap->>SQLTap: build discovery_kwargs
SQLTap->>Helper: _filter_discovery_kwargs(connector, discovery_kwargs)
Helper->>Connector: inspect.signature(discover_catalog_entries)
alt [connector accepts_var_kwargs]
Helper-->>SQLTap: discovery_kwargs (unmodified)
else [connector has explicit parameters]
Helper-->>SQLTap: filtered discovery_kwargs
end
SQLTap->>Connector: discover_catalog_entries(**filtered_kwargs)
Connector-->>SQLTap: list[dict] streams
SQLTap->>SQLTap: set _catalog_dict[streams]
SQLTap-->>SQLTap: return _catalog_dict
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
_filter_discovery_kwargshelper callsinspect.signatureon every discovery; consider caching the accepted keyword names per connector class (e.g., viafunctools.lru_cachekeyed bytype(connector)) to avoid repeated reflection overhead during large discoveries.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `_filter_discovery_kwargs` helper calls `inspect.signature` on every discovery; consider caching the accepted keyword names per connector class (e.g., via `functools.lru_cache` keyed by `type(connector)`) to avoid repeated reflection overhead during large discoveries.
## Individual Comments
### Comment 1
<location path="singer_sdk/sql/tap.py" line_range="126-128" />
<code_context>
connector = self.tap_connector
+ discovery_kwargs = {
+ "exclude_schemas": self.exclude_schemas,
+ "filter_schemas": self.config.get("filter_schemas", []),
+ }
self._catalog_dict = {
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against `filter_schemas` being explicitly set to null/None in config
If the config explicitly sets `"filter_schemas": null`, `self.config.get("filter_schemas", [])` will return `None`. This is passed to `SQLConnector.discover_catalog_entries`, which does `set(filter_schemas)` and raises a `TypeError`.
To avoid this, normalize the value here, e.g.:
```python
filter_schemas = self.config.get("filter_schemas") or []
```
so `null` is treated as an empty list and discovery doesn’t fail.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3671 +/- ##
==========================================
+ Coverage 94.12% 94.14% +0.02%
==========================================
Files 73 73
Lines 6198 6220 +22
Branches 762 766 +4
==========================================
+ Hits 5834 5856 +22
Misses 270 270
Partials 94 94
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Documentation build overview
|
c83a4b7 to
c5468f4
Compare
c5468f4 to
56a7857
Compare
|
Addressed Sourcery feedback in 56a7857: |
Summary
filter_schemasSQL tap config option for limiting schema discoverydiscover_catalog_entrieswithout the new keywordCloses #1930.
Testing
uv run pytest tests/sql/test_connector.py -k "filter_schemas" -quv run pytest tests/sql/test_connector.py -quv run pytest tests/sql -quv run pytest tests/packages/test_tap_sqlite.py -m packages -qpre-commit run --files singer_sdk/helpers/capabilities.py singer_sdk/sql/connector.py singer_sdk/sql/tap.py tests/sql/test_connector.pySummary by Sourcery
Add configurable schema filtering to SQL taps while preserving compatibility with existing connectors' discovery signatures.
New Features:
Enhancements:
Tests: