feat: allow disabling SQL view discovery#3670
Conversation
Reviewer's GuideAdds configurable flags to control whether SQL catalog discovery includes views and materialized views, wires those flags through the SQL tap configuration and connector discovery logic, and adds regression tests to validate behavior and configuration plumbing. Sequence diagram for SQL tap discovery with view flagssequenceDiagram
actor User
participant SQLTap
participant SQLConnector
participant SAInspector
User->>SQLTap: catalog_dict()
SQLTap->>SQLTap: config.get(discover_views)
SQLTap->>SQLTap: config.get(discover_materialized_views)
SQLTap->>SQLConnector: discover_catalog_entries(exclude_schemas, discover_views, discover_materialized_views)
SQLConnector->>SAInspector: sa.inspect(engine)
SQLConnector->>SQLConnector: object_kinds = [(ObjectKind.TABLE, False)]
opt [discover_views enabled]
SQLConnector->>SQLConnector: object_kinds.append((ObjectKind.VIEW, True))
end
opt [discover_materialized_views enabled]
SQLConnector->>SQLConnector: object_kinds.append((ObjectKind.MATERIALIZED_VIEW, True))
end
loop schemas from get_schema_names()
SQLConnector->>SQLConnector: skip if schema in exclude_schemas
SQLConnector->>SQLConnector: reflect object_kinds into catalog entries
end
SQLConnector-->>SQLTap: list[dict] streams
SQLTap-->>User: catalog_dict
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Documentation build overview
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- Passing
discover_viewsanddiscover_materialized_viewsas new keyword-only arguments fromSQLTap.catalog_dict()intoSQLConnector.discover_catalog_entries()may break custom connectors that overridediscover_catalog_entrieswithout accepting these parameters; consider either keeping the call signature backward-compatible (e.g., passing onlyexclude_schemasfor now or using a helper/**kwargspattern) or adding a compatibility shim in the base connector. - The default values for
discover_viewsanddiscover_materialized_viewsare now duplicated betweenSQL_TAP_DISCOVERY_CONFIG(JSON schema defaults) and theconfig.get(..., True)calls inSQLTap.catalog_dict; consider relying on a single source of truth (e.g., the schema-driven config) to avoid drift in future changes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Passing `discover_views` and `discover_materialized_views` as new keyword-only arguments from `SQLTap.catalog_dict()` into `SQLConnector.discover_catalog_entries()` may break custom connectors that override `discover_catalog_entries` without accepting these parameters; consider either keeping the call signature backward-compatible (e.g., passing only `exclude_schemas` for now or using a helper/`**kwargs` pattern) or adding a compatibility shim in the base connector.
- The default values for `discover_views` and `discover_materialized_views` are now duplicated between `SQL_TAP_DISCOVERY_CONFIG` (JSON schema defaults) and the `config.get(..., True)` calls in `SQLTap.catalog_dict`; consider relying on a single source of truth (e.g., the schema-driven config) to avoid drift in future changes.
## Individual Comments
### Comment 1
<location path="tests/sql/test_connector.py" line_range="68-75" />
<code_context>
)
+class SpySQLConnector(SQLConnector):
+ """SQL connector that captures discovery options."""
+
+ discovery_kwargs: t.ClassVar[dict[str, t.Any]] = {}
+
+ def discover_catalog_entries(self, **kwargs: t.Any) -> list[dict]:
+ self.__class__.discovery_kwargs = kwargs
+ return []
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Reset `SpySQLConnector.discovery_kwargs` between tests to avoid cross-test coupling
Since `discovery_kwargs` is a class variable, its state will persist across test runs and can be shared between tests that use `SpySQLConnector`. To keep tests isolated and order-independent, reset `SpySQLConnector.discovery_kwargs` (for example, to `{}`) in the relevant test or via a fixture before instantiating `SpySQLTap`, so the assertion on `discovery_kwargs` remains reliable as more tests are added.
Suggested implementation:
```python
import typing as t
import pytest
from singer_sdk.exceptions import ConfigValidationError
```
```python
class SpySQLConnector(SQLConnector):
"""SQL connector that captures discovery options."""
discovery_kwargs: t.ClassVar[dict[str, t.Any]] = {}
def discover_catalog_entries(self, **kwargs: t.Any) -> list[dict]:
"""Capture discovery kwargs and return an empty catalog."""
self.__class__.discovery_kwargs = kwargs
return []
@pytest.fixture(autouse=True)
def reset_spy_sql_connector_discovery_kwargs() -> None:
"""Reset SpySQLConnector discovery state between tests in this module."""
SpySQLConnector.discovery_kwargs = {}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| class SpySQLConnector(SQLConnector): | ||
| """SQL connector that captures discovery options.""" | ||
|
|
||
| discovery_kwargs: t.ClassVar[dict[str, t.Any]] = {} | ||
|
|
||
| def discover_catalog_entries(self, **kwargs: t.Any) -> list[dict]: | ||
| self.__class__.discovery_kwargs = kwargs | ||
| return [] |
There was a problem hiding this comment.
suggestion (testing): Reset SpySQLConnector.discovery_kwargs between tests to avoid cross-test coupling
Since discovery_kwargs is a class variable, its state will persist across test runs and can be shared between tests that use SpySQLConnector. To keep tests isolated and order-independent, reset SpySQLConnector.discovery_kwargs (for example, to {}) in the relevant test or via a fixture before instantiating SpySQLTap, so the assertion on discovery_kwargs remains reliable as more tests are added.
Suggested implementation:
import typing as t
import pytest
from singer_sdk.exceptions import ConfigValidationErrorclass SpySQLConnector(SQLConnector):
"""SQL connector that captures discovery options."""
discovery_kwargs: t.ClassVar[dict[str, t.Any]] = {}
def discover_catalog_entries(self, **kwargs: t.Any) -> list[dict]:
"""Capture discovery kwargs and return an empty catalog."""
self.__class__.discovery_kwargs = kwargs
return []
@pytest.fixture(autouse=True)
def reset_spy_sql_connector_discovery_kwargs() -> None:
"""Reset SpySQLConnector discovery state between tests in this module."""
SpySQLConnector.discovery_kwargs = {}272d8bf to
de52726
Compare
de52726 to
6dafa1e
Compare
|
Note on the currently failing coverage jobs: all test, typing, CodeQL, Sourcery, and benchmark checks are passing on the latest head. The three coverage jobs are failing after I force-pushed once to rerun the checks and the same Codecov CLI signature failure repeated, so this appears external to the PR code. |
Fixes #2806.
Summary
discover_viewsanddiscover_materialized_views.SQLTapinto catalog discovery.SQLConnector.discover_catalog_entries()reflect tables, views, and materialized views according to those flags while preserving the existing default behavior.Validation
uv run pytest tests/sql/test_connector.py::TestDummySQLConnector::test_discover_catalog_entries_can_exclude_views tests/sql/test_connector.py::TestDummySQLConnector::test_discover_catalog_entries_can_filter_materialized_views tests/sql/test_connector.py::test_sql_tap_passes_discovery_settings_from_config -quv run pytest tests/sql/test_connector.py -quv run pytest tests/sql -qpre-commit run --files singer_sdk/helpers/capabilities.py singer_sdk/sql/connector.py singer_sdk/sql/tap.py tests/sql/test_connector.pyNOX_DEFAULT_VENV_BACKEND=uv nox -rs tests-3.11 -- tests/sqluv run pytest tests/packages/test_tap_sqlite.py -m packages -qSummary by Sourcery
Allow SQL taps and connectors to control whether views and materialized views are included during catalog discovery while preserving existing default behavior.
New Features:
Tests: