fix: init SQLite schema per DB path, not via global flag (#42)#43
Conversation
The module-level _DB_INITIALIZED bool was set True after the first provider's schema creation. Switching provider opens a new DB file in a different cache dir, but the flag stayed True so the schema was never created — raising "no such table: invalid_datasets". Track initialized DB paths in a set (_INITIALIZED_DBS) so each new DB file gets its own schema. Add a cross-provider regression test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
| Filename | Overview |
|---|---|
| src/opensdmx/db_cache.py | Replaces the module-level boolean _DB_INITIALIZED with a per-path set _INITIALIZED_DBS; schema creation is now keyed on the resolved absolute DB path. Logic is correct and idempotent (CREATE TABLE IF NOT EXISTS). The not db_existed branch correctly handles externally-deleted files in long-lived processes. |
| tests/test_db_cache.py | Adds test_schema_init_per_db_path regression test covering the two-provider scenario from issue #42. Fixture updated to reset the new _INITIALIZED_DBS set. Test is clear and targeted. |
| tests/test_http.py | Two setup helpers updated to monkeypatch _INITIALIZED_DBS instead of the old _DB_INITIALIZED boolean — straightforward rename, no logic change. |
Reviews (2): Last reviewed commit: "fix: harden DB schema guard (resolve pat..." | Re-trigger Greptile
There was a problem hiding this comment.
Pull request overview
Fixes a SQLite cache initialization bug where switching providers (and therefore DB files) could skip schema creation and crash on first access (no such table: invalid_datasets). The fix tracks schema initialization per DB path and adds a regression test for the two-provider scenario.
Changes:
- Replace the module-level
_DB_INITIALIZEDboolean with_INITIALIZED_DBS: set[str]keyed by DB path. - Add regression test
test_schema_init_per_db_pathto ensure schema initializes when_get_db_path()changes. - Update existing tests to reset
_INITIALIZED_DBSinstead of the removed flag.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/opensdmx/db_cache.py |
Track schema initialization per DB path so provider switching initializes each DB’s schema. |
tests/test_db_cache.py |
Add regression test covering schema init across two distinct DB paths. |
tests/test_http.py |
Update fixtures to reset the new per-DB initialization tracking between tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address Copilot review on #43: - key _INITIALIZED_DBS on the resolved absolute path so a relative OPENSDMX_CACHE_DIR + CWD change can't alias two different DB files - re-create the schema when the DB file did not exist before connecting, covering cache.db deleted during a long-lived process Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@aborruso Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub. You can ask me to try again later by mentioning me in a new comment. If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: Sorry for the inconvenience! |
Fixes #42.
Problem
set_provider()switching (e.g.eurostat→oecd) made the firstall_available()on the new provider crash withsqlite3.OperationalError: no such table: invalid_datasets.Root cause:
_DB_INITIALIZEDwas a module-levelbool. The firstprovider created its schema and set the flag to
True; the next provideropens a different DB file (separate cache dir), but the flag stayed
True, so the schema was never created on the new file.This is a library bug (reproduced via
import opensdmx), not a CLI one.Fix
Replace the global bool with
_INITIALIZED_DBS: set[str]keyed on the DBpath, so every new DB file initializes its own schema. (Option 1 from the
issue.)
Verification
test_schema_init_per_db_pathreproducing thetwo-provider scenario — proven to fail with the exact issue error on the
old code, passes with the fix.
ruffclean.eurostatthenoecdboth return cleanly, noOperationalError.🤖 Generated with Claude Code