Merged
4 changes: 2 additions & 2 deletions .beads/issues.jsonl
@@ -8,7 +8,7 @@
{"id":"vgi-python-34l","title":"Create catalog integration tests","description":"Create comprehensive test suite for catalog interface.\n\nFiles to create:\n- tests/catalog/__init__.py\n- tests/catalog/test_types.py\n- tests/catalog/test_serialization.py\n- tests/catalog/test_catalog_interface.py\n- tests/catalog/test_catalog_client.py\n- tests/catalog/test_integration.py\n\ntest_types.py:\n- Test all dataclass instantiation\n- Test frozen immutability\n- Test type alias usage\n\ntest_serialization.py:\n- Round-trip tests for all types\n- Edge cases: empty strings, empty lists, None values\n- Invalid schema rejection\n\ntest_catalog_interface.py:\n- Test abstract method enforcement\n- Test default implementations\n- Test NotImplementedError for optional methods\n- Test ReadOnlyCatalogInterface\n\ntest_catalog_client.py:\n- Mock worker tests for each client method\n- Error handling tests\n- Streaming response tests\n\ntest_integration.py:\n- End-to-end client ↔ worker tests using InMemoryCatalog\n- Catalog lifecycle: attach, query, detach\n- Schema operations\n- Table operations\n- Error propagation\n\nProtocol conformance tests:\n- Invalid schemas (wrong column types)\n- Missing required columns\n- Multi-row input batches (should fail)\n- Extra columns (should be ignored)","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-05T19:18:24.163718-05:00","created_by":"rusty","updated_at":"2026-01-05T19:21:50.050722-05:00","closed_at":"2026-01-05T19:21:50.050722-05:00","close_reason":"User requested closure","dependencies":[{"issue_id":"vgi-python-34l","depends_on_id":"vgi-python-e46","type":"blocks","created_at":"2026-01-05T19:18:44.818195-05:00","created_by":"rusty"},{"issue_id":"vgi-python-34l","depends_on_id":"vgi-python-vxy","type":"blocks","created_at":"2026-01-05T19:18:48.578947-05:00","created_by":"rusty"},{"issue_id":"vgi-python-34l","depends_on_id":"vgi-python-nju","type":"blocks","created_at":"2026-01-05T19:18:48.686098-05:00","created_by":"rusty"}]}
{"id":"vgi-python-35i","title":"Test SchemaValidationError detailed message paths","notes":"Coverage: 67% in vgi/exceptions.py. Missing tests for:\n- Lines 116-123: Type mismatch detection in _build_detailed_message\n- Lines 128-131: Field order difference detection \n- Lines 149-151: Type mismatch reporting\n- Lines 155-157: Field order difference reporting\n\nTest scenarios needed:\n1. Schema with same fields but different types\n2. Schema with nullable vs non-nullable mismatch\n3. Schema with same fields in different order","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T22:15:25.858704-05:00","created_by":"rusty","updated_at":"2026-01-04T22:25:58.697852-05:00","closed_at":"2026-01-04T22:25:58.697852-05:00","close_reason":"Added comprehensive tests for SchemaValidationError. Coverage improved from 67% to 99%."}
{"id":"vgi-python-36f","title":"Split metadata.py Arrow serialization into separate module","description":"metadata.py is 932 lines with two distinct concerns: 1) metadata resolution (enums, dataclasses, parameter extraction, resolve_metadata) and 2) Arrow serialization (schema definitions, to_arrow/from_arrow functions). Split Arrow serialization into metadata_serialization.py or metadata_arrow.py for better separation of concerns.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:53.481364-05:00","created_by":"rusty","updated_at":"2026-01-04T22:01:52.100108-05:00","closed_at":"2026-01-04T22:01:52.100108-05:00","close_reason":"Not warranted - metadata.py is well-organized with clear section headers. Arrow serialization (~165 lines) is tightly coupled to data classes (uses their to_dict/from_dict methods). Splitting would add import complexity without significant benefit."}
{"id":"vgi-python-3bn","title":"Fix schema_contents() default implementation","description":"Implement the default schema_contents() method that has a FIXME comment.\n\nFile: vgi/catalog/catalog_interface.py\n\nCurrent state: \n```python\ndef schema_contents(...) -\u003e Iterable[TableInfo | ViewInfo | FunctionInfo]:\n # FIXME: write this implementation for the worker.\n```\n\nThe default implementation should:\n1. Access the Worker's registered functions\n2. Convert each function's metadata to FunctionInfo\n3. Return them as the contents of the 'main' schema\n\nThis requires:\n- Access to Worker.functions registry\n- Converting function metadata to FunctionInfo format\n- May need to pass worker reference to CatalogInterface\n\nAlternative: Just raise NotImplementedError and require subclasses to implement.","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-05T19:27:27.482694-05:00","created_by":"rusty","updated_at":"2026-01-05T19:27:27.482694-05:00"}
{"id":"vgi-python-3bn","title":"Fix schema_contents() default implementation","description":"Implement the default schema_contents() method that has a FIXME comment.\n\nFile: vgi/catalog/catalog_interface.py\n\nCurrent state: \n```python\ndef schema_contents(...) -\u003e Iterable[TableInfo | ViewInfo | FunctionInfo]:\n # FIXME: write this implementation for the worker.\n```\n\nThe default implementation should:\n1. Access the Worker's registered functions\n2. Convert each function's metadata to FunctionInfo\n3. Return them as the contents of the 'main' schema\n\nThis requires:\n- Access to Worker.functions registry\n- Converting function metadata to FunctionInfo format\n- May need to pass worker reference to CatalogInterface\n\nAlternative: Just raise NotImplementedError and require subclasses to implement.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T19:27:27.482694-05:00","created_by":"rusty","updated_at":"2026-01-05T19:59:02.922654-05:00","closed_at":"2026-01-05T19:59:02.922654-05:00","close_reason":"Already fixed in PR #24 - now raises NotImplementedError"}
{"id":"vgi-python-3fq","title":"Abstract common worker batch processing logic","description":"Worker batch processing methods _process_scalar_batches (377-466), _process_batches (468-550), and _generate_batches (552-593) share significant structure: IPC writer/reader setup, batch counting/logging, main processing loop. Extract common logic to reduce duplication - consider a BatchProcessor helper class or template method pattern.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T20:06:53.350497-05:00","created_by":"rusty","updated_at":"2026-01-04T21:20:24.509785-05:00","closed_at":"2026-01-04T21:20:24.509785-05:00","close_reason":"Analysis complete: abstraction not warranted. The three methods have sufficiently different logic (input handling, log message loops, protocol types) that abstracting them would add complexity without meaningful benefit. Current code is already readable at ~70-90 lines each."}
{"id":"vgi-python-4mg","title":"Add InvocationType.CATALOG to vgi/invocation.py","description":"Extend InvocationType enum to support catalog invocations.\n\nFiles to modify:\n- vgi/invocation.py\n\nChanges:\n1. Add CATALOG = 'catalog' to InvocationType enum\n2. Update docstrings to document the new type\n3. Ensure serialization/deserialization handles the new value\n\nThe CATALOG invocation type indicates the function_name field contains a CatalogInterface method name (e.g., 'catalog_attach', 'schemas', 'table_get').\n\nTest that CATALOG type serializes and deserializes correctly.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-05T19:17:45.649509-05:00","created_by":"rusty","updated_at":"2026-01-05T19:21:50.062449-05:00","closed_at":"2026-01-05T19:21:50.062449-05:00","close_reason":"User requested closure"}
{"id":"vgi-python-5er","title":"Extract _should_terminate into shared base class","description":"Identical _should_terminate method is copy-pasted in all three function modules. Implementation is always: check if log_message exists and level is EXCEPTION. Move to shared base class (Function or new ProcessingMixin) to eliminate duplication.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:41.190482-05:00","created_by":"rusty","updated_at":"2026-01-04T21:49:59.765614-05:00","closed_at":"2026-01-04T21:49:59.765614-05:00","close_reason":"Completed as part of PR #8 - _should_terminate moved to Function base class","dependencies":[{"issue_id":"vgi-python-5er","depends_on_id":"vgi-python-6o0","type":"blocks","created_at":"2026-01-04T20:07:49.283865-05:00","created_by":"rusty"}]}
@@ -44,7 +44,7 @@
{"id":"vgi-python-e46","title":"Create vgi/catalog/client.py - CatalogClient class","description":"Create CatalogClient for client-side catalog operations.\n\nFiles to create:\n- vgi/catalog/client.py\n\nCatalogClient class:\n- __init__(worker_command: str)\n- Context manager support (__enter__, __exit__)\n- start() / stop() methods\n\nCore methods (mirroring CatalogInterface):\n- catalogs() -\u003e list[str]\n- attach(name, options) -\u003e CatalogAttachResult\n- detach(attach_id) -\u003e None\n- schemas(attach_id, transaction_id) -\u003e Iterator[SchemaInfo]\n- schema_get(attach_id, transaction_id, name) -\u003e SchemaInfo | None\n- schema_contents(attach_id, transaction_id, name) -\u003e Iterator[TableInfo | ViewInfo | FunctionInfo]\n- table_get(...) -\u003e TableInfo | None\n- view_get(...) -\u003e ViewInfo | None\n- function_get(...) -\u003e FunctionInfo | None\n- table_scan_function_get(...) -\u003e ScanFunctionResult\n\nDDL methods (optional, may raise NotImplementedError from worker):\n- catalog_create, catalog_drop\n- schema_create, schema_drop\n- table_create, table_drop, table_rename, etc.\n- view_create, view_drop, view_rename, etc.\n\nTransaction methods:\n- transaction_begin(attach_id) -\u003e TransactionId | None\n- transaction_commit(attach_id, transaction_id)\n- transaction_rollback(attach_id, transaction_id)\n\nInternal methods:\n- _invoke(method_name, **kwargs) -\u003e pa.RecordBatch | Iterator[pa.RecordBatch]\n- _create_invocation(method_name, kwargs) -\u003e Invocation\n- _deserialize_result(batch, return_type) -\u003e Any\n\nHandle:\n- Streaming responses for Iterable returns\n- Exception propagation from worker\n- None returns (0-row/0-column batches)","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-05T19:18:04.65125-05:00","created_by":"rusty","updated_at":"2026-01-05T19:21:50.055794-05:00","closed_at":"2026-01-05T19:21:50.055794-05:00","close_reason":"User requested closure","dependencies":[{"issue_id":"vgi-python-e46","depends_on_id":"vgi-python-tw7","type":"blocks","created_at":"2026-01-05T19:18:44.316642-05:00","created_by":"rusty"},{"issue_id":"vgi-python-e46","depends_on_id":"vgi-python-fd2","type":"blocks","created_at":"2026-01-05T19:18:44.440065-05:00","created_by":"rusty"},{"issue_id":"vgi-python-e46","depends_on_id":"vgi-python-4mg","type":"blocks","created_at":"2026-01-05T19:18:44.559963-05:00","created_by":"rusty"}]}
{"id":"vgi-python-e6o","title":"Implement CatalogClient class","description":"Create CatalogClient for client-side catalog operations.\n\nFile: vgi/client/catalog_client.py\n\nCatalogClient class:\n- __init__(worker_command: str)\n- Each method call spawns new worker (matches VGI short-lived pattern)\n\nCore methods mirroring CatalogInterface:\n- catalogs() -\u003e list[str]\n- catalog_attach(name, options) -\u003e CatalogAttachResult\n- catalog_detach(attach_id) -\u003e None\n- schemas(attach_id, transaction_id) -\u003e Iterator[SchemaInfo]\n- schema_get(...) -\u003e SchemaInfo | None\n- schema_contents(...) -\u003e Iterator[TableInfo | ViewInfo | FunctionInfo]\n- table_get(...) -\u003e TableInfo | None\n- view_get(...) -\u003e ViewInfo | None\n- table_scan_function_get(...) -\u003e ScanFunctionResult\n\nDDL methods (may raise NotImplementedError from worker):\n- catalog_create, catalog_drop\n- schema_create, schema_drop\n- table_* methods, view_* methods\n\nTransaction methods:\n- catalog_transaction_begin/commit/rollback\n\nInternal:\n- _invoke(method_name, **kwargs) -\u003e pa.RecordBatch | Iterator[pa.RecordBatch]\n- _create_invocation(method_name, kwargs) -\u003e Invocation (with InvocationType.CATALOG)\n- Uses existing IPC utilities for communication","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-05T19:26:57.975309-05:00","created_by":"rusty","updated_at":"2026-01-05T19:48:33.366915-05:00","closed_at":"2026-01-05T19:48:33.366915-05:00","close_reason":"PR #27 created with CatalogClient implementation","dependencies":[{"issue_id":"vgi-python-e6o","depends_on_id":"vgi-python-085","type":"blocks","created_at":"2026-01-05T19:27:50.730122-05:00","created_by":"rusty"},{"issue_id":"vgi-python-e6o","depends_on_id":"vgi-python-po3","type":"blocks","created_at":"2026-01-05T19:27:50.762036-05:00","created_by":"rusty"}]}
{"id":"vgi-python-e9q","title":"Unify ProtocolOutput classes with shared base","description":"ProtocolOutput classes in table_function.py:177-224 and table_in_out_function.py:144-207 share similar metadata() method and from_process_result() classmethod. The table_in_out version adds status field. Create shared base with table_in_out extending it for status support.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:41.45014-05:00","created_by":"rusty","updated_at":"2026-01-04T21:54:55.871986-05:00","closed_at":"2026-01-04T21:54:55.871986-05:00","close_reason":"Not warranted - dataclass inheritance with slots=True doesn't allow adding required field (status) between inherited fields. The classes have different semantics (table_in_out requires status for generator state tracking) making inheritance impractical."}
{"id":"vgi-python-eg7","title":"Create InMemoryCatalog example implementation","description":"Create an in-memory catalog implementation for testing and as an example.\n\nFile: vgi/examples/catalog.py\n\nInMemoryCatalog(CatalogInterface):\n- In-memory storage using dicts\n- Implements all required abstract methods\n- Implements common optional methods (schema_create, table_create, etc.)\n- Generates attach_id as random UUID bytes\n- Does NOT support transactions (returns None)\n\nData structures:\n- _catalogs: dict[str, CatalogData]\n- _attachments: dict[AttachId, str] # attach_id -\u003e catalog_name\n\nCreate example worker:\n```python\nclass InMemoryCatalogWorker(Worker):\n catalog_interface = InMemoryCatalog\n```\n\nAdd entry point: vgi-example-catalog-worker","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-05T19:27:27.604912-05:00","created_by":"rusty","updated_at":"2026-01-05T19:27:27.604912-05:00","dependencies":[{"issue_id":"vgi-python-eg7","depends_on_id":"vgi-python-085","type":"blocks","created_at":"2026-01-05T19:27:50.87322-05:00","created_by":"rusty"}]}
{"id":"vgi-python-eg7","title":"Create InMemoryCatalog example implementation","description":"Create an in-memory catalog implementation for testing and as an example.\n\nFile: vgi/examples/catalog.py\n\nInMemoryCatalog(CatalogInterface):\n- In-memory storage using dicts\n- Implements all required abstract methods\n- Implements common optional methods (schema_create, table_create, etc.)\n- Generates attach_id as random UUID bytes\n- Does NOT support transactions (returns None)\n\nData structures:\n- _catalogs: dict[str, CatalogData]\n- _attachments: dict[AttachId, str] # attach_id -\u003e catalog_name\n\nCreate example worker:\n```python\nclass InMemoryCatalogWorker(Worker):\n catalog_interface = InMemoryCatalog\n```\n\nAdd entry point: vgi-example-catalog-worker","status":"in_progress","priority":2,"issue_type":"task","created_at":"2026-01-05T19:27:27.604912-05:00","created_by":"rusty","updated_at":"2026-01-05T19:59:13.842949-05:00","dependencies":[{"issue_id":"vgi-python-eg7","depends_on_id":"vgi-python-085","type":"blocks","created_at":"2026-01-05T19:27:50.87322-05:00","created_by":"rusty"}]}
{"id":"vgi-python-f5z","title":"Create vgi/catalog/storage.py - Catalog persistence","description":"Create storage layer for catalog attach_id and transaction_id persistence.\n\nFiles to create:\n- vgi/catalog/storage.py\n\nCatalogStorage protocol (similar to FunctionStorage):\n- attach_put(attach_id, catalog_name, options) -\u003e None\n- attach_get(attach_id) -\u003e tuple[str, dict] | None\n- attach_delete(attach_id) -\u003e None\n- attach_list() -\u003e list[AttachId]\n\n- transaction_put(transaction_id, attach_id, state) -\u003e None\n- transaction_get(transaction_id) -\u003e tuple[AttachId, bytes] | None\n- transaction_delete(transaction_id) -\u003e None\n\nCatalogStorageSqlite implementation:\n- Default location: ~/.state/vgi/vgi_catalog.db\n- WAL mode for concurrent access\n- Schema:\n CREATE TABLE catalog_attachments (\n attach_id BLOB PRIMARY KEY,\n catalog_name TEXT NOT NULL,\n options TEXT, -- JSON\n created_at REAL DEFAULT (julianday('now'))\n )\n CREATE TABLE catalog_transactions (\n transaction_id BLOB PRIMARY KEY,\n attach_id BLOB NOT NULL,\n state BLOB,\n created_at REAL DEFAULT (julianday('now'))\n )\n\nInclude cleanup strategies for stale attachments/transactions.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T19:18:04.531387-05:00","created_by":"rusty","updated_at":"2026-01-05T19:21:50.073983-05:00","closed_at":"2026-01-05T19:21:50.073983-05:00","close_reason":"User requested closure","dependencies":[{"issue_id":"vgi-python-f5z","depends_on_id":"vgi-python-tw7","type":"blocks","created_at":"2026-01-05T19:18:44.194468-05:00","created_by":"rusty"}]}
{"id":"vgi-python-fd2","title":"Create vgi/catalog/serialization.py - Arrow serialization","description":"Create Arrow IPC serialization for all catalog types.\n\nFiles to create:\n- vgi/catalog/serialization.py\n\nArrow schemas for:\n- CatalogAttachResult: attach_id, supports_transactions, supports_time_travel, catalog_version_frozen, catalog_version\n- SchemaInfo: attach_id, name, is_default, comment, tags\n- TableInfo: name, schema_name, columns, primary_key_columns, not_null_constraints, unique_constraints, check_constraints, comment, tags\n- ViewInfo: name, schema_name, definition, comment, tags\n- FunctionInfo: name, schema_name, function_type, arguments, output_schema, comment, tags\n- ScanFunctionResult: function_name, max_processes, invocation_id\n\nFunctions:\n- serialize_\u003ctype\u003e() -\u003e bytes for each type\n- deserialize_\u003ctype\u003e(batch) -\u003e Type for each type\n- Arrow schema constants for each type\n\nSerialization convention:\n- Single-row batches for scalar returns\n- Multi-row batches for streaming (Iterable returns)\n- None = 0-row/0-column batch\n- Empty list = 0-row batch with schema\n\nInclude round-trip serialization tests for all types.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-05T19:17:15.404739-05:00","created_by":"rusty","updated_at":"2026-01-05T19:21:50.068663-05:00","closed_at":"2026-01-05T19:21:50.068663-05:00","close_reason":"User requested closure","dependencies":[{"issue_id":"vgi-python-fd2","depends_on_id":"vgi-python-tw7","type":"blocks","created_at":"2026-01-05T19:18:36.318762-05:00","created_by":"rusty"}]}
{"id":"vgi-python-g1m","title":"Use sentinel type pattern instead of Any for _MISSING in arguments.py","notes":"Line 33: _MISSING: Any = object()\n\nReplace with proper sentinel type pattern:\n```python\nfrom typing import Final\n\nclass _Missing:\n __slots__ = ()\n def __repr__(self) -\u003e str:\n return '\u003cMISSING\u003e'\n\nMISSING: Final = _Missing()\n```\n\nThis removes 1 Any and provides better type safety for default value checking.\nPart of 26.89% imprecision in arguments.py (59 Anys total).","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-04T22:19:50.079174-05:00","created_by":"rusty","updated_at":"2026-01-04T22:35:56.508153-05:00","closed_at":"2026-01-04T22:35:56.508153-05:00","close_reason":"Replaced _MISSING: Any = object() with proper _MissingType sentinel class. Improves type safety and removes 1 Any."}
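Reviewer note on `vgi-python-eg7` (moved to `in_progress` in this hunk): the issue describes dict-backed storage, `attach_id` generated as random UUID bytes, and no transaction support. A minimal standalone sketch of that shape follows. It does not use the real `CatalogInterface` base class; the class name `InMemoryCatalogSketch`, the `CatalogData` fields, the method signatures, and the pre-created `main` schema are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class CatalogData:
    """Hypothetical per-catalog storage: schema name -> set of table names."""

    schemas: dict[str, set[str]] = field(default_factory=lambda: {"main": set()})


class InMemoryCatalogSketch:
    """Illustrative only; not the vgi CatalogInterface API."""

    def __init__(self) -> None:
        self._catalogs: dict[str, CatalogData] = {}
        self._attachments: dict[bytes, str] = {}  # attach_id -> catalog_name

    def catalog_create(self, name: str) -> None:
        # Create the catalog if it does not exist yet.
        self._catalogs.setdefault(name, CatalogData())

    def catalog_attach(self, name: str) -> bytes:
        if name not in self._catalogs:
            raise KeyError(f"unknown catalog: {name}")
        attach_id = uuid.uuid4().bytes  # 16 random bytes, as the issue suggests
        self._attachments[attach_id] = name
        return attach_id

    def catalog_detach(self, attach_id: bytes) -> None:
        # Detaching an unknown id is treated as a no-op in this sketch.
        self._attachments.pop(attach_id, None)

    def transaction_begin(self, attach_id: bytes) -> None:
        # Transactions are not supported, so there is no transaction id.
        return None

    def schemas(self, attach_id: bytes) -> list[str]:
        catalog = self._catalogs[self._attachments[attach_id]]
        return sorted(catalog.schemas)
```

Keeping `_attachments` separate from `_catalogs` lets several attachments point at the same catalog, which matches the two data structures listed in the issue.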
1 change: 1 addition & 0 deletions pyproject.toml
@@ -12,6 +12,7 @@ dev = ["mypy", "pyarrow-stubs", "pytest", "pytest-cov", "pytest-mypy", "pytest-r
[project.scripts]
vgi-client = "vgi.client.cli:main"
vgi-example-worker = "vgi.examples.worker:main"
vgi-example-catalog-worker = "vgi.examples.catalog:main"

[build-system]
requires = ["hatchling"]
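
Reviewer note on the new `[project.scripts]` entry: each entry maps a command name to a `module:attribute` target, so `vgi-example-catalog-worker` will call `main` in `vgi.examples.catalog` once the package is installed. The sketch below shows how such a target string resolves, using the standard library's `importlib.metadata.EntryPoint`. Since `vgi` may not be installed where this runs, `json:dumps` is used as a stand-in target and `demo-command` is a made-up name; neither appears in this PR.

```python
from importlib.metadata import EntryPoint

# Stand-in target: "json:dumps" is always importable. The real entry in this
# PR would be value="vgi.examples.catalog:main".
ep = EntryPoint(name="demo-command", value="json:dumps", group="console_scripts")

# load() imports the module before the colon and returns the attribute after it.
func = ep.load()
result = func({"ok": True})
```

If `vgi/examples/catalog.py` does not define a module-level `main` callable, the generated command will fail at startup with an import or attribute error, so it is worth confirming that the function lands together with this entry.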