Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .beads/issues.jsonl
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
{"id":"vgi-python-085","title":"Add serialize/deserialize methods to catalog dataclasses","description":"Add Arrow IPC serialization directly to the dataclasses in vgi/catalog/catalog_interface.py.\n\nAdd to each dataclass:\n- serialize() -\u003e bytes method\n- @classmethod deserialize(batch: pa.RecordBatch) -\u003e Self method\n- Arrow schema class variable for each type\n\nDataclasses to update:\n- CatalogAttachResult\n- SchemaInfo \n- TableInfo\n- ViewInfo\n- FunctionInfo\n- ScanFunctionResult\n\nSerialization rules from plan:\n- Single-row batches for scalar returns\n- Multi-row batches for streaming (Iterable returns)\n- None = 0-row/0-column batch\n- Column names match field names exactly\n- SerializedSchema fields use pa.binary()\n- tags fields use pa.map_(pa.string(), pa.string())\n\nAlso create vgi/catalog/__init__.py with package exports.","status":"open","priority":1,"issue_type":"task","created_at":"2026-01-05T19:26:40.362177-05:00","created_by":"rusty","updated_at":"2026-01-05T19:26:40.362177-05:00"}
{"id":"vgi-python-085","title":"Add serialize/deserialize methods to catalog dataclasses","description":"Add Arrow IPC serialization directly to the dataclasses in vgi/catalog/catalog_interface.py.\n\nAdd to each dataclass:\n- serialize() -\u003e bytes method\n- @classmethod deserialize(batch: pa.RecordBatch) -\u003e Self method\n- Arrow schema class variable for each type\n\nDataclasses to update:\n- CatalogAttachResult\n- SchemaInfo \n- TableInfo\n- ViewInfo\n- FunctionInfo\n- ScanFunctionResult\n\nSerialization rules from plan:\n- Single-row batches for scalar returns\n- Multi-row batches for streaming (Iterable returns)\n- None = 0-row/0-column batch\n- Column names match field names exactly\n- SerializedSchema fields use pa.binary()\n- tags fields use pa.map_(pa.string(), pa.string())\n\nAlso create vgi/catalog/__init__.py with package exports.","status":"in_progress","priority":1,"issue_type":"task","created_at":"2026-01-05T19:26:40.362177-05:00","created_by":"rusty","updated_at":"2026-01-05T19:29:02.815022-05:00"}
{"id":"vgi-python-0fe","title":"Add is_varargs to ParameterInfo and metadata extraction","description":"In vgi/metadata.py:\n- Add is_varargs: bool = False to ParameterInfo\n- Update to_dict() and from_dict()\n- Add is_varargs field to _PARAMETER_STRUCT for Arrow serialization\n- Extract varargs flag in extract_parameters()\n- Add _validate_varargs() with rules:\n - Only one varargs parameter allowed\n - Must be positional (not named)\n - Must be last positional (before TableInput if present)\n - Cannot have default value","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T10:49:20.141375-05:00","created_by":"rusty","updated_at":"2026-01-05T10:58:21.242603-05:00","closed_at":"2026-01-05T10:58:21.242603-05:00","close_reason":"Added is_varargs to ParameterInfo, _PARAMETER_STRUCT, extract_parameters(), and _validate_varargs()","dependencies":[{"issue_id":"vgi-python-0fe","depends_on_id":"vgi-python-jrf","type":"blocks","created_at":"2026-01-05T10:49:26.421664-05:00","created_by":"rusty"}]}
{"id":"vgi-python-0hr","title":"Remove redundant InitInputType class attribute","description":"InitInputType class attribute duplicates the generic type parameter: 'class ScalarFunctionGenerator(Function[FunctionInitInput])' already specifies the type, but 'InitInputType = FunctionInitInput' repeats it. Investigate using get_type_hints or __orig_bases__ to infer the type and remove the redundant attribute.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:53.780529-05:00","created_by":"rusty","updated_at":"2026-01-04T22:00:40.221423-05:00","closed_at":"2026-01-04T22:00:40.221423-05:00","close_reason":"PR #10 created - uses _get_init_input_type() to infer type from generic parameter"}
{"id":"vgi-python-1s5","title":"Move distributed state management to optional mixin","description":"The Function base class in function.py includes ~200 lines for distributed state management (store_state, collect_states, enqueue_work, dequeue_work, work queue storage). Not all functions need this. Extract to DistributedStateMixin that functions can opt into, keeping Function base class simpler for basic use cases.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T20:06:53.606614-05:00","created_by":"rusty","updated_at":"2026-01-04T21:22:09.772825-05:00","closed_at":"2026-01-04T21:22:09.772825-05:00","close_reason":"Analysis complete: extraction not recommended. The distributed state methods are tightly coupled with execution_identifier and storage, which are used by core initialization methods. Extraction would require moving initialize_global_state/load_global_state to the mixin, breaking the protocol and requiring multiple inheritance. Current API is already opt-in (just don't call the methods) and well-documented."}
Expand Down
65 changes: 65 additions & 0 deletions vgi/catalog/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
"""VGI Catalog Interface for exposing catalogs, schemas, tables, and views.

This module provides the abstract base class and data types for implementing
catalog interfaces in VGI workers, enabling DuckDB ATTACH support.

Example:
from vgi.catalog import CatalogInterface, CatalogAttachResult, SchemaInfo

class MyCatalog(CatalogInterface):
def catalogs(self) -> Iterable[str]:
return ["my_catalog"]

def catalog_attach(self, *, name: str, options: dict) -> CatalogAttachResult:
return CatalogAttachResult(
attach_id=AttachId(b"my-id"),
supports_transactions=False,
supports_time_travel=False,
catalog_version_frozen=True,
catalog_version=1,
)
# ... implement other abstract methods

"""

from vgi.catalog.catalog_interface import (
AttachId,
CatalogAttachResult,
CatalogInterface,
CatalogObject,
CatalogSchemaObject,
FunctionInfo,
FunctionType,
OnConflict,
ReadOnlyCatalogInterface,
ScanFunctionResult,
SchemaInfo,
SerializedSchema,
SqlExpression,
TableInfo,
TransactionId,
ViewInfo,
)

__all__ = [
# Type aliases
"AttachId",
"TransactionId",
"SerializedSchema",
"SqlExpression",
# Enums
"FunctionType",
"OnConflict",
# Data classes
"CatalogAttachResult",
"CatalogObject",
"CatalogSchemaObject",
"SchemaInfo",
"TableInfo",
"ViewInfo",
"FunctionInfo",
"ScanFunctionResult",
# Interfaces
"CatalogInterface",
"ReadOnlyCatalogInterface",
]
Loading
Loading