Merged
2 changes: 1 addition & 1 deletion .beads/issues.jsonl
@@ -27,7 +27,7 @@
{"id":"vgi-python-c2b","title":"Add duckdb_settings field to Invocation class","description":"Update vgi/invocation.py to add a duckdb_settings field to the Invocation dataclass.\n\nChanges needed:\n- Add 'duckdb_settings: dict[str, str] | None = None' field to Invocation\n- Update serialize() to include settings in Arrow IPC batch\n- Update deserialize() to read settings from Arrow IPC batch\n- Handle None case (no settings requested)\n\nSerialization: Use a struct field with string key-value pairs or a map type.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T13:05:47.765077-05:00","created_by":"rusty","updated_at":"2026-01-04T13:20:41.167817-05:00","closed_at":"2026-01-04T13:20:41.167817-05:00","close_reason":"Implementation complete, all tests pass","dependencies":[{"issue_id":"vgi-python-c2b","depends_on_id":"vgi-python-aad","type":"blocks","created_at":"2026-01-04T13:06:13.664038-05:00","created_by":"rusty"}]}
{"id":"vgi-python-cd0","title":"Create vgi/argument_spec.py module","description":"## Overview\n\nCreate the core module implementing Arrow-based argument specification serialization.\n\n## File Location\n\n`vgi/argument_spec.py`\n\n## Constants to Define\n\n```python\n# Metadata keys (all bytes for Arrow compatibility)\nVGI_ARG_KEY = b\"vgi_arg\"\nVGI_ARG_NAMED = b\"named\"\n\nVGI_TYPE_KEY = b\"vgi_type\"\nVGI_TYPE_TABLE = b\"table\"\nVGI_TYPE_ANY = b\"any\"\n\nVGI_VARARGS_KEY = b\"vgi_varargs\"\nVGI_VARARGS_TRUE = b\"true\"\n```\n\n## ArgumentSpec Dataclass\n\n```python\n@dataclass(frozen=True)\nclass ArgumentSpec:\n \"\"\"Specification for a single function argument.\"\"\"\n name: str # Python attribute name\n position: int | str # int for positional index, str for named key\n arrow_type: pa.DataType # Arrow type (pa.null() for special types)\n is_table_input: bool = False # Arg[TableInput]\n is_any_type: bool = False # Arg[AnyArrow]\n is_varargs: bool = False # varargs=True\n```\n\n## Functions to Implement\n\n### argument_specs_to_schema(specs: Sequence[ArgumentSpec]) -\u003e pa.Schema\n\nConvert ArgumentSpecs to a single Arrow schema:\n1. Sort specs: positional first (by index), then named\n2. For each spec, create a pa.field with:\n - name = spec.name\n - type = spec.arrow_type (or pa.null() for table/any)\n - metadata = appropriate markers based on flags\n3. Return pa.schema(fields)\n\n### schema_to_argument_specs(schema: pa.Schema) -\u003e list[ArgumentSpec]\n\nConvert schema back to ArgumentSpecs:\n1. Iterate through schema fields in order\n2. Track position index (increments for non-named args)\n3. Check field metadata for markers:\n - `vgi_arg=named` -\u003e position is field name string\n - `vgi_type=table` -\u003e is_table_input=True\n - `vgi_type=any` -\u003e is_any_type=True\n - `vgi_varargs=true` -\u003e is_varargs=True\n4. 
Return list of ArgumentSpec\n\n### extract_argument_specs(cls: type, arg_types: dict[str, pa.DataType]) -\u003e list[ArgumentSpec]\n\nExtract specs from a function class with Arg descriptors:\n1. Walk class MRO to find all Arg descriptors (like extract_parameters in metadata.py)\n2. For each Arg descriptor:\n - Get name from attribute name\n - Get position from arg.position\n - Get arrow_type from arg_types dict\n - Check type hints for TableInput/AnyArrow\n - Check arg.varargs flag\n3. Sort and return list\n\n## Dependencies\n\n- Import `Arg`, `TableInput`, `AnyArrow` from `vgi.arguments`\n- Reference `extract_parameters()` pattern in `vgi/metadata.py`","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T11:18:32.777241-05:00","created_by":"rusty","updated_at":"2026-01-05T11:28:07.227452-05:00","closed_at":"2026-01-05T11:28:07.227452-05:00","close_reason":"Created vgi/argument_spec.py with ArgumentSpec dataclass and serialization functions","dependencies":[{"issue_id":"vgi-python-cd0","depends_on_id":"vgi-python-8ra","type":"blocks","created_at":"2026-01-05T11:19:30.743936-05:00","created_by":"rusty"}]}
{"id":"vgi-python-ckg","title":"Add AnyValue sentinel class to vgi/arguments.py","description":"Add AnyValue class similar to TableInput, export in __all__","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T10:41:41.392694-05:00","created_by":"rusty","updated_at":"2026-01-05T11:05:38.37392-05:00","closed_at":"2026-01-05T11:05:38.37392-05:00","close_reason":"Added AnyArrow sentinel class to arguments.py","dependencies":[{"issue_id":"vgi-python-ckg","depends_on_id":"vgi-python-awm","type":"blocks","created_at":"2026-01-05T10:41:52.658405-05:00","created_by":"rusty"}]}
-{"id":"vgi-python-coi","title":"Update extract_argument_specs() to remove arg_types parameter","description":"In vgi/argument_spec.py:\n1. Remove arg_types parameter from function signature\n2. Update arrow_type resolution logic:\n - Use arg.arrow_type if explicitly set\n - Infer from Python type hint using PYTHON_TO_ARROW\n - Handle TableInput/AnyArrow → pa.null()\n - Warn and default to pa.null() for unknown types\n3. Import PYTHON_TO_ARROW from vgi.arguments","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:38.141157-05:00","created_by":"rusty","updated_at":"2026-01-05T15:44:38.141157-05:00","dependencies":[{"issue_id":"vgi-python-coi","depends_on_id":"vgi-python-cvj","type":"blocks","created_at":"2026-01-05T15:45:13.831745-05:00","created_by":"rusty"},{"issue_id":"vgi-python-coi","depends_on_id":"vgi-python-dv0","type":"blocks","created_at":"2026-01-05T15:45:13.864608-05:00","created_by":"rusty"}]}
+{"id":"vgi-python-coi","title":"Update extract_argument_specs() to remove arg_types parameter","description":"In vgi/argument_spec.py:\n1. Remove arg_types parameter from function signature\n2. Update arrow_type resolution logic:\n - Use arg.arrow_type if explicitly set\n - Infer from Python type hint using PYTHON_TO_ARROW\n - Handle TableInput/AnyArrow → pa.null()\n - Warn and default to pa.null() for unknown types\n3. Import PYTHON_TO_ARROW from vgi.arguments","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:38.141157-05:00","created_by":"rusty","updated_at":"2026-01-05T15:56:24.271544-05:00","closed_at":"2026-01-05T15:56:24.271544-05:00","close_reason":"PR #20 created","dependencies":[{"issue_id":"vgi-python-coi","depends_on_id":"vgi-python-cvj","type":"blocks","created_at":"2026-01-05T15:45:13.831745-05:00","created_by":"rusty"},{"issue_id":"vgi-python-coi","depends_on_id":"vgi-python-dv0","type":"blocks","created_at":"2026-01-05T15:45:13.864608-05:00","created_by":"rusty"}]}
{"id":"vgi-python-cvj","title":"Add PYTHON_TO_ARROW type mapping to vgi/arguments.py","description":"Add the Python→Arrow type mapping dict after imports:\n```python\nPYTHON_TO_ARROW: dict[type, pa.DataType] = {\n int: pa.int64(),\n str: pa.utf8(),\n float: pa.float64(),\n bool: pa.bool_(),\n bytes: pa.binary(),\n}\n```\nExport in __all__.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:37.900421-05:00","created_by":"rusty","updated_at":"2026-01-05T15:48:42.422086-05:00","closed_at":"2026-01-05T15:48:42.422086-05:00","close_reason":"PR #18 created"}
{"id":"vgi-python-d73","title":"Create docs/argument-serialization.md","description":"## Overview\n\nCreate LLM-friendly documentation explaining the argument specification serialization format. This document should enable future implementors (human or AI) to understand how function argument signatures are serialized to Arrow schemas.\n\n## File Location\n\n`docs/argument-serialization.md`\n\n## Document Structure\n\n### Title and Purpose\n\nExplain that this document describes how VGI function argument specifications are serialized to Apache Arrow schemas for IPC transmission and DuckDB function registration.\n\n### Quick Reference\n\nA concise summary table showing:\n- Metadata keys and their meanings\n- Special type representations\n\n### Schema Format\n\nExplain the single-schema design:\n1. All arguments are fields in one Arrow schema\n2. Positional arguments come first, in order (field index = position index)\n3. Named arguments follow, marked with metadata\n4. Field name = Python attribute name (or argument key for named)\n5. Field type = exact Arrow type\n\n### Metadata Keys Reference\n\nComplete table of all metadata keys:\n\n| Key | Value | Description |\n|-----|-------|-------------|\n| `vgi_arg` | `named` | Field is a named argument, not positional. The field name is the argument key. |\n| `vgi_type` | `table` | Argument receives streaming table input (Arg[TableInput]). Arrow type is pa.null(). |\n| `vgi_type` | `any` | Argument accepts any Arrow type (Arg[AnyArrow]). Arrow type is pa.null(). |\n| `vgi_varargs` | `true` | Argument collects all remaining positional args. Arrow type is the element type. 
|\n\n### Special Type Handling\n\nExplain how special argument types are represented:\n\n#### TableInput\n- Arrow type: `pa.null()`\n- Metadata: `{b\"vgi_type\": b\"table\"}`\n- Meaning: This position receives streaming RecordBatches, not a scalar value\n\n#### AnyArrow\n- Arrow type: `pa.null()`\n- Metadata: `{b\"vgi_type\": b\"any\"}`\n- Meaning: Accepts any valid Arrow scalar type at runtime\n\n#### Varargs\n- Arrow type: The element type (e.g., `pa.int64()` for `Arg[int](..., varargs=True)`)\n- Metadata: `{b\"vgi_varargs\": b\"true\"}`\n- Meaning: Collects all remaining positional arguments from this position onwards\n\n### Examples\n\n#### Example 1: Simple Function\n\n```python\nclass MyFunction(TableInOutFunction):\n count = Arg[int](0) # Positional 0\n name = Arg[str](1) # Positional 1\n verbose = Arg[bool](\"verbose\") # Named\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"count\", pa.int64()),\n pa.field(\"name\", pa.utf8()),\n pa.field(\"verbose\", pa.bool_(), metadata={b\"vgi_arg\": b\"named\"}),\n])\n```\n\n#### Example 2: Function with Table Input\n\n```python\nclass TransformFunction(TableInOutFunction):\n multiplier = Arg[float](0)\n data = Arg[TableInput](1)\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"multiplier\", pa.float64()),\n pa.field(\"data\", pa.null(), metadata={b\"vgi_type\": b\"table\"}),\n])\n```\n\n#### Example 3: Function with Varargs\n\n```python\nclass SumFunction(TableInOutFunction):\n columns = Arg[str](0, varargs=True)\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"columns\", pa.utf8(), metadata={b\"vgi_varargs\": b\"true\"}),\n])\n```\n\n#### Example 4: Complex Function\n\n```python\nclass ComplexFunction(TableInOutFunction):\n count = Arg[int](0)\n data = Arg[TableInput](1)\n extra = Arg[float](2, varargs=True)\n format = Arg[str](\"format\")\n threshold = Arg[AnyArrow](\"threshold\")\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"count\", pa.int64()),\n pa.field(\"data\", pa.null(), 
metadata={b\"vgi_type\": b\"table\"}),\n pa.field(\"extra\", pa.float64(), metadata={b\"vgi_varargs\": b\"true\"}),\n pa.field(\"format\", pa.utf8(), metadata={b\"vgi_arg\": b\"named\"}),\n pa.field(\"threshold\", pa.null(), metadata={b\"vgi_arg\": b\"named\", b\"vgi_type\": b\"any\"}),\n])\n```\n\n### Serialization Code\n\nShow how to serialize and deserialize:\n\n```python\n# Serialize to bytes\nschema_bytes = schema.serialize().to_pybytes()\n\n# Deserialize from bytes\nschema = pa.ipc.read_schema(pa.py_buffer(schema_bytes))\n```\n\n### Parsing Algorithm\n\nExplain how to parse a schema back to argument specs:\n\n1. Initialize position_index = 0\n2. For each field in schema:\n a. Check if field has `vgi_arg=named` metadata\n b. If named: position = field.name (string)\n c. If positional: position = position_index, then increment position_index\n d. Check for `vgi_type` metadata (table or any)\n e. Check for `vgi_varargs` metadata\n f. Create ArgumentSpec with extracted info\n\n### Not Included\n\nExplicitly state what is NOT serialized:\n- Default values\n- Validation constraints (ge, le, choices, pattern)\n- Documentation strings\n\nThese are Python-side concerns handled by the Arg descriptor at runtime.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T11:19:17.488877-05:00","created_by":"rusty","updated_at":"2026-01-05T11:33:29.168007-05:00","closed_at":"2026-01-05T11:33:29.168007-05:00","close_reason":"Created comprehensive LLM-friendly documentation","dependencies":[{"issue_id":"vgi-python-d73","depends_on_id":"vgi-python-8ra","type":"blocks","created_at":"2026-01-05T11:19:30.820384-05:00","created_by":"rusty"}]}
{"id":"vgi-python-dv0","title":"Add arrow_type parameter to Arg class","description":"In vgi/arguments.py:\n1. Add 'arrow_type' to __slots__\n2. Add parameter: arrow_type: pa.DataType | None = None\n3. Store: self.arrow_type = arrow_type\n4. Update __repr__ to include arrow_type if set","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:38.020395-05:00","created_by":"rusty","updated_at":"2026-01-05T15:50:51.513273-05:00","closed_at":"2026-01-05T15:50:51.513273-05:00","close_reason":"PR #19 created","dependencies":[{"issue_id":"vgi-python-dv0","depends_on_id":"vgi-python-cvj","type":"blocks","created_at":"2026-01-05T15:45:13.696822-05:00","created_by":"rusty"}]}
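Note on the parsing algorithm described in the vgi-python-d73 record above: the steps (track a position index, treat `vgi_arg=named` fields as named, then read the `vgi_type` and `vgi_varargs` markers) can be sketched roughly as follows. This is an illustrative sketch only, not the code in `vgi/argument_spec.py`. To stay runnable without pyarrow it uses a hypothetical `FieldStub` in place of `pa.field`, stores the Arrow type as a plain string, and the function name `fields_to_argument_specs` is likewise made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldStub:
    """Hypothetical stand-in for a pyarrow field: name, type, bytes metadata."""
    name: str
    type_name: str  # assumption: a string label instead of a pa.DataType
    metadata: dict[bytes, bytes] = field(default_factory=dict)


@dataclass(frozen=True)
class ArgumentSpec:
    """Mirrors the dataclass in the issue text, with a string type label."""
    name: str
    position: int | str  # int for positional index, str for named key
    type_name: str
    is_table_input: bool = False
    is_any_type: bool = False
    is_varargs: bool = False


def fields_to_argument_specs(fields: list[FieldStub]) -> list[ArgumentSpec]:
    """Walk fields in order, following the parsing steps from the issue."""
    specs: list[ArgumentSpec] = []
    position_index = 0
    for f in fields:
        meta = f.metadata or {}
        if meta.get(b"vgi_arg") == b"named":
            position: int | str = f.name  # named: the field name is the key
        else:
            position = position_index  # positional: take the next index
            position_index += 1
        vgi_type = meta.get(b"vgi_type")
        specs.append(
            ArgumentSpec(
                name=f.name,
                position=position,
                type_name=f.type_name,
                is_table_input=vgi_type == b"table",
                is_any_type=vgi_type == b"any",
                is_varargs=meta.get(b"vgi_varargs") == b"true",
            )
        )
    return specs


# Fields corresponding to "Example 4: Complex Function" in the issue text.
EXAMPLE_FIELDS = [
    FieldStub("count", "int64"),
    FieldStub("data", "null", {b"vgi_type": b"table"}),
    FieldStub("extra", "float64", {b"vgi_varargs": b"true"}),
    FieldStub("format", "utf8", {b"vgi_arg": b"named"}),
    FieldStub("threshold", "null", {b"vgi_arg": b"named", b"vgi_type": b"any"}),
]

for spec in fields_to_argument_specs(EXAMPLE_FIELDS):
    print(spec)
```

In this sketch named fields never advance the position counter, which is why the issue text requires positional fields to come first in the schema.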