Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions .beads/issues.jsonl
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
{"id":"vgi-python-67w","title":"Create example function using DuckDB settings","description":"Create an example function that demonstrates using DuckDB settings to determine its output.\n\nRequirements:\n- Function declares required_settings in Meta\n- Output schema depends on a setting value (e.g., include extra column based on setting)\n- Clear documentation showing the pattern\n\nExample ideas:\n1. TimezoneAwareFunction: Output includes timezone info based on 'timezone' setting\n2. VerboseOutput: Adds debug columns when 'debug_mode' setting is true\n3. NumericPrecision: Uses 'numeric_precision' to determine output type precision\n\nAdd to vgi/examples/ and register in ExampleWorker.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T13:05:48.503681-05:00","created_by":"rusty","updated_at":"2026-01-04T13:22:23.779895-05:00","closed_at":"2026-01-04T13:22:23.779895-05:00","close_reason":"Added SettingsAwareFunction example","dependencies":[{"issue_id":"vgi-python-67w","depends_on_id":"vgi-python-c2b","type":"blocks","created_at":"2026-01-04T13:06:13.865474-05:00","created_by":"rusty"},{"issue_id":"vgi-python-67w","depends_on_id":"vgi-python-ivf","type":"blocks","created_at":"2026-01-04T13:06:13.890269-05:00","created_by":"rusty"},{"issue_id":"vgi-python-67w","depends_on_id":"vgi-python-bqb","type":"blocks","created_at":"2026-01-04T13:06:13.912531-05:00","created_by":"rusty"},{"issue_id":"vgi-python-67w","depends_on_id":"vgi-python-a99","type":"blocks","created_at":"2026-01-04T13:06:13.936552-05:00","created_by":"rusty"},{"issue_id":"vgi-python-67w","depends_on_id":"vgi-python-j4t","type":"blocks","created_at":"2026-01-04T13:06:13.958494-05:00","created_by":"rusty"}]}
{"id":"vgi-python-6kr","title":"Test RowCountMismatchError when output exceeds input rows","notes":"Coverage: 86% in vgi/scalar_function.py. Missing tests for:\n- Lines 134-142: Error message when output has MORE rows than input\n\nCurrent tests cover when output \u003c input but not output \u003e input.\nNeed a test that returns an array with more elements than input rows.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T22:15:26.097532-05:00","created_by":"rusty","updated_at":"2026-01-04T22:32:00.720462-05:00","closed_at":"2026-01-04T22:32:00.720462-05:00","close_reason":"Added test for RowCountMismatchError when output exceeds input. Coverage improved from 86% to 93%."}
{"id":"vgi-python-6o0","title":"Consolidate _OutputComplete classes into shared module","description":"Three nearly identical _OutputComplete classes exist in scalar_function.py:168-197 (_ScalarOutputComplete), table_function.py:136-175 (_OutputComplete), and table_in_out_function.py:356-400 (_OutputComplete). All are frozen dataclasses with batch field, log_message field, and from_process_result() classmethod. Extract to shared module (e.g., vgi/protocol_types.py) with a single parameterized class.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T20:06:40.893139-05:00","created_by":"rusty","updated_at":"2026-01-04T21:18:34.529683-05:00","closed_at":"2026-01-04T21:18:34.529683-05:00","close_reason":"PR #5 created: https://github.com/Query-farm/vgi-python/pull/5"}
{"id":"vgi-python-790","title":"Add slots=True to ArgumentSpec dataclass","description":"ArgumentSpec is a frozen dataclass but doesn't use slots=True. Adding slots=True would reduce memory footprint and improve attribute access speed, which matters if many specs are created during introspection.","status":"open","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:20.675386-05:00","created_by":"rusty","updated_at":"2026-01-05T11:51:20.675386-05:00"}
{"id":"vgi-python-790","title":"Add slots=True to ArgumentSpec dataclass","description":"ArgumentSpec is a frozen dataclass but doesn't use slots=True. Adding slots=True would reduce memory footprint and improve attribute access speed, which matters if many specs are created during introspection.","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:20.675386-05:00","created_by":"rusty","updated_at":"2026-01-05T12:02:54.104187-05:00","closed_at":"2026-01-05T12:02:54.104187-05:00","close_reason":"Closed"}
{"id":"vgi-python-79e","title":"Unify ProtocolInput classes with shared base","description":"ProtocolInput classes in scalar_function.py:151-166 and table_in_out_function.py:109-142 have similar structure with batch and metadata fields. The table_in_out version adds is_finalize logic. Create shared base ProtocolInput in protocol_types.py with table_in_out extending it.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:41.31917-05:00","created_by":"rusty","updated_at":"2026-01-04T21:53:26.965345-05:00","closed_at":"2026-01-04T21:53:26.965345-05:00","close_reason":"PR #9 created - unified ProtocolInput with shared base in protocol_types.py"}
{"id":"vgi-python-8ra","title":"Implement Arrow-based argument specification serialization","description":"## Overview\n\nImplement serialization and deserialization of function argument specifications using Apache Arrow schemas. This enables functions to describe their argument signatures (types, positions, special markers) in a format that can be transmitted over IPC and understood by DuckDB for function registration.\n\n## Design\n\nUses a **single Arrow schema** where:\n- Positional arguments come first (field order = position index)\n- Named arguments follow (marked with `vgi_arg=named` metadata)\n- Special types (TableInput, AnyArrow, varargs) use field metadata markers\n\n## Key Components\n\n1. `ArgumentSpec` dataclass - represents one argument's specification\n2. `argument_specs_to_schema()` - convert specs to Arrow schema\n3. `schema_to_argument_specs()` - convert schema back to specs\n4. `extract_argument_specs()` - extract specs from function class Arg descriptors\n\n## Metadata Keys\n\n| Key | Value | Meaning |\n|-----|-------|---------|\n| `vgi_arg` | `named` | Named argument (not positional) |\n| `vgi_type` | `table` | Receives table input (Arg[TableInput]) |\n| `vgi_type` | `any` | Accepts any Arrow type (Arg[AnyArrow]) |\n| `vgi_varargs` | `true` | Collects remaining positional args |\n\n## References\n\n- Plan file: `.claude/plans/purrfect-foraging-nygaard.md`\n- Arguments module: `vgi/arguments.py`","status":"closed","priority":2,"issue_type":"feature","created_at":"2026-01-05T11:18:01.05631-05:00","created_by":"rusty","updated_at":"2026-01-05T11:34:12.712096-05:00","closed_at":"2026-01-05T11:34:12.712096-05:00","close_reason":"Implemented Arrow-based argument specification serialization with tests and documentation"}
{"id":"vgi-python-a99","title":"Add settings accessor to function base classes","description":"Add a property to access DuckDB settings values in function implementations.\n\nChanges needed:\n- Add 'settings: dict[str, str]' property to Function base class\n- Property should return self.invocation.duckdb_settings or empty dict\n- Add convenience method like 'get_setting(name, default=None)'\n- Update ScalarFunction, TableFunctionGenerator, TableInOutFunction\n\nExample usage in function:\ndef compute(self, batch):\n tz = self.get_setting('timezone', 'UTC')\n # or\n tz = self.settings.get('timezone', 'UTC')","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T13:05:48.221602-05:00","created_by":"rusty","updated_at":"2026-01-04T13:20:41.171991-05:00","closed_at":"2026-01-04T13:20:41.171991-05:00","close_reason":"Implementation complete, all tests pass","dependencies":[{"issue_id":"vgi-python-a99","depends_on_id":"vgi-python-aad","type":"blocks","created_at":"2026-01-04T13:06:13.738212-05:00","created_by":"rusty"}]}
Expand All @@ -36,10 +36,10 @@
{"id":"vgi-python-j8a","title":"Investigate named argument field name ambiguity","description":"For named arguments, position is set to field.name (the SQL key). But there's potential ambiguity between the Python attribute name and the named argument key if they could differ. Currently they're the same, but if ArgumentSpec.name (attribute) ever differs from ArgumentSpec.position (key), the schema only preserves one. Investigate whether this is a real concern or document the assumption that they're always equal.","status":"open","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:20.257539-05:00","created_by":"rusty","updated_at":"2026-01-05T11:51:20.257539-05:00"}
{"id":"vgi-python-j9k","title":"Add protocol types for IPC stream writers in cli.py","notes":"Line 53: self._writer: Any = None\n\nCould define a Protocol type for the IPC stream writer interface:\n```python\nclass IPCWriter(Protocol):\n def write_batch(self, batch: pa.RecordBatch) -\u003e None: ...\n def close(self) -\u003e None: ...\n```\n\nPart of 14.17% imprecision in cli.py (34 Anys total).","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-04T22:19:50.31711-05:00","created_by":"rusty","updated_at":"2026-01-04T22:37:01.488788-05:00","closed_at":"2026-01-04T22:37:01.488788-05:00","close_reason":"Replaced _writer: Any with _writer: pq.ParquetWriter | None. Removes 1 Any and provides proper type information."}
{"id":"vgi-python-jrf","title":"Add varargs parameter to Arg descriptor","description":"In vgi/arguments.py:\n- Add varargs: bool = False to Arg.__init__ and __slots__\n- Update _resolve() to collect positional[position:] when varargs=True\n- Validate at least 1 value provided\n- Update _validate() to validate each element in tuple\n- Add Arguments.get_varargs(start, type=None) method\n- Update __repr__ to show varargs flag","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T10:49:20.012964-05:00","created_by":"rusty","updated_at":"2026-01-05T10:55:22.479344-05:00","closed_at":"2026-01-05T10:55:22.479344-05:00","close_reason":"Implemented varargs parameter in Arg descriptor with get_varargs() method and _validate_single()"}
{"id":"vgi-python-k7x","title":"Use Mapping instead of dict in extract_argument_specs signature","description":"The arg_types parameter in extract_argument_specs() is typed as dict[str, pa.DataType]. Using Mapping[str, pa.DataType] from collections.abc would be more flexible, accepting any mapping type.","status":"open","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:21.021496-05:00","created_by":"rusty","updated_at":"2026-01-05T11:51:21.021496-05:00"}
{"id":"vgi-python-k7x","title":"Use Mapping instead of dict in extract_argument_specs signature","description":"The arg_types parameter in extract_argument_specs() is typed as dict[str, pa.DataType]. Using Mapping[str, pa.DataType] from collections.abc would be more flexible, accepting any mapping type.","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:21.021496-05:00","created_by":"rusty","updated_at":"2026-01-05T12:03:51.771301-05:00","closed_at":"2026-01-05T12:03:51.771301-05:00","close_reason":"Closed"}
{"id":"vgi-python-kz4","title":"Rename TableInOutGeneratorFunction to TableInOutGenerator for consistency","description":"Naming inconsistency: TableFunctionGenerator uses *Generator suffix, but TableInOutGeneratorFunction uses *GeneratorFunction suffix. Rename TableInOutGeneratorFunction to TableInOutGenerator for consistency. Also consider renaming ScalarFunctionGenerator if needed.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:41.581028-05:00","created_by":"rusty","updated_at":"2026-01-04T21:43:58.141038-05:00","closed_at":"2026-01-04T21:43:58.141038-05:00","close_reason":"PR #7 created: https://github.com/Query-farm/vgi-python/pull/7"}
{"id":"vgi-python-l1u","title":"Consider custom __repr__ for ArgumentSpec","description":"The default dataclass __repr__ includes the full Arrow type repr which can be verbose. Consider a custom __repr__ that's more concise for debugging, e.g., 'ArgumentSpec(name=\"count\", pos=0, type=int64)' instead of showing the full pa.DataType object.","status":"open","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:21.415976-05:00","created_by":"rusty","updated_at":"2026-01-05T11:51:21.415976-05:00"}
{"id":"vgi-python-lec","title":"Add test coverage for testing.py helper edge cases","notes":"Coverage: 89% in vgi/testing.py. Missing tests for:\n- Lines 421-422, 450-451: StopIteration handling in _process_batch\n- Lines 468-472: FINISHED status during data phase\n- Lines 485-486, 502-503: _finalize edge cases\n\nLow priority since these are test helpers.","status":"open","priority":4,"issue_type":"task","created_at":"2026-01-04T22:15:34.006563-05:00","created_by":"rusty","updated_at":"2026-01-04T22:16:18.592782-05:00"}
{"id":"vgi-python-lec","title":"Add test coverage for testing.py helper edge cases","notes":"Coverage: 89% in vgi/testing.py. Missing tests for:\n- Lines 421-422, 450-451: StopIteration handling in _process_batch\n- Lines 468-472: FINISHED status during data phase\n- Lines 485-486, 502-503: _finalize edge cases\n\nLow priority since these are test helpers.","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-04T22:15:34.006563-05:00","created_by":"rusty","updated_at":"2026-01-05T12:07:48.026786-05:00","closed_at":"2026-01-05T12:07:48.026786-05:00","close_reason":"Closed"}
{"id":"vgi-python-lzc","title":"Extract duplicated sort_key function in argument_spec","description":"The sort_key function is duplicated at lines 139-142 and 309-312 in argument_spec.py. Extract it to a module-level function to follow DRY principles.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-05T11:51:19.141041-05:00","created_by":"rusty","updated_at":"2026-01-05T11:55:22.535816-05:00","closed_at":"2026-01-05T11:55:22.535816-05:00","close_reason":"Closed"}
{"id":"vgi-python-m45","title":"Create tests/test_argument_spec.py","description":"## Overview\n\nCreate comprehensive tests for the argument specification serialization module.\n\n## File Location\n\n`tests/test_argument_spec.py`\n\n## Test Classes and Cases\n\n### TestArgumentSpecToSchema\n\nTest converting ArgumentSpec objects to Arrow schema.\n\n#### test_positional_arguments_preserve_order\n- Create specs with positions 0, 1, 2\n- Convert to schema\n- Verify field order matches position order\n- Verify field types are preserved\n\n#### test_named_arguments_have_metadata\n- Create spec with position='key' (named)\n- Convert to schema\n- Verify field has `vgi_arg=named` metadata\n\n#### test_mixed_positional_and_named\n- Create mix of positional (0, 1) and named ('format', 'verbose') specs\n- Convert to schema\n- Verify positional come first, then named\n- Verify named have correct metadata\n\n#### test_table_input_uses_null_type\n- Create spec with is_table_input=True\n- Convert to schema\n- Verify field type is pa.null()\n- Verify field has `vgi_type=table` metadata\n\n#### test_any_type_uses_null_type\n- Create spec with is_any_type=True\n- Convert to schema\n- Verify field type is pa.null()\n- Verify field has `vgi_type=any` metadata\n\n#### test_varargs_has_metadata\n- Create spec with is_varargs=True and arrow_type=pa.int64()\n- Convert to schema\n- Verify field type is pa.int64() (element type preserved)\n- Verify field has `vgi_varargs=true` metadata\n\n### TestSchemaToArgumentSpecs\n\nTest converting Arrow schema back to ArgumentSpec objects.\n\n#### test_positional_arguments_from_schema\n- Create schema with 3 fields (no metadata)\n- Convert to specs\n- Verify positions are 0, 1, 2\n\n#### test_named_arguments_from_metadata\n- Create schema with `vgi_arg=named` metadata on fields\n- Convert to specs\n- Verify position is field name string\n\n#### test_table_input_detected\n- Create schema with `vgi_type=table` metadata\n- Convert to specs\n- Verify is_table_input=True\n\n#### test_any_type_detected\n- Create schema with `vgi_type=any` metadata\n- Convert to specs\n- Verify is_any_type=True\n\n#### test_varargs_detected\n- Create schema with `vgi_varargs=true` metadata\n- Convert to specs\n- Verify is_varargs=True\n\n### TestRoundTrip\n\nTest that specs survive serialization round-trip.\n\n#### test_complex_arrow_types_preserved\nTest each of these types round-trips correctly:\n- pa.int64(), pa.float32(), pa.utf8()\n- pa.list_(pa.float64())\n- pa.struct([pa.field('a', pa.int32()), pa.field('b', pa.string())])\n- pa.map_(pa.string(), pa.int64())\n- pa.decimal128(10, 2)\n- pa.timestamp('us', tz='UTC')\n\n#### test_full_function_signature_roundtrip\n- Create specs matching a realistic function:\n - count: int, position 0\n - data: TableInput, position 1\n - extra: float varargs, position 2\n - format: str, named 'format'\n- Convert to schema, serialize to bytes, deserialize, convert back to specs\n- Verify all specs match original\n\n### TestExtractArgumentSpecs\n\nTest extracting specs from function classes.\n\n#### test_extract_from_simple_function\n- Define function class with Arg descriptors\n- Call extract_argument_specs with arg_types dict\n- Verify specs match descriptors\n\n#### test_extract_table_input\n- Define function with Arg[TableInput]\n- Extract specs\n- Verify is_table_input=True\n\n#### test_extract_any_arrow\n- Define function with Arg[AnyArrow]\n- Extract specs\n- Verify is_any_type=True\n\n#### test_extract_varargs\n- Define function with Arg[int](2, varargs=True)\n- Extract specs\n- Verify is_varargs=True\n\n### TestEdgeCases\n\n#### test_empty_schema\n- Convert empty list of specs to schema\n- Verify empty schema works\n- Convert back, verify empty list\n\n#### test_only_named_arguments\n- Create specs with only named arguments (no positional)\n- Round-trip and verify\n\n#### test_only_positional_arguments\n- Create specs with only positional arguments (no named)\n- Round-trip and verify\n\n## Test Utilities\n\nConsider creating fixtures for common patterns:\n- `make_spec()` helper for creating ArgumentSpec\n- Sample function classes for extraction tests","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T11:18:53.312911-05:00","created_by":"rusty","updated_at":"2026-01-05T11:32:35.580879-05:00","closed_at":"2026-01-05T11:32:35.580879-05:00","close_reason":"Created comprehensive tests with 43 passing test cases","dependencies":[{"issue_id":"vgi-python-m45","depends_on_id":"vgi-python-cd0","type":"blocks","created_at":"2026-01-05T11:19:30.779207-05:00","created_by":"rusty"}]}
{"id":"vgi-python-odi","title":"Change max_processes from method to property in Function hierarchy","description":"Refactor max_processes from a method to a property across the Function class hierarchy (Function, ScalarFunction, TableFunctionGenerator, TableInOutFunction, etc.). This makes the API more consistent since max_processes is effectively a constant per function class and properties are more idiomatic for such values.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-04T11:25:29.750648-05:00","created_by":"rusty","updated_at":"2026-01-04T11:50:57.566545-05:00","closed_at":"2026-01-04T11:50:57.566545-05:00","close_reason":"Closed"}
Expand Down
Loading
Loading