Add Arrow-based argument specification serialization #11

Merged: rustyconover merged 2 commits into main from feature/argument-spec-serialization on Jan 5, 2026

@rustyconover

Summary

  • Implement serialization/deserialization of function argument specifications using Apache Arrow schemas
  • Enable functions to describe their argument signatures (types, positions, special markers) for IPC transmission and DuckDB function registration
  • Use a single Arrow schema in which positional args come first and named args follow, flagged by metadata markers

Changes

  • vgi/argument_spec.py - Core module with ArgumentSpec dataclass and serialization functions
  • tests/test_argument_spec.py - Comprehensive tests (43 passing test cases)
  • docs/argument-serialization.md - LLM-friendly documentation
  • vgi/__init__.py - Export new public API

New Public API

from vgi import (
    ArgumentSpec,              # Dataclass for argument specifications
    argument_specs_to_schema,  # Convert specs to Arrow schema  
    schema_to_argument_specs,  # Convert schema back to specs
)

Design

Uses a single Arrow schema where:

  • Field order = position index for positional args
  • Named args marked with {b"vgi_arg": b"named"} metadata
  • Special types use metadata: vgi_type=table, vgi_type=any, vgi_varargs=true
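
As an illustration of the layout described above, here is a minimal pure-Python sketch. It is not the code in vgi/argument_spec.py: `SpecSketch`, `FieldSketch`, `specs_to_fields`, and `fields_to_specs` are hypothetical stand-ins, types are plain strings, and the bytes form of the `vgi_varargs` marker is an assumption modeled on the `{b"vgi_arg": b"named"}` example. With pyarrow, the same metadata dict would presumably be passed through the `metadata` argument of `pa.field`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SpecSketch:
    """Hypothetical stand-in for an argument specification."""
    name: str
    type_name: str
    position: Optional[int] = None  # None means a named argument
    is_varargs: bool = False


@dataclass(frozen=True)
class FieldSketch:
    """Hypothetical stand-in for an Arrow field with bytes metadata."""
    name: str
    type_name: str
    metadata: Dict[bytes, bytes] = field(default_factory=dict)


def specs_to_fields(specs: List[SpecSketch]) -> List[FieldSketch]:
    # Positional args come first, ordered by position index.
    positional = sorted(
        (s for s in specs if s.position is not None),
        key=lambda s: s.position,
    )
    named = [s for s in specs if s.position is None]
    fields = []
    for s in positional:
        # Assumed bytes encoding of the vgi_varargs=true marker.
        meta = {b"vgi_varargs": b"true"} if s.is_varargs else {}
        fields.append(FieldSketch(s.name, s.type_name, meta))
    for s in named:
        # Named args follow and carry the marker from the design notes.
        fields.append(FieldSketch(s.name, s.type_name, {b"vgi_arg": b"named"}))
    return fields


def fields_to_specs(fields: List[FieldSketch]) -> List[SpecSketch]:
    specs = []
    for index, f in enumerate(fields):
        if f.metadata.get(b"vgi_arg") == b"named":
            specs.append(SpecSketch(f.name, f.type_name))
        else:
            # Field order doubles as the position index.
            is_varargs = f.metadata.get(b"vgi_varargs") == b"true"
            specs.append(SpecSketch(f.name, f.type_name, index, is_varargs))
    return specs


specs = [
    SpecSketch("limit", "int64"),                        # named
    SpecSketch("path", "string", position=0),            # positional
    SpecSketch("rest", "string", position=1, is_varargs=True),
]
fields = specs_to_fields(specs)
print([f.name for f in fields])
print(fields_to_specs(fields))
```

Because position is recovered from field order alone, the round trip in this sketch only holds when positional indices are contiguous and start at 0.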

Test plan

  • All 43 tests pass including mypy type checking
  • Ruff lint and format pass
  • Import verification works

🤖 Generated with Claude Code

rustyconover and others added 2 commits January 5, 2026 11:46
Implement serialization/deserialization of function argument specifications
using Apache Arrow schemas. This enables functions to describe their argument
signatures (types, positions, special markers) in a format that can be
transmitted over IPC and understood by DuckDB for function registration.

Key features:
- Single Arrow schema format with positional args first, named args after
- Metadata markers for special types (TableInput, AnyArrow, varargs)
- Exact Arrow type preservation through schema serialization
- Extract specs from function classes with Arg descriptors

New public API:
- ArgumentSpec: Dataclass for argument specifications
- argument_specs_to_schema(): Convert specs to Arrow schema
- schema_to_argument_specs(): Convert schema back to specs
- extract_argument_specs(): Extract from function class

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rustyconover rustyconover merged commit b44d797 into main Jan 5, 2026
0 of 16 checks passed