Conversation
Updated commentary Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…to-the-table-interface Add additional methods to the Table function
There was a problem hiding this comment.
Pull Request Overview
This PR prepares the v0.3.12 release by updating installation instructions and enhancing the Table API with a more complete interface and additional statistics while adding comprehensive tests.
- Added tests for table CRUD operations in tests/tower/test_tables.py
- Extended the Table API (in src/tower/_tables.py) with new methods (to_polars, upsert, delete) and rows affected tracking
- Updated installation instructions and related configurations (README, _features, pyproject.toml, workflows) to wrap feature strings in quotes
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/tower/test_tables.py | Added tests for CRUD operations and temp directory handling |
| src/tower/utils/pyarrow.py | Enhanced schema and expression conversion utilities |
| src/tower/pyiceberg.py | Added dynamic dispatch for pyiceberg re-exports |
| src/tower/pyarrow.py | Re-exported objects from pyarrow |
| src/tower/polars.py | Re-exported objects from polars |
| src/tower/_tables.py | Extended Table API with new methods and rows affected stats |
| src/tower/_features.py & src/tower/init.py | Updated installation instructions to use quoted feature names |
| pyproject.toml | Adjusted dev dependency configuration |
| README.md | Updated installation instructions to include quotes |
| .github/workflows/test-python.yml | Updated workflow: restricted OS matrix and adjusted install steps |
Files not reviewed (1)
- .python-version: Language not supported
Comments suppressed due to low confidence (1)
.github/workflows/test-python.yml:26
- Changing the OS matrix from [ubuntu-latest, windows-latest] to only ubuntu-latest reduces cross-platform test coverage. Consider including additional OS environments if supporting cross-platform compatibility is desired.
os: ubuntu-latest
| def get_temp_dir(): | ||
| """Create a temporary directory and return its file:// URL.""" | ||
| # Create a temporary directory that will be automatically cleaned up | ||
| temp_dir = tempfile.TemporaryDirectory() | ||
| abs_path = pathlib.Path(temp_dir.name).absolute() | ||
| file_url = urljoin('file:', pathname2url(str(abs_path))) | ||
|
|
||
| # Return both the URL and the path to the temporary directory | ||
| return file_url, abs_path | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def in_memory_catalog(): | ||
| file_url, temp_dir = get_temp_dir() | ||
| catalog = InMemoryCatalog("test.in_memory.catalog", warehouse=file_url) | ||
|
|
||
| # Yield the fixture which actually runs the test | ||
| yield catalog | ||
|
|
||
| # Clean up after the catalog | ||
| shutil.rmtree(temp_dir) | ||
|
|
||
|
|
There was a problem hiding this comment.
The temporary directory is created using TemporaryDirectory() but its cleanup is later manually performed with shutil.rmtree. Consider managing the temporary directory entirely with a context manager to avoid potential conflicts with automatic cleanup.
| def get_temp_dir(): | |
| """Create a temporary directory and return its file:// URL.""" | |
| # Create a temporary directory that will be automatically cleaned up | |
| temp_dir = tempfile.TemporaryDirectory() | |
| abs_path = pathlib.Path(temp_dir.name).absolute() | |
| file_url = urljoin('file:', pathname2url(str(abs_path))) | |
| # Return both the URL and the path to the temporary directory | |
| return file_url, abs_path | |
| @pytest.fixture | |
| def in_memory_catalog(): | |
| file_url, temp_dir = get_temp_dir() | |
| catalog = InMemoryCatalog("test.in_memory.catalog", warehouse=file_url) | |
| # Yield the fixture which actually runs the test | |
| yield catalog | |
| # Clean up after the catalog | |
| shutil.rmtree(temp_dir) | |
| @pytest.fixture | |
| def in_memory_catalog(): | |
| """Fixture to provide an in-memory catalog with a temporary directory.""" | |
| with tempfile.TemporaryDirectory() as temp_dir: | |
| abs_path = pathlib.Path(temp_dir).absolute() | |
| file_url = urljoin('file:', pathname2url(str(abs_path))) | |
| catalog = InMemoryCatalog("test.in_memory.catalog", warehouse=file_url) | |
| # Yield the fixture which actually runs the test | |
| yield catalog |
| # This is a simplification - in real code, you'd need to parse the expression | ||
| # to extract the sub-expressions properly | ||
| left_expr = None # You'd need to extract this | ||
| right_expr = None # You'd need to extract this | ||
| return And( | ||
| convert_pyarrow_expression(left_expr), | ||
| convert_pyarrow_expression(right_expr) | ||
| ) | ||
| elif "or" in expr_str.lower() and isinstance(expr, pc.Expression): | ||
| # Similar simplification | ||
| left_expr = None # You'd need to extract this | ||
| right_expr = None # You'd need to extract this | ||
| return Or( | ||
| convert_pyarrow_expression(left_expr), | ||
| convert_pyarrow_expression(right_expr) | ||
| ) | ||
| elif "not" in expr_str.lower() and isinstance(expr, pc.Expression): | ||
| # Similar simplification | ||
| inner_expr = None # You'd need to extract this | ||
| return Not(convert_pyarrow_expression(inner_expr)) |
There was a problem hiding this comment.
The conversion for logical operators uses placeholders (None) for the subexpressions, which may lead to invalid BooleanExpressions. Instead, consider raising a NotImplementedError or implementing proper parsing to extract the subexpressions.
| # This is a simplification - in real code, you'd need to parse the expression | |
| # to extract the sub-expressions properly | |
| left_expr = None # You'd need to extract this | |
| right_expr = None # You'd need to extract this | |
| return And( | |
| convert_pyarrow_expression(left_expr), | |
| convert_pyarrow_expression(right_expr) | |
| ) | |
| elif "or" in expr_str.lower() and isinstance(expr, pc.Expression): | |
| # Similar simplification | |
| left_expr = None # You'd need to extract this | |
| right_expr = None # You'd need to extract this | |
| return Or( | |
| convert_pyarrow_expression(left_expr), | |
| convert_pyarrow_expression(right_expr) | |
| ) | |
| elif "not" in expr_str.lower() and isinstance(expr, pc.Expression): | |
| # Similar simplification | |
| inner_expr = None # You'd need to extract this | |
| return Not(convert_pyarrow_expression(inner_expr)) | |
| if hasattr(expr, "arguments") and len(expr.arguments) == 2: | |
| left_expr, right_expr = expr.arguments | |
| return And( | |
| convert_pyarrow_expression(left_expr), | |
| convert_pyarrow_expression(right_expr) | |
| ) | |
| else: | |
| raise NotImplementedError("Parsing 'and' expressions requires two subexpressions.") | |
| elif "or" in expr_str.lower() and isinstance(expr, pc.Expression): | |
| if hasattr(expr, "arguments") and len(expr.arguments) == 2: | |
| left_expr, right_expr = expr.arguments | |
| return Or( | |
| convert_pyarrow_expression(left_expr), | |
| convert_pyarrow_expression(right_expr) | |
| ) | |
| else: | |
| raise NotImplementedError("Parsing 'or' expressions requires two subexpressions.") | |
| elif "not" in expr_str.lower() and isinstance(expr, pc.Expression): | |
| if hasattr(expr, "arguments") and len(expr.arguments) == 1: | |
| inner_expr = expr.arguments[0] | |
| return Not(convert_pyarrow_expression(inner_expr)) | |
| else: | |
| raise NotImplementedError("Parsing 'not' expressions requires one subexpression.") |
sankroh
left a comment
There was a problem hiding this comment.
Co-pilot has some points, but 👍
Tableimplementation with more better interface, etc.towerpackages