feat(RFC): A richer Expr IR #2572
- Mentioned in (#2391 (comment)) - Needed again for #2572
at the moment it looks like this adds a self-standing
* chore(typing): Add `_typing_compat.py`
  - Mentioned in (#2391 (comment))
  - Needed again for #2572
* refactor: Reuse `TypeVar` import
* refactor: Reuse `@deprecated` import
* refactor: Reuse `Protocol38` import
* docs: Add module-level docstring
Still need:
- reprs
- fix the hierarchy issue (#2572 (comment))
- Flag summing (#2572 (comment))
- 1 step closer to the understanding for (#2572 (comment))
- There's still some magic going on when `polars` serializes
- Need to track down where `'collect_groups': 'ElementWise'` and `'collect_groups': 'GroupWise'` first appear
- Seems like the flags get reduced
Thanks for peeking @MarcoGorelli
That is definitely the eventual goal! 🤞 Despite how quickly things have progressed, I still feel I'm a few steps behind being ready for that just yet.
General overview
I'm trying to focus on modeling these structures and how they interact:
My thought was that So like what I have in Current
This comment was marked as resolved.
Can't tell if this means `FirstT` will match the entry `firstt`, but preserve the `firstt` fix (https://github.com/codespell-project/codespell#ignoring-words) (#2572 (comment))
I should've expected this, but it was a nice surprise to find we get hashable selectors for free 😄
from narwhals._plan import selectors as ndcs
>>> ndcs.matches("[^z]a")._ir == ndcs.matches("[^z]a")._ir
True
>>> ndcs.matches("[^z]a")._ir == ndcs.matches("abc")._ir
False
@MarcoGorelli regarding (#2291)
from narwhals._plan import selectors as ndcs
>>> ndcs.all()._ir == ndcs.all()._ir
True
lhs = ndcs.all()
rhs = ndcs.all().mean()
>>> lhs._ir == rhs._ir
False
>>> lhs._ir == rhs._ir.expr
True
And the same holds for the non-selectors
from narwhals._plan import demo as nwd
lhs = nwd.all()
rhs = nwd.all().mean()
>>> lhs._ir == rhs._ir
False
>>> lhs._ir == rhs._ir.expr
True
>>> type(rhs._ir)
narwhals._plan.aggregation.Mean
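The behaviour shown above can be reproduced in miniature. This is a minimal sketch, assuming the IR nodes are immutable and compare by their fields. The classes `All`, `Matches` and `Mean` below are hypothetical stand-ins built on frozen dataclasses, not the classes in `narwhals._plan`, which may get hashing and equality some other way.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical stand-ins for IR nodes, NOT the real `narwhals._plan` classes.
# `frozen=True` (with the default `eq=True`) generates both `__eq__` and
# `__hash__` from the fields, so nodes compare structurally and are hashable.
@dataclass(frozen=True)
class All:
    """Selects every column."""


@dataclass(frozen=True)
class Matches:
    pattern: str


@dataclass(frozen=True)
class Mean:
    expr: All | Matches  # the node being aggregated


lhs = All()
rhs = Mean(expr=All())

print(lhs == rhs)       # False: different node types
print(lhs == rhs.expr)  # True: the aggregation wraps an equal node

# Hashable nodes can be deduplicated in a set or used as dict keys.
unique = {Matches("[^z]a"), Matches("[^z]a"), Matches("abc")}
print(len(unique))      # 2
```

Because equality is derived from the fields, a wrapping node such as `Mean` never equals the node it wraps, while its `expr` attribute does.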
An experiment towards (#2572 (comment))
def test_valid_windows() -> None:
    """Was planning to test this matched, but we seem to allow elementwise horizontal?

    https://github.com/narwhals-dev/narwhals/blob/63c8e4771a1df4e0bfeea5559c303a4a447d5cc2/tests/expression_parsing_test.py#L10-L45
    """
    ELEMENTWISE_ERR = re.compile(r"cannot use.+over.+elementwise", re.IGNORECASE)  # noqa: N806
    a = nwd.col("a")
    assert a.cum_sum()
    assert a.cum_sum().over(order_by="id")
    with pytest.raises(InvalidOperationError, match=ELEMENTWISE_ERR):
        assert a.cum_sum().abs().over(order_by="id")

    assert (a.cum_sum() + 1).over(order_by="id")
    assert a.cum_sum().cum_sum().over(order_by="id")
    assert a.cum_sum().cum_sum()
    assert nwd.sum_horizontal(a, a.cum_sum())
    with pytest.raises(InvalidOperationError, match=ELEMENTWISE_ERR):
        assert nwd.sum_horizontal(a, a.cum_sum()).over(order_by="a")

    assert nwd.sum_horizontal(a, a.cum_sum().over(order_by="i"))
    assert nwd.sum_horizontal(a.diff(), a.cum_sum().over(order_by="i"))
    with pytest.raises(InvalidOperationError, match=ELEMENTWISE_ERR):
        assert nwd.sum_horizontal(a.diff(), a.cum_sum()).over(order_by="i")

    with pytest.raises(InvalidOperationError, match=ELEMENTWISE_ERR):
        assert nwd.sum_horizontal(a.diff().abs(), a.cum_sum()).over(order_by="i")
@MarcoGorelli quick question
This is adapted from an existing test:
tests.expression_parsing_test.test_window_kind
narwhals/tests/expression_parsing_test.py
Lines 10 to 45 in 63c8e47
AFAICT, all of the expressions I've needed an InvalidOperationError for shouldn't be valid.
But they aren't raising in current narwhals 🤔
1
import narwhals as nw
a = nw.col("a")
a.cum_sum().abs().over(order_by="id")
This error explicitly mentions abs
narwhals/narwhals/_expression_parsing.py
Lines 357 to 362 in 9bd10ad
2, 3, 4
These are all raising the same as (1), but the issue seems to be that horizontal functions aren't being treated as elementwise
import narwhals as nw
a = nw.col("a")
nw.sum_horizontal(a, a.cum_sum()).over(order_by="a")
nw.sum_horizontal(a.diff(), a.cum_sum()).over(order_by="i")
nw.sum_horizontal(a.diff().abs(), a.cum_sum()).over(order_by="i")
In polars, they all seem to be elementwise but with an additional flag
I've done the same in this PR, but I don't think that flag would factor into this?
narwhals/narwhals/_plan/functions.py
Lines 291 to 299 in 9bd10ad
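One way to read the rule these cases seem to encode: `over(order_by=...)` is rejected when the outermost node is a function flagged as elementwise, and a horizontal function is elementwise plus one extra flag. The sketch below is an assumption-laden illustration of that reading only. `Flags`, `Col`, `Function`, `Over`, `over` and the local `InvalidOperationError` are all hypothetical names, not the narwhals or polars implementation, and any non-function node is simply let through.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from enum import Flag, auto

ELEMENTWISE_ERR = re.compile(r"cannot use.+over.+elementwise", re.IGNORECASE)


class InvalidOperationError(Exception):
    """Local stand-in for the narwhals exception of the same name."""


class Flags(Flag):  # hypothetical flag names
    ELEMENTWISE = auto()
    HORIZONTAL = auto()  # the "additional flag" carried by horizontal functions
    LENGTH_PRESERVING = auto()


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Function:
    name: str
    inputs: tuple[object, ...]
    flags: Flags


@dataclass(frozen=True)
class Over:
    expr: object
    order_by: str


def over(expr: object, order_by: str) -> Over:
    """Wrap `expr` in a window, rejecting an elementwise outermost function."""
    # Only the outermost node is inspected: the extra HORIZONTAL flag does not
    # change the outcome, because ELEMENTWISE is still set.
    if isinstance(expr, Function) and Flags.ELEMENTWISE in expr.flags:
        msg = f"Cannot use `over` on elementwise function {expr.name!r}"
        raise InvalidOperationError(msg)
    return Over(expr, order_by)


a = Col("a")
cum_sum = Function("cum_sum", (a,), Flags.LENGTH_PRESERVING)
abs_ = Function("abs", (cum_sum,), Flags.ELEMENTWISE)
horizontal = Function(
    "sum_horizontal", (a, cum_sum), Flags.ELEMENTWISE | Flags.HORIZONTAL
)

over(cum_sum, order_by="id")  # accepted: outermost function is not elementwise
for rejected in (abs_, horizontal):
    try:
        over(rejected, order_by="id")
    except InvalidOperationError as exc:
        assert ELEMENTWISE_ERR.search(str(exc))
```

Under this reading, cases 2 to 4 fail for the same reason as case 1, since the horizontal flag is never consulted by the check.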
* fix: Align `BinarySelector` repr w/ polars Thought it looked a bit weird in a doctest Turns out they use `()` in `Expr::BinaryExpr`, but never in `Selector::*` - https://github.com/pola-rs/polars/blob/7fc9f1875714fe9893c4d849b9593c1e4db1e854/crates/polars-plan/src/dsl/format.rs#L87 - https://github.com/pola-rs/polars/blob/7fc9f1875714fe9893c4d849b9593c1e4db1e854/crates/polars-plan/src/dsl/selector.rs#L641-L644 * docs: Explain `SelectorIR.to_dtype_selector` Towards #3497 * test: Fix `issubclass` coverage Not sure why this only came up recently * refactor: Rename `_matches` -> `_matches_dtype` * feat: Make `Empty` a concrete selector * perf: Add `SelectorIR.invert` simplification Couldn't get coverage for `AllDType`, `EmptyDType` * chore: give up on flaky cov * docs: Explain `SelectorIR.matches` Towards #3497 * chore(typing): Align `Series.sum` return with new polars pola-rs/polars#26629 * refactor: `iter_expand_names` -> `iter_expand_selector` Documenting this is one of the last selectors parts in #3497 May as well pick the name first * refactor: Simplify `expand_selectors` + friends * docs: Explain `SelectorIR.iter_expand_selector` Towards #3497 Adapted from https://docs.pola.rs/api/python/stable/reference/selectors.html#polars.selectors.expand_selector * feat(typing): Accept `Mapping[str, DType]` in `iter_expand_selector` * perf: Cache imports from `into_version` + finish the partial API + use it everywhere * docs: Align `BinaryExpr` with `BinarySelector` * refactor: Move `iter_output_name` from `RootSelector` -> `ByName` Wasn't possible in the (earlier) ADT version * docs: Explain `Column`, `All`, `ByName`, `ByIndex` Towards #3497 Highlights how this is based on the updated `polars` internals (pola-rs/polars#23351) * docs: Use "Arguments" some more Towards #3497 - pylance added support recently (can't find when) for the text showing in both `__init__` and on attribute access - there's still some larger docs I wanna keep on the attributes *for now* * docs: 
Explain `SelectorIR` Towards #3497 getting there indeed * chore: Mention known selectors gaps The time it would take to add tests is the only thing blocking these * chore: Address exception todos * test: Prepare for new combination expansion - Planning to partially revert (#3029 (comment)) - I made the wrong call on `when` - Still prefer the deviation for the other nodes * revert: "disallow multi-output in when (for now)" (b96dfd7) * feat: Support combination expansion in `when` Related: (90def5f), (8303f70) - Happy with it feature-wise - Implementation + docs need more polish * refactor: 2nd pass on `iter_expand_by_combination` * perf: Add fastpath for single many combination - Avoids the double zipping - Covers the only valid expansion on `main` - + allows the expansion on a leaf * test: Extra coverage for `ExprTraverser.names` cache - Per-class (`{Binary,Ternary}Expr`), a cache hit can come from any instance - This triggers another `(1, M, M)` case * docs: Explain `has_multiple_outputs` behavior See pola-rs/polars#23708 * refactor: Move, rename `seen_multi` * abandon indices idea added too much complexity for some that avoids a 2-3 string list * refactor: Make combination error self-documenting + display expansion sizes in the intuitive order * refactor: 3rd pass on `iter_expand_by_combination` Muuuuuch easier to read now * refactor: Don't use a set, when it can only have 1 member * docs: Start explaining combination expansion Towards #3497 Related to #3029 (comment) * refactor: Remove the dedicated `FillNan` + support in arrow - No need for it now `when` accepts selectors in any position -The impl is identical in - https://github.com/narwhals-dev/narwhals/blob/ca85e68dccbbba915d2f6c54483d48521ff91d3a/narwhals/_plan/arrow/functions/_multiplex.py#L186-L194 - Can still use that path for `ArrowSeries` * chore: Make `by_{name,index}` reprs less noisy - Defaults are omitted from repr - polars does this in a lot of places - I think it makes a lot of sense here since 
these are created *mostly* indirectly - `__str__` still shows them in full * fix(typing): Ensure `ExprNode` docstrings are visible - Noticed while trying to write docs in `ExprTraverser` - Quite a tricky problem to solve - The union of concrete classes produced multiple signatures - Landed on last solution because `SingleExpr.is_scalar` wasn't in the protocol - It didn't need to be there - New typing narrows just fine * chore: More planning expansion docs Towards #3497 Ideally there will be some (contextually relevant) bits sprinkled in all over * docs: Example for `ExprIR.iter_expand` Towards #3497 * docs: Explain `Expander.iter_expand_expressions` Towards #3497 * refactor: Remove `Expander.inner` Farewell my short lived friend * refactor: Move `iter_output_name` root semantics to `ExprTraverser` Makes selectors + renaming the special-cases, (rather than root nodes) * chore: Add note on `RollingExpr` removal * chore: De-prioritize `FunctionExpr` integration Experimented a bit, but was becoming a time-sink * docs: Explain expansion in `ExprNode` Towards #3497 Renamed methods after finally finding something that describes the relationship * docs: Explain `ExprTraverser.iter_expand` Towards #3497 Gonna save examples for `ExprIR.iter_expand` * docs: "leaf" -> "branch" * docs: Explain `ExprIR.iter_expand` Towards #3497 * chore: Temp improve `Expr` repr Remembered this lil idea #3213 (comment) * fix: Avoid creating binary selectors in `fill_nan` Quite a goof there! I missed that my test added `as_expr` on the selector case. 
Thrown in more selectors to be sure * docs: Explain `ExprTraverser.iter_expand_by_combination` Towards #3497 * chore: Add `IsScalar` elementwise note Need to get this done, but not just yet * refactor: Skip passing empty `ignored` for selector-only expansion Only needed when coming from `prepare_projection` - which this path never does * chore(typing): Widen `prepare_projection` from `Sequence` * docs: Nit `parse_expand_selectors` That detail was more important when collection logic wasn;t inside `Expander` * chore: Move and explain `expressions_to_schema` Related to #3497 * docs: Polish `prepare_projection`, `expand_selectors` Towards #3497 * refactor: `expressions_to_schema` -> `FrozenSchema.select_resolved` 2 birds * refactor: Tighten up `_expansion` API boundaries * docs: Explain `Expander` Towards #3497 * test: Shrink some boilerplate * fix: Reject non-length-preserving in `sort_by` - Adds `ExprIR.is_length_preserving` - Can integrate it more closely as a follow-up * fix: Accept length-preserving, non-elementwise in binary expressions `_is_filtration` is still incomplete, but this is correct for `FunctionExpr` now at least * fix: Don't mark `over` as non-length-preserving Not sure why `polars` does that, but it doesn't reject the same expression * fix: Reject length-changing in binary expressions I'm sure there's a reason I'm missing, but currently baffled by `changes_length`, `is_length_preserving`, `is_scalar` * perf: Simplify `FunctionExpr.changes_length` - Previously - (per-instance) had a worst-case of 2x `Flag.__contains__` + `not` - Now - the target (`_CHANGES_LENGTH`) is evaluated as a global - (per-instance) is a single `frozenset.__contains__`, which is cheaper than anything for `Flag` * chore: Explain weirdness in `FunctionFlags.__str__` * refactor: Split out `function_expr.py` - Prep for documenting in #3497 - Also scopes the blanket `[misc]` ignore, so we still get that reported in `expr.py` * refactor: Remove `RollingExpr` - It wasn't consistent 
with `polars` - Uses that name to represent `Expr.rolling` - I added it quite early (before dispatch) - `CumAgg` works fine without `CumExpr` - The other `FunctionExpr`s have more motivation than grouping * refactor: Deduplicate range validation * chore: Add `{Function,FunctionFlags}.is_length_preserving` * docs(DRAFT): Add some `FunctionExpr` basics - (Eventually) towards #3497 - Need to tidy up the validation logic first * feat: Add arity concept to `Function` - So far this just cleans things up - Fully utilizing it for expression dispatch should shrink things a lot - And be reusable across backends without subclassing * docs: Explain `Parameters` Towards #3497 * chore: todos * feat(DRAFT): Use `Parameters` for dispatch - (Mainly) performing some associated type *black magic* - Did a few replacements for coverage - Expect a lot more to change * refactor: Limit dependencies on `.version` * chore(typing): Use `HorizontalExpr` in compliant * chore(typing): `FrameT_contra -> `FrameT` Can be updated in other places, overall dislike this lint Doesn't compose well with non-single-letter variables * feat: Add `FunctionExpr.dispatch_args` Builds on the typing from (a09be54) Avoids the need for navigating through `parameters` and then passing `node` back in * feat(typing): Restricted `dispatch_arg` for `Unary` `mypy` seems to not understand the rest, but does handle the negative part of this * chore(typing): bump `mypy==1.20.1`, still no fancy fix 😭 Haven't figured out a solution yet, currently got 27 errors * todo * fix(typing): Less broken mypy * fix: Handle multi-inheritance of `DispatchOptions` Found out it was broken when trying a fix for (c4bcea0) * refactor: Add `Function` subclasses for arity This should be much easier from a typing perspective Unexpected bonus was some safety for `Expr._with_unary` * fix(typing): Help mypy with `FunctionExpr[UnaryFunction]` * refactor: Replace associated types with overloads Well, I had to try at least * feat(pyarrow): Add `unary` 
factory Need to finish porting over `_unary_function` * chore(pyarrow): Transition more to `unary` * refactor: Skip straight to `node.function` * ci(ruff): ignore some more names forgot to push this a while ago oops * chore: remove todo * chore(DRAFT): Add an accessor version of `unary` Featurewise, this is pretty close Left a lot of stuff to deduplicate * refactor(typing): Generalize from `UnaryFunction` -> `ExprIR` * give up * fix: stop requiring keyword-support in `CompliantExpr` - This allows the use of `Callable` instead of callback protocols - `Callable` has more special-casing in type checkers - It does not restrict impls to use positional-only - Just defines that they will be passed them by position - Which is fine, they're three parameters of different types * refactor: Remove some noise from signatures * fix: Remove some more required keyword support * chore: More `FunctionExpr` docs prep Towards #3497 * refactor: More `unary` usage * refactor(DRAFT): Rethink `dispatch`, `version` The dispatching part is done Still got a ways to go on factoring out `version` as state * refactor: Make `version` a classvar for compliant Part 2 of (e118036) Huge diff as I got close to fixing the variance issue, but not quite * chore: misc cleanup * feat(typing): More covariance * chore: remove `version` * feat: More progress on versions packages * chore: Add `Compliant{Expr,Scalar}.native` christ that took long to fix the typing * docs: Show type parameters on hover Lots of stuff to fix, need more visibility of the issues * refactor: Avoid referencing `CompliantSeries` Towards getting covariance in more places * fix(typing): Support defaults for `Scalar` Had to move the `len` definition out, since `Expr.from_python` isn't defined * chore: cleanup some experiments * tidier * docs: Remove outdated docs * refactor: Use native types in `io` protocols * woops, forgot that * feat(DRAFT): Everything is a plugin * test: commit that already * chore: typos * chore: imports lint * more 
typos * chore: not banned anymore * ci: pin `pyarrow<24` Related #3560, #3561 * feat: Implement plugin discovery Continued from (f82f351) * chore: cov * go away * fix(typing): Include `PolarsPlugin` in typing * feat(typing): Add `load_plugin` overloads and test them * rename, test, explain: `Plugin` dependency guards Resolves the overlap with `EntryPoint.load`, which is about the plugin import itself - `is_loaded` -> `is_imported` - `is_available` -> `can_import` * feat: More `_entry_points` validation - `load_plugin` is only a temporary api - need to share the validation with some alternative while I build them out * feat(typing): Add versioning to `__narwhals_classes__` * feat: Add `can_{eager,lazy}` * refactor: `unsupported_backend_operation_error` -> `unsupported_error` * explore replacing `LazyFrame.collect` w/ plugins incredibly rough, spent way too long trying to get mypy to work but no dice * prep to remove `PluginAny` * chore: move `sys_modules_targets` to `Plugin * fix(typing): Replace `PluginAny` I made everything covariant, but then expected it to work contravariantly * chore: remove `reveal_type` * fix(typing): Avoid 1 `[var-annotated]` Better than nothing * refactor: Remove housekeeping from `compliant.plugins` * refactor(typing): now those names are available * feat: Integrate `_narwhals_classes__` versioning Happy enough with this as a proof-of-concept Need to start wrapping this up in some classes * refactor: `Plugin.plugin_name` -> `Plugin.name` * refactor: `sys_modules_targets` -> `requirements` * feat(DRAFT): Add a plugin manager Nothing too fancy, just avoiding repeating some work and having consistent errors * chore: Use tuples for `__all__` can fold them and see on hover now * chore: Expose `load_plugin` * refactor: Move a bunch of types to `typing` * fix(typing): Preserve class versions in most cases Eventually need to do this with less complexity * farewell `TemporaryPluginsType` * chore: let's get repr'd * refactor: rename to 
`PluginManager` * test: cover `import_modules` * refactor: Make the impl of `can_import` use the cache * feat: Implelment `known`, `imported`, `importable` * sketch out `is_native_dataframe` Needs more work before replacing `translate.from_native_*` * add `SeriesV2` * just once will do * refactor: reuse `hasattrs_static` * fix(typing): More fiddling with guards * refactor: Move into a package * feat(DRAFT): Add a "parsed" plugin representation The big idea is to parse, don't validate * chore(typing): Close the `str` gap * parse into `PluginIR` on load * chore: easy repr * fix: Ensure both destructive paths parse plugins * feat: Populate plugin registry with accessors * feat: Add `PluginManager.dataframe` Quite happy with the ergonomics of this 😅 * refactor: Wrangle version strings * feat: Add `PluginManager.{lazyframe,series,evaluator}` Actually needed to change a lot more to fix the tests * refactor: Replace `_namespace.evaluator` with `PluginManager.evaluator` * chore: Remove `compliant.package` Worked well as an experiment and lead to something I quite like now * feat(typing): Add basic overloads on `PluginManager.<class-name>` * test: Cover `is_native_dataframe` * feat(typing): Add `PluginManager.plugin` overloads * chore: Move `import_classes` out of runtime and deprecate * refactor: Clean up accessors, rename * feat(DRAFT): Universal `@signeldispatch`???? * chore(typing): Reuse `compliant.typing` aliases * refactor(typing): `IntoBackendExt` -> `IntoPlugin` * refactor: `_backend_to_plugin_name` -> `_plugin_name` * refactor: Remove `require` Only had one use left since adding the registry * chore: Update notes * docs: Explain some of `PluginManager` ... 
and rename `_get_class` -> `_import_class` * feat: Use registry dispatch for `Series.from_native` * feat: Use registry dispatch for `DataFrame.from_native` * fix(typing): Remove default from `NativeFrameT_co` Was periodically causing these warnings: # `PluginManager.dataframe` (overload 1) > Could not specialize type "DataFrame[NativeDataFrameT_co@DataFrame, NativeSeriesT_co@DataFrame]" > "NativeFrame" is not assignable to "DataFrame" # `PluginManager.dataframe` (overload 2) > Could not specialize type "DataFrame[NativeDataFrameT_co@DataFrame, NativeSeriesT_co@DataFrame]" > "NativeFrame" is not assignable to "Table" * feat: Use registry dispatch for `LazyFrame.from_native` * fix: Make `backend`, `version` required in `ScanFile` * chore(typing): replace `typing_can_eager_lazy_integration` That's 1/2 done, then I can exorcise `import_classes` * chore: Remove everything `import_classes`-related * chore: move repr * fix: Pre-emptively avoid toctou * refactor: Make `from_native` less hacky * refactor: Simplify `from_iterable` * refactor: Simplify `from_dict` * refactor: Simplify `concat_series*` * refactor: Simplify `concat_df*` * refactor: Simplify eager IO * refactor: Simplify lazy IO * perf: Use `__slots__` across the entire `Compliant*`-level# Only 2 classes have issues to resolve (and 1 is named `Resolve` 😉) * fix: Remove unused `NativeDataFrameT_co` from `EagerNamespace` * refactor: Simplify eager range constructors * refactor: Remove `eager_implementation` * refactor: Remove `known_implementation` * refactor: Move `namespace` * refactor: Simplify `read_{csv,parquet}_schema` Annoying that these don't fit in with the rest of the model yet They really do need a seperate API though, since `pyarrow` can provide this natively for `pandas` as well * chore: Remove superseded `unsupported_error` * refactor: Rename `len` -> `len_star` at compliant level The duplicate name uses for `ns.len` and `expr.len` was keeping the need for `Namespace` * docs: Remove outdated 
`Lit.is_scalar` example * refactor: Actually rename `Column`, `Len` Left aliases behind, since `Column` is referenced literally everywhere * refactor: Make `lit_series` a classmethod * refactor: Make `len_star` a classmethod * refactor: Make `lit` a classmethod * refactor: Make `col` a classmethod * typo * chore: deprecate `namespaced` * chore(typing): Add typing to `constructor` binding * refactor: Move `concat_str` from namespace * refactor: Move `mean_horizontal` from namespace * refactor: Move all horizontal functions from namespace * refactor: remove `namespace` helper * refactor: Introduce `CompliantColumn` Quite relieved with all the overrides that are gone now * fix: CONTRAVARIANCE ONCE AGAIN!!!! * refactor: Move `int_range` from namespace * refactor: Move `date_range`, `linear_space` from namespace * chore: Remove more traces of namespace * chore(typing): Prep new dispatch typing * fix(typing): Avoid LSP visibility bug * chore: remove not planned * refactor: Remove `ExprIR.__init_subclass__(dispatch="no_dispatch")` Pretty deep into a refactor right now, but this part can go in early * refactor: Move `DispatchOptions` merge * perf: Zero-cost developer hints * chore: all constructors are final * remove `"no_dispatch"` there too * refactor: Remove `LiteralExpr` I'm adding a common base for `Lit`, `LitSeries`, `Col`, `LenStar` This was just in the way * stop being fancy and use classes! * chore: Update more refs to dispatch * refactor: just keep decoupling * refactor: move `pascal_to_snake_case` * refactor: Expression dispatch w/o `*Namespace` - Truly the most insane commit (apologies future reader) - Changes a very core assumption - that everywhere can access `__narwhals_namespace__` * chore: remove more namespace * chore: typos * revert: modin filter warning * surely not, right? maybe fixes: > TypeError: Parameters to generic types must be types. Got <narwhals._plan.translate.ParamSpec object at 0x7f72e4cf4dc0>. * well, how about this? 
> TypeError: type.__new__() takes exactly 3 arguments (0 given) * empty parametrize? > fixture 'lazyframe' not found > fixture 'lazy' not found * test: more empty fixtures plz * plz * pin pyright
`lit(Series)` was really bugging me with how it should be showing the backend
commit 9d015e2031acd5e10404da72a7d698c632446c49
Author: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Date: Sat May 16 09:51:39 2026 +0100
release: Bump version to 2.21.2 (#3630)
commit 8674e44f657110b45aedffb8c1343d5b0505bb01
Author: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Date: Sat May 16 09:37:56 2026 +0100
release: Bump version to 2.21.1 (#3628)
commit 739dc579adfd38a75764d6d2c09ed42b0f2e887a
Author: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Date: Sat May 16 09:35:33 2026 +0100
ci: remove `downstream_tests_slow` (#3629)
commit 1f36279bb07203628ddd18a691cf98d4c78780d3
Author: Marco Edward Gorelli <33491632+MarcoGorelli@users.noreply.github.com>
Date: Sat May 16 09:17:10 2026 +0100
Revert "fix: Allow `float('nan')` as value in join for duckdb (#3555)" (#3627)
This reverts commit 0d7f352.
commit 1e50d020e3ca5e5c5dfe3486e53190e87c60b66c
Author: Pedro <pedro.villanueva@booking.com>
Date: Fri May 15 13:05:01 2026 +0200
[Enh]: Add the negation unary operator for expressions and series (#3625)
negation unary operator for expressions and series
commit 37ea7953f6b455736bf6ae353ca257f468b3b083
Author: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
Date: Wed May 13 10:38:42 2026 +0200
ci: Unpin "temporary" CI pins (#3618)
* ci: Unpin 'temporary' CI pins
* rollback formulaic and pointblank
commit 4ff3a1f545924e5e38e76c4d8a71de2010a2f4d6
Author: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
Date: Tue May 12 19:11:11 2026 +0200
chore: Prepare for future pandas inplace deprecation (#3616)
This reverts commit 5dee22d.
Accidentally untracked all of these lol
That move destroyed the history
for more information, see https://pre-commit.ci
Will close #2571
What type of PR is this? (check all applicable)
Related issues
`Expr` internal representation #2571
(sort:updated-desc "(expr-ir)/" in:title)
Checklist
If you have comments or can explain your changes, please do so below
Important
See (#2571) for detail!!!!!!!
Very open to feedback
Tasks
Show 2025 May-July
pl.Expr.metapl.Expr.meta)ExprIRmetamethods_typing_compatmodule #2578Merge another PR with (perf: Avoid module-levelimportlib.util.find_spec#2391 (comment)) firstTypeVardefaults moreTypeVar("T", bound=Thing, default=Thing), instead of an opaqueExprIRSelector(s)narwhals/narwhals/_plan/expr.py
Lines 336 to 337 in 0bada48
BinaryExprthat describes the restricted set of operators that are allowedpolars, since they wrappl.colinternallyIntoExprin more places (including and beyond whatnarwhalsallows now)demo.py*_horizontalconcat_str)dummy.pyover,sort_by)FunctionOptions+ friends see commentWhere does the{flags: ...}->{collect_groups: ..., flags: ...}expansion happen?polars>=1.3.0fixed the issue (see comment)Ternarywhen-then-otherwise🥳)Metais_*,has_*,output_namemeta methodsroot_namesundo_aliases,popNamename.py(Expr::KeepName,Expr::RenameAlias)polarswill help with themetamethodsCat,Struct,List(a3e29d1)String(72c33ce)DateTime(aee0a7e)_expression_parsing.pyrulesrustversion worksExpansionFlagsexpand_function_inputsrewrite_projectionsreplace_selectorexpand_selectorreplace_selector_innerreplace_and_add_to_resultsreplace_nthprepare_excludedexpand_columnsexpand_dtypesreplace_dtype_or_index_with_columndtypes_match(probably can solve w/ existingnarwhals)expand_indicesreplace_index_with_columnreplace_wildcardrewrite_special_aliasesreplace_wildcard_with_columnreplace_regexexpand_regexExprIR.map_irExprIR #2572 (comment))ExprIR.map_irfor most nodesWindowExpr.map_irFunctionExpr.map_irRollingExpr,AnonymousExprinheritselectorsExprIR(main) #3066ExprIR(main) #3066 (comment))_planpackage #3122group_by, utilizepyarrow.acero#3143{Expr,Series}.{first,last}#2528)protocols.py#3166order_by,hashjoin,DataFrame.{filter,join},Expr.is_{first,last}_distinct#3173__dict__appearing onImmutablesubclasses (thread)__slots__, and not__dict__too #3201ExpansionFlags.from_ir#3206ColumnNameOrSelectorcan be addedExpr.meta.serialize,Expr.deserializepc.Expressionover(*partition_by)#3224rank,with_row_index_by,over(*partition_by, order_by=...)#3295Selectoroverhaul #3233date_range, support{date,int}_range(eager=...)#3294ArrowExpr#3325dt.*,str.to_date(time)dt.*str.to_date(time)ExprIRwith in sync withnarwhalslist.*aggregate methods #3353list.sort#3359{Expr,Series}.any_value#3315overwith 
currentpolars#3352ArrowExprboilerplatearrow.functionsa package #3384ArrowListNamespace-> (ArrowExprListNamespace,ArrowScalarListNamespace)len,getListScalarEager{DataFrame,Namespace}when thenarwhals-level depends on themconcatcommentSelectors that do more than select #3376LogicalPlan🚀over(*partition_by)windows between expressions in the same contextmaincan do this for multi-output (by not expanding)LogicalPland3fc48a1a0b9291ae6bac21af27e1039e63ba3cf/python/cudf_polars/cudf_polars/dsl/ir.py)
DataTypepropagation and supertyping rulesconcat(..., how={"vertical_relaxed", "diagonal_relaxed"})#3386LogicalPlansnarwhalsbehavior, but likely more of an issue whenSelectors can be used in more placesoh-nodestopolarsinternalsExpr) features forLogicalPlanconcat#3371read_{csv,parquet}#3370DataFrame.write_{csv,parquet}#3369BaseFrameDataFrame.join_asof#3378BaseFrame.unpivot#3368BaseFrame.unnest#3414DataFrameDataFrame.pivot#3373DataFrame.unique#3364