
feat(expr-ir): Improve lit repr #3641

Merged
dangotbanned merged 7 commits into oh-nodes from expr-ir/represent
May 19, 2026


@dangotbanned dangotbanned commented May 18, 2026

Description

Brief interlude before getting back to (#3497)

Note

Just trying to keep myself sane with something unimportant 😅

Before

Writing the docs (and doc tests) has made me very unhappy with whatever this was.

.when([(col('y')) == (lit(str: b))]).then(lit(int: 1)).otherwise(lit(null))

After

A little less noisy

.when([(col('y')) == (lit('b'))]).then(lit(1)).otherwise(lit(None))

Those were the smallest examples I could find.

These are the exhaustive tests

Hah, made you look 😉

@pytest.mark.parametrize(
    ("exprs", "expected"),
    [
        (
            [
                nwp.lit(1),
                nwp.lit(1, nw.UInt8),
                nwp.lit(1, nw.Int32),
                nwp.lit(None, nw.Int64),
            ],
            "[lit(1), lit[u8](1), lit[i32](1), lit[i64](None)]",
        ),
        (nwp.int_range(nwp.len()), "int_range([lit(0), len()])"),
        (nwp.int_range(0, 10), "int_range([lit(0), lit(10)])"),
        (
            [
                nwp.lit(1.479).alias("renamed"),
                nwp.lit(14.2, nw.Float32),
                nwp.lit(None, nw.Float64),
            ],
            "[lit(1.479).alias('renamed'), lit[f32](14.2), lit[f64](None)]",
        ),
        (
            (
                nwp.col("one"),
                nwp.lit(["two"]),
                nwp.lit(None, nw.List(nw.String)),
                nwp.lit([], nw.List(nw.String)),
                nwp.lit(["a", "b", "c", "d", "e"]),
                nwp.lit(["a", None, "c"]),
            ),
            "[col('one'), "
            "lit[list](['two']), "
            "lit[list[str]](None), "
            "lit[list[str]]([]), "
            "lit[list[str]]([...]), "
            "lit[list](['a', None, 'c'])]",
        ),
        (
            (
                nwp.lit(dt.date(2000, 1, 1)),
                nwp.lit(None, nw.Date),
                nwp.lit(dt.time(9, 30, 2, 9)),
                nwp.lit(None, nw.Time),
            ),
            "[lit[date]('2000-01-01'), "
            "lit[date](None), "
            "lit[time]('09:30:02.000009'), "
            "lit[time](None)]",
        ),
        (
            (
                nwp.lit(dt.datetime(2032, 8, 29, 14, 40, 26, 10)),
                nwp.lit(dt.datetime(2010, 8, 29), nw.Datetime("ns")),
                nwp.lit(dt.datetime(1986, 1, 1, 1, 1, 1), nw.Datetime(time_zone="UTC")),
            ),
            "[lit[datetime]('2032-08-29T14:40:26.000010'), "
            "lit[datetime[ns]]('2010-08-29T00:00:00'), "
            "lit[datetime[us, UTC]]('1986-01-01T01:01:01')]",
        ),
        (
            [
                nwp.lit(dt.timedelta(12)),
                nwp.lit(dt.timedelta(5, 1, 5)),
                nwp.lit(dt.timedelta(99), nw.Duration("ms")),
                nwp.lit(dt.timedelta()),
                nwp.lit(dt.timedelta(seconds=123)),
                nwp.lit(dt.timedelta(seconds=456, microseconds=789)),
                nwp.lit(None, nw.Duration("s")),
                nwp.lit(None, nw.Duration),
            ],
            "[lit[duration]('12d'), "
            "lit[duration]('5d 1s 5us'), "
            "lit[duration[ms]]('99d'), "
            "lit[duration]('0'), "
            "lit[duration]('123s'), "
            "lit[duration]('456s 789us'), "
            "lit[duration[s]](None), "
            "lit[duration](None)]",
        ),
        (
            [
                nwp.lit(decimal.Decimal("0.37392")),
                nwp.lit(decimal.Decimal("0.37392"), nw.Decimal(5)),
                nwp.lit(decimal.Decimal("0.37392"), nw.Decimal(5, 1)),
                nwp.lit(None, nw.Decimal),
                nwp.lit(None, nw.Decimal(4, 2)),
            ],
            "[lit[decimal]('0.37392'), "
            "lit[decimal[5,0]]('0.37392'), "
            "lit[decimal[5,1]]('0.37392'), "
            "lit[decimal](None), "
            "lit[decimal[4,2]](None)]",
        ),
        (
            [
                nwp.lit("abcdef"),
                nwp.lit(b"abcdef"),
                nwp.lit(None, nw.String),
                nwp.lit(None, nw.Binary),
            ],
            "[lit('abcdef'), lit(b'abcdef'), lit[str](None), lit[binary](None)]",
        ),
        (
            [nwp.lit(True), nwp.lit(False), nwp.lit(None, nw.Boolean)],
            "[lit(True), lit(False), lit[bool](None)]",
        ),
        (
            [nwp.lit("a", nw.Categorical), nwp.lit(None, nw.Categorical)],
            "[lit[cat]('a'), lit[cat](None)]",
        ),
        (
            [
                nwp.lit("a", nw.Enum(["a", "b"])),
                nwp.lit("a", nw_v1.Enum()),
                nwp.lit(None, nw.Enum(["a", "b"])),
                nwp.lit(None, nw_v1.Enum()),
            ],
            "[lit[enum]('a'), lit[enum]('a'), lit[enum](None), lit[enum](None)]",
        ),
        (
            [
                nwp.lit(
                    {"hello": None, "there": 99},
                    nw.Struct({"hello": nw.Array(nw.Int64, 1), "there": nw.UInt128}),
                ),
                nwp.lit({"a": 1, "b": 5, "c": 10}),
                nwp.lit(None, nw.Struct({"a": nw.Boolean})),
                nwp.lit({}, nw.Struct([])),
            ],
            "[lit[struct[2]]({'hello': None, 'there': 99}), "
            "lit[struct[3]]({'a': 1, 'b': 5, 'c': 10}), "
            "lit[struct[1]](None), "
            "lit[struct[0]]({})]",
        ),
        (
            [
                nwp.lit([1, 2, 3], nw.Array(nw.Int64, 3)),
                nwp.lit([1, None], nw.Array(nw.Int32, 2)),
                nwp.lit(["hi"], nw.Array(nw.String, 1)),
                nwp.lit([None, None, None, None], nw.Array(nw.Float32, 4)),
                nwp.lit((1, 2, 3, 4, 5), nw.Array(nw.UInt8, 5)),
            ],
            "[lit[array[i64, 3]]([1, 2, 3]), "
            "lit[array[i32, 2]]([1, None]), "
            "lit[array[str, 1]](['hi']), "
            "lit[array[f32, 4]]([None, None, None, None]), "
            "lit[array[u8, 5]]([...])]",
        ),
    ],
)
def test_lit(exprs: nwp.Expr | Sequence[nwp.Expr], expected: LiteralString) -> None:
    # NOTE: Checking both how `lit` looks like in isolation, and when appearing inside/alongside other expressions
    # The shape of the test code is intended to make the visual comparison in the `parametrize` cases easier to read
    if isinstance(exprs, nwp.Expr):
        exprs = (exprs,)
    if len(exprs) == 1:
        assert_expr_ir_equal(exprs[0], expected)
    else:
        actual = "[" + (", ".join(repr(e._ir) for e in exprs)) + "]"
        assert actual == expected


def test_lit_series(series: Series) -> None:
    # NOTE: Don't make this parametric, the idea is to see the full string
    if series.is_polars():
        expected = "lit(Series[pl.Series])"
    elif series.is_pyarrow():
        expected = "lit(Series[pa.ChunkedArray])"
    else:
        raise NotImplementedError(series.identifier)
    assert_expr_ir_equal(nwp.lit(series([True, False, True])), expected)


def test_lit_object() -> None:
    class What:
        def __repr__(self) -> str:
            return "12345"

    obj = What()
    expr = nwp.lit(obj, nw.Object)
    assert_expr_ir_equal(expr, "lit[object](12345)")
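
For readers skimming the duration cases above, here is a minimal sketch of one way to produce strings like `'5d 1s 5us'` from a `datetime.timedelta`. It is not the code from this PR: `format_duration` is a hypothetical helper name, and the sketch only assumes what the expected strings in the tests show (non-zero days, seconds and microseconds joined by spaces, and `'0'` for an empty duration). Negative durations are not considered.

```python
import datetime as dt


def format_duration(td: dt.timedelta) -> str:
    """Hypothetical helper: join the non-zero parts of a timedelta with spaces."""
    # `timedelta` already normalizes itself into (days, seconds, microseconds),
    # so no further arithmetic is needed for non-negative values.
    components = ((td.days, "d"), (td.seconds, "s"), (td.microseconds, "us"))
    parts = [f"{value}{unit}" for value, unit in components if value]
    # An empty duration has no non-zero parts; fall back to "0".
    return " ".join(parts) or "0"


print(format_duration(dt.timedelta(5, 1, 5)))
print(format_duration(dt.timedelta()))
```

Keeping seconds whole (rather than splitting into hours and minutes) matches the `'123s'` case in the parametrized test.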

What's changed?

  • Worked backwards from what pl.lit(...) does
    • Mainly using the same value repr, but wrapping it instead of things like "im a string"
  • The "short dtype repr code" is displayed in situations that are ambiguous
    • If the value repr is a string, but the type is not String
      • E.g. Enum, Categorical, Decimal, Temporal
    • If the type was provided explicitly and cannot be inferred
      • Numeric (excluding Int64, Float64)
      • Anything else that isn't Boolean, String or Binary
    • The value is None, but there is an explicit dtype
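
The rules above can be sketched roughly as follows. This is an illustration only, not the implementation in this PR: `lit_repr` and `INFERRED_DEFAULTS` are hypothetical names, dtypes are stood in for by their short codes as plain strings, and only simple Python scalars are covered (temporal, decimal and nested values are left out).

```python
from __future__ import annotations

# Hypothetical table: the short dtype code each Python scalar type would be
# inferred as, so that showing the code would add no information.
INFERRED_DEFAULTS = {int: "i64", float: "f64", bool: "bool", str: "str", bytes: "binary"}


def lit_repr(value: object, dtype: str | None = None) -> str:
    """Render `lit(...)`, adding the short dtype code only when it is informative."""
    if dtype is None:
        # No dtype known: nothing to disambiguate.
        return f"lit({value!r})"
    if value is None:
        # A null with an explicit dtype always shows the dtype.
        return f"lit[{dtype}](None)"
    if INFERRED_DEFAULTS.get(type(value)) == dtype:
        # The dtype is what would have been inferred anyway, so hide it.
        return f"lit({value!r})"
    # Otherwise the value repr alone is ambiguous (e.g. a string that is a Categorical).
    return f"lit[{dtype}]({value!r})"


print(lit_repr(1), lit_repr(1, "u8"), lit_repr("a", "cat"), lit_repr(None, "i64"))
```

Using `type(value)` rather than `isinstance` keeps `True` from being treated as an `int`, since `bool` is a subclass of `int` in Python.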

Related issues

- Child of #2572
- Been bugging me for a while now
- Gets *closer* to what `pl.lit` has
  - But always displays `lit(...)`
  - And displays `dtype` when ambiguous
```
Name                                     Stmts   Miss Branch BrPart  Cover   Missing
------------------------------------------------------------------------------------
narwhals\_plan\expressions\boolean.py       41      0      4      1    98%   45->46
narwhals\_plan\expressions\temporal.py      65      0      4      1    99%   81->82
```
@dangotbanned dangotbanned added the enhancement (New feature or request) and internal labels May 18, 2026
Comment on lines +45 to +46
class AllHorizontal(_HorizontalBoolean): ...
class AnyHorizontal(_HorizontalBoolean): ...

Somehow this was causing coverage to think these lines were uncovered? 😂

@dangotbanned dangotbanned marked this pull request as ready for review May 18, 2026 22:02
@dangotbanned dangotbanned merged commit 0af706d into oh-nodes May 19, 2026
40 of 41 checks passed
@dangotbanned dangotbanned deleted the expr-ir/represent branch May 19, 2026 10:02

Labels

enhancement New feature or request internal

1 participant