Transform string-based expressions into Polars DataFrame operations. Write simple, SQL-like expressions and let the library convert them to optimized Polars code.
There is an interactive playground and function reference that runs the library in the browser through Pyodide. You can try expressions on sample data and see the generated Polars and FlowFrame code without installing anything.
The site lives in docs/; the function reference is generated from the docstrings with python generate_docs.py.
import polars as pl
from polars_expr_transformer import simple_function_to_expr
df = pl.DataFrame({
    'first_name': ['John', 'Jane', 'Bob'],
    'last_name': ['Doe', 'Smith', 'Johnson'],
    'age': [30, 25, 45],
    'salary': [50000, 60000, 75000]
})
# Concatenate columns
df.select(simple_function_to_expr('concat([first_name], " ", [last_name])').alias('full_name'))
# Conditional logic
df.select(simple_function_to_expr('if [age] > 30 then "Senior" else "Junior" endif').alias('level'))
# Math operations
df.select(simple_function_to_expr('[salary] * 1.1').alias('new_salary'))
# Combine multiple operations
df.select(simple_function_to_expr('uppercase(left([last_name], 3))').alias('code'))

pip install polars-expr-transformer

| Use Case | Recommendation |
|---|---|
| Building applications with user-defined transformations | ✅ Yes - Users can write expressions without Python knowledge |
| SQL/Tableau users transitioning to Polars | ✅ Yes - Familiar syntax |
| Need a simple expression language for configs | ✅ Yes - Easy to serialize and store |
| Writing performance-critical Polars code | ❌ No - Use Polars directly |
| Need all Polars features | ❌ No - This covers common operations only |
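
Because expressions are plain strings, they can be kept in JSON or any other text-based config. The sketch below round-trips a few of the expressions shown earlier through JSON. The config layout and the `derived_columns` key are assumptions made for illustration; the library does not define a config format.

```python
import json

# Hypothetical config layout: output column name -> expression string.
config = {
    "derived_columns": {
        "full_name": 'concat([first_name], " ", [last_name])',
        "level": 'if [age] > 30 then "Senior" else "Junior" endif',
        "new_salary": "[salary] * 1.1",
    }
}

# Serialize to text (for a file or a database field) and load it back.
serialized = json.dumps(config, indent=2)
restored = json.loads(serialized)

# The expression strings survive the round trip unchanged, quotes included.
for name, expression in restored["derived_columns"].items():
    print(f"{name}: {expression}")
```

Each restored string can then be passed to simple_function_to_expr and aliased to its key.
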
Reference DataFrame columns using square brackets:
'[column_name]' # Reference a column
'[Column With Spaces]' # Columns with spaces work too

Besides column references, you can write literal values directly. Five literal types are supported:

| Type | How to write it | Examples |
|---|---|---|
| String | Single or double quotes | "hello", 'world' |
| Integer | Bare whole numbers (negatives allowed) | 42, -7 |
| Float | Bare decimal numbers | 3.14, -0.5 |
| Boolean | true or false (case-insensitive) | true, False |
| Null | null (case-insensitive), the missing value | null |

'if [active] = true then "yes" else "no" endif' # boolean literal
'[price] * 1.1' # float literal
'coalesce([nickname], null)' # null literal

| Operator | Description | Example |
|---|---|---|
| + | Addition | [a] + [b] |
| - | Subtraction | [a] - 10 |
| * | Multiplication | [price] * [quantity] |
| / | Division | [total] / [count] |
| % | Modulo | [value] % 2 |
| = or == | Equals | [status] = "active" |
| != | Not equals | [type] != "deleted" |
| >, >=, <, <= | Comparisons | [age] >= 18 |
| and | Logical AND | [a] > 0 and [b] > 0 |
| or | Logical OR | [x] = 1 or [y] = 1 |

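
Since column references always use square brackets, the columns an expression depends on can be listed before it is evaluated, for example to check them against a DataFrame's columns. The sketch below is not the library's parser: `referenced_columns` and `missing_columns` are hypothetical helpers, and the regex assumes column names contain no square brackets and does not skip brackets that appear inside string literals.

```python
import re

# Matches [column name]; assumes column names contain no square brackets.
COLUMN_REF = re.compile(r"\[([^\[\]]+)\]")


def referenced_columns(expression: str) -> list[str]:
    """Return the distinct column names in an expression, in order of first use."""
    seen: dict[str, None] = {}
    for name in COLUMN_REF.findall(expression):
        seen.setdefault(name, None)
    return list(seen)


def missing_columns(expression: str, available: list[str]) -> list[str]:
    """Return referenced columns that are not present in `available`."""
    return [c for c in referenced_columns(expression) if c not in available]


print(referenced_columns("[price] * [quantity] - [price] * 0.1"))
print(missing_columns("[Column With Spaces] + [b]", ["b"]))
```
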
# Simple if-then-else
'if [age] >= 18 then "Adult" else "Minor" endif'
# Multiple conditions with elseif
'if [score] >= 90 then "A" elseif [score] >= 80 then "B" elseif [score] >= 70 then "C" else "F" endif'
# Nested conditions
'if [type] = "A" then (if [value] > 100 then "High A" else "Low A" endif) else "Other" endif'

# Single-line comments with //
'[column] + 1 // This adds one to the column'
# Multi-line expressions with comments
'''
[price] * [quantity] // Calculate subtotal
- [discount] // Apply discount
'''

String functions:

| Function | Description | Example |
|---|---|---|
| concat(a, b, ...) | Concatenate strings | concat([first], " ", [last]) |
| length(text) | String length | length([name]) |
| uppercase(text) | Convert to uppercase | uppercase([code]) |
| lowercase(text) | Convert to lowercase | lowercase([email]) |
| titlecase(text) | Convert to title case | titlecase([name]) |
| left(text, n) | First n characters | left([phone], 3) |
| right(text, n) | Last n characters | right([id], 4) |
| mid(text, start, len) | Substring from position | mid([code], 2, 3) |
| substring(text, start, len) | Alias for mid | substring([text], 0, 10) |
| trim(text) | Remove leading/trailing spaces | trim([input]) |
| left_trim(text) | Remove leading spaces | left_trim([text]) |
| right_trim(text) | Remove trailing spaces | right_trim([text]) |
| replace(text, find, replace) | Replace text | replace([name], ".", "") |
| find_position(text, search) | Find substring position | find_position([text], "@") |
| pad_left(text, len, char) | Pad string on left | pad_left([id], 5, "0") |
| pad_right(text, len, char) | Pad string on right | pad_right([code], 10, " ") |
| starts_with(text, prefix) | Check prefix | starts_with([url], "https") |
| ends_with(text, suffix) | Check suffix | ends_with([file], ".csv") |
| reverse(text) | Reverse string | reverse([text]) |
| repeat(text, n) | Repeat string n times | repeat("*", 5) |
| split(text, delimiter) | Split into list | split([tags], ",") |
| count_match(text, pattern) | Count occurrences | count_match([text], "a") |
| string_similarity(a, b, method) | Similarity score (0-1) | string_similarity([a], [b], "levenshtein") |
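
To give a feel for what a 0-1 similarity score based on Levenshtein distance can look like, here is a pure-Python sketch. The normalization used (one minus the edit distance divided by the longer string's length) is an assumption for illustration; the exact formula behind string_similarity is not stated here and may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))
print(round(similarity("kitten", "sitting"), 3))
```
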

Math functions:

| Function | Description | Example |
|---|---|---|
| abs(n) | Absolute value | abs([difference]) |
| round(n, decimals) | Round to decimals | round([price], 2) |
| ceil(n) | Round up | ceil([value]) |
| floor(n) | Round down | floor([value]) |
| power(base, exp) | Exponentiation | power([x], 2) |
| pow(base, exp) | Alias for power | pow(2, [n]) |
| sqrt(n) | Square root | sqrt([area]) |
| log(n) | Natural logarithm | log([value]) |
| log10(n) | Base-10 logarithm | log10([value]) |
| log2(n) | Base-2 logarithm | log2([value]) |
| exp(n) | e^n | exp([rate]) |
| mod(a, b) | Modulo | mod([value], 10) |
| sign(n) | Sign (-1, 0, 1) | sign([change]) |
| negation(n) | Negate value | negation([amount]) |
| sin(n), cos(n), tan(n) | Trigonometric | sin([angle]) |
| asin(n), acos(n), atan(n) | Inverse trig | asin([ratio]) |
| tanh(n) | Hyperbolic tangent | tanh([x]) |
| random_int(min, max) | Random integer | random_int(1, 100) |

Date and time functions:

| Function | Description | Example |
|---|---|---|
| now() | Current datetime | now() |
| today() | Current date | today() |
| year(date) | Extract year | year([created_at]) |
| month(date) | Extract month (1-12) | month([date]) |
| day(date) | Extract day (1-31) | day([date]) |
| hour(datetime) | Extract hour (0-23) | hour([timestamp]) |
| minute(datetime) | Extract minute | minute([time]) |
| second(datetime) | Extract second | second([time]) |
| week(date) | ISO week number (1-53) | week([date]) |
| weekday(date) | Day of week (1=Mon, 7=Sun) | weekday([date]) |
| dayofweek(date) | Alias for weekday | dayofweek([date]) |
| quarter(date) | Quarter (1-4) | quarter([date]) |
| dayofyear(date) | Day of year (1-366) | dayofyear([date]) |
| add_days(date, n) | Add days | add_days([start], 30) |
| add_weeks(date, n) | Add weeks | add_weeks([date], 2) |
| add_months(date, n) | Add months | add_months([date], 6) |
| add_years(date, n) | Add years | add_years([birth], 18) |
| add_hours(dt, n) | Add hours | add_hours([time], 3) |
| add_minutes(dt, n) | Add minutes | add_minutes([time], 30) |
| add_seconds(dt, n) | Add seconds | add_seconds([time], 60) |
| date_diff_days(a, b) | Days between dates | date_diff_days([end], [start]) |
| datetime_diff_seconds(a, b) | Seconds between | datetime_diff_seconds([a], [b]) |
| format_date(date, fmt) | Format as string | format_date([date], "%Y-%m-%d") |
| start_of_month(date) | First of month | start_of_month([date]) |
| end_of_month(date) | Last of month | end_of_month([date]) |
| date_truncate(date, unit) | Truncate to unit | date_truncate([dt], "1day") |
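
The numbering conventions in the table (ISO week numbers, Monday as day 1) can be surprising around year boundaries. The sketch below reproduces them with Python's standard library, assuming week and weekday follow ISO 8601 as the table indicates; `date_parts` is a hypothetical helper, not part of the library.

```python
from datetime import date


def date_parts(d: date) -> dict[str, int]:
    """Date parts under the conventions listed in the table."""
    return {
        "week": d.isocalendar()[1],         # ISO week number, 1-53
        "weekday": d.isoweekday(),          # 1 = Monday ... 7 = Sunday
        "quarter": (d.month - 1) // 3 + 1,  # 1-4
        "dayofyear": d.timetuple().tm_yday, # 1-366
    }


# 2024-01-01 is a Monday and falls in ISO week 1.
print(date_parts(date(2024, 1, 1)))
# 2023-01-01 is a Sunday and still belongs to ISO week 52 of the previous year.
print(date_parts(date(2023, 1, 1)))
```
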

Logic and comparison functions:

| Function | Description | Example |
|---|---|---|
| equals(a, b) | Check equality | equals([status], "active") |
| does_not_equal(a, b) | Check inequality | does_not_equal([type], "deleted") |
| is_empty(value) | Check if null | is_empty([email]) |
| is_not_empty(value) | Check if not null | is_not_empty([phone]) |
| coalesce(a, b, ...) | First non-null | coalesce([nickname], [name], "Unknown") |
| ifnull(value, default) | Replace null | ifnull([count], 0) |
| nvl(value, default) | Alias for ifnull | nvl([value], 0) |
| nullif(a, b) | Null if equal | nullif([value], 0) |
| between(val, min, max) | Range check (inclusive) | between([age], 18, 65) |
| greatest(a, b, ...) | Maximum value | greatest([a], [b], [c]) |
| least(a, b, ...) | Minimum value | least([price1], [price2]) |
| contains(text, search) | Contains substring | contains([desc], "sale") |
| _in(value, text) | Value in text | _in("admin", [roles]) |
| _not(value) | Logical NOT | _not([is_deleted]) |
| is_string(value) | Type check | is_string([field]) |
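
The null-handling functions follow the usual SQL meanings given in the table. As a mental model, the pure-Python sketch below applies those meanings to single values, with None standing in for null. It is a simplified model under that assumption, not the library's implementation, which works on whole columns.

```python
from typing import Any


def coalesce(*values: Any) -> Any:
    """First value that is not None, or None if every value is None."""
    return next((v for v in values if v is not None), None)


def ifnull(value: Any, default: Any) -> Any:
    """Replace None with a default; other values pass through, including 0 and ''."""
    return default if value is None else value


def nullif(a: Any, b: Any) -> Any:
    """Return None when the two values are equal, otherwise the first value."""
    return None if a == b else a


def between(value: Any, low: Any, high: Any) -> bool:
    """Inclusive range check."""
    return low <= value <= high


print(coalesce(None, "Bob", "Unknown"))  # second argument is the first non-null
print(ifnull(0, 5))                      # 0 is a real value, not null
print(nullif(0, 0))                      # equal values become null
print(between(18, 18, 65))               # bounds are included
```
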

Type conversion functions:

| Function | Description | Example |
|---|---|---|
| to_string(value) | Convert to string | to_string([id]) |
| to_integer(value) | Convert to integer | to_integer([count]) |
| to_float(value) | Convert to float | to_float([price]) |
| to_number(value) | Alias for to_float | to_number([value]) |
| to_boolean(value) | Convert to boolean | to_boolean([flag]) |
| to_date(text, format) | Parse date | to_date([date_str], "%Y-%m-%d") |
| to_datetime(text, format) | Parse datetime | to_datetime([ts], "%Y-%m-%d %H:%M:%S") |
| to_decimal(value, precision) | Convert with precision | to_decimal([amount], 2) |

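
The format strings in the to_date, to_datetime and format_date examples use strftime-style directives. The sketch below shows what the two formats from the table match, using Python's standard library as a stand-in. Polars parses formats with the Rust chrono crate, so this assumes only the common directives used here (%Y, %m, %d, %H, %M, %S), which mean the same thing in both.

```python
from datetime import date, datetime

DATE_FORMAT = "%Y-%m-%d"               # e.g. 2024-03-15
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # e.g. 2024-03-15 13:45:00

# Parsing: text -> date / datetime
parsed_date = datetime.strptime("2024-03-15", DATE_FORMAT).date()
parsed_datetime = datetime.strptime("2024-03-15 13:45:00", DATETIME_FORMAT)

# Formatting: datetime -> text, using the same directives
formatted = parsed_datetime.strftime(DATE_FORMAT)

print(parsed_date, parsed_datetime, formatted)
```
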
simple_function_to_expr converts a string expression to a Polars expression.
from polars_expr_transformer import simple_function_to_expr
expr = simple_function_to_expr('[price] * [quantity]')
df.select(expr.alias('total'))

build_func returns the intermediate function object for inspection/debugging.
from polars_expr_transformer import build_func
func = build_func('concat([a], [b])')
print(func.get_readable_pl_function()) # See the Polars translation

get_all_expressions returns a list of all available function names.
from polars_expr_transformer import get_all_expressions
functions = get_all_expressions()
print(functions) # ['concat', 'length', 'uppercase', ...]

get_expression_overview returns functions grouped by category with descriptions.
from polars_expr_transformer import get_expression_overview
for category in get_expression_overview():
    print(f"\n{category.category}:")
    for expr in category.expressions:
        print(f"  {expr.name}: {expr.description}")

The library validates expressions before parsing and raises ExpressionSyntaxError (a subclass of ValueError) with the exact position of the problem and a hint:
# Misspelled keyword
simple_function_to_expr('f [age] > 30 then "Senior" else "Junior" endif')
# ExpressionSyntaxError:
# Found 'then' at position 14, but there is no 'if' before it.
# f [age] > 30 then "Senior" else "Junior" endif
# ^
# Hint: Every condition starts with 'if': if <condition> then <value> else <value> endif.
# Check that 'if' is present and spelled correctly.
# Unbalanced parentheses
simple_function_to_expr('((1)')
# ExpressionSyntaxError:
# Unbalanced parentheses: '(' at position 1 is never closed.
# ((1)
# ^
# Hint: Add a matching ')'.
# Unknown function
simple_function_to_expr('unknown_func([col])')
# ExpressionSyntaxError: Expected a single value, but found 2. This usually means
# a function name is misspelled or unknown, or an operator is missing between two values.

Catch errors with except ExpressionSyntaxError (importable from the package root) or simply except ValueError.
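
To illustrate both the position-plus-caret style of message and why except ValueError also works, here is a self-contained sketch of a parenthesis check. `SketchSyntaxError`, `check_parentheses` and `_format` are hypothetical names, and this is not the library's validator; for brevity it does not skip parentheses inside string literals.

```python
class SketchSyntaxError(ValueError):
    """Hypothetical stand-in for a ValueError subclass such as ExpressionSyntaxError."""


def _format(expression: str, index: int, problem: str) -> str:
    """Build a message with a 1-based position and a caret under the offending character."""
    caret = " " * index + "^"
    return f"{problem} (position {index + 1})\n  {expression}\n  {caret}"


def check_parentheses(expression: str) -> None:
    """Raise SketchSyntaxError if parentheses are unbalanced."""
    open_positions: list[int] = []  # 0-based indexes of '(' not yet closed
    for index, char in enumerate(expression):
        if char == "(":
            open_positions.append(index)
        elif char == ")":
            if not open_positions:
                raise SketchSyntaxError(_format(expression, index, "')' has no matching '('"))
            open_positions.pop()
    if open_positions:
        # Report the outermost parenthesis that was never closed.
        raise SketchSyntaxError(_format(expression, open_positions[0], "'(' is never closed"))


try:
    check_parentheses("((1)")
except ValueError as error:  # catching the base class also catches the subclass
    print(error)
```
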
This library is built on top of Polars, a fast DataFrame library written in Rust. All expressions are converted to native Polars expressions, so evaluation is handled by Polars itself.
Contributions are welcome! Please feel free to submit issues and pull requests on GitHub.
MIT License - see LICENSE file for details.
Thanks to the Polars team for creating such an amazing library.