Status: Minimal POC (MR 3 of the join-rework). Converts any RF of nested TFs into SQL-style flat rows. Acyclic reference graphs of arbitrary depth are supported; cyclic references raise
ValueError.
out: RF = flatten(rf: RF).resultflatten accepts any RF whose row values are TFs with AF-valued
attributes — typically the output of join — and returns
a new RF of flat TFs with dot-separated keys:
# join output (FDM-native nested shape):
out[0] = TF({"users": TF({"name": "Alice", "dept": TF({"name": "Dev"})}),
"departments": TF({"name": "Dev"})})
# after flatten:
out[0] = TF({"users.name": "Alice",
"users.dept.name": "Dev",
"departments.name": "Dev"})flatten deliberately trades FDM's zero-redundancy for SQL-style
convenience: two paths that reach the same leaf value (here
"users.dept.name" and "departments.name") both appear in the
output with copies of the scalar. The FDM-native nested shape from
join is unchanged and remains the default.
from fdm.attribute_functions import TF, RF, DBF
from fql.operators.joins import join
from fql.operators.flatten import flatten
departments = RF({"d1": TF({"name": "Dev"}),
"d2": TF({"name": "Sales"})}, frozen=True)
users = RF({"u1": TF({"name": "Alice", "dept": departments["d1"]}),
"u2": TF({"name": "Bob", "dept": departments["d2"]})},
frozen=True).references("dept", departments)
dbf = DBF({"users": users, "departments": departments}, frozen=True)
out: RF = flatten(join(dbf)).result
for row in out:
print(row.value["users.name"], "->", row.value["departments.name"])
# Alice -> Dev
# Bob -> SalesEach output key is a dot-separated path from the top-level attribute name down to the scalar leaf:
top-level key → "users" (TF-valued: expanded, not emitted)
first level → "users.name" (scalar: emitted)
ref inside TF → "users.dept" (TF-valued: expanded recursively)
second level → "users.dept.name" (scalar: emitted)
Keys whose source values are plain scalars at the top level are emitted verbatim without any dot prefix:
flat_rf = flatten(RF({0: TF({"x": 1, "y": 2}, frozen=True)}, frozen=True)).result
assert flat_rf[0].keys() == {"x", "y"}AF-valued attributes are expanded recursively to any depth. A linear
chain tasks → projects → departments produces keys at all three
levels from a single flatten call:
tasks.desc
tasks.project.title
tasks.project.dept.name
projects.title
projects.dept.name
departments.name
Computed attributes are materialised as static scalars at flatten time; domain-backed default attributes are included when the source TF has a finite domain.
flatten accepts any Operator producing an RF — no intermediate
.result call is needed:
out: RF = flatten(join(dbf)).resultThe lazy operator chain resolves only when .result is accessed.
Key collisions: when two distinct paths produce the same
dot-separated key (e.g. both "users.dept.name" and
"departments.name"), the last path visited wins (depth-first,
left-to-right). This is intentional and documented, not a bug — both
values are the same scalar when the reference graph is consistent.
Attribute names that already contain dots may cause unexpected collisions;
this is a known limitation for this POC.
Cycle safety: flatten detects reference cycles via a visited-set
guard and raises ValueError with the cycle's AF identity and
dot-path. Only acyclic reference graphs are supported:
cyclic_tf = TF({"x": 1}, frozen=False)
cyclic_tf["self"] = cyclic_tf # cycle
flatten(RF({0: TF({"r": cyclic_tf})}, frozen=True)).result
# → ValueError: flatten: reference cycle detected at AF id=...| Situation | Recommendation |
|---|---|
| Downstream consumer understands FDM nested rows (aggregators, structured predicates) | Stay with join output — object identity and zero-redundancy are preserved. |
| Compatibility with SQL-shaped tools or a pandas DataFrame | Use flatten(join(dbf)) — accepts the value duplication deliberately. |
| Only one relation in the DBF | join + flatten still works; output keys are "relation.attribute" prefixed. |
| Operator | Form | Shape |
|---|---|---|
| join | DBF → RF | One row per surviving tuple combination; each row is a nested TF({relation: tf, …}) — FDM-native, zero-redundancy. |
| flatten | RF → RF | One flat TF({"rel.attr": scalar, …}) per row — SQL-style, value duplication accepted. |