Add multi-segment with_count support and ORM demos by seapagan · Pull Request #137 · seapagan/sqliter-py

seapagan · 2026-02-22T12:55:06Z

Summary

add multi-segment with_count(path, alias, distinct) traversal support
support mixed relationship paths (forward FK intermediates + reverse FK/M2M terminals)
enforce terminal to-many relationship validation for with_count
add shared path-prefix join reuse for repeated with_count calls
add coverage for multi-segment paths, validation, and distinct/raw semantics
add ORM TUI demos for single-segment and multi-segment with_count
update guide/API/docs demo pages to reflect new behavior

Behavior notes

terminal path segment for with_count must be reverse FK or M2M
distinct=False keeps raw row-count semantics
distinct=True counts distinct terminal PKs

Validation

poe test -> 990 passed, 1 skipped
poe docs:build passes

Summary by CodeRabbit

New Features
- with_count() now supports nested relationship paths (e.g. "author__books" or "articles__tags"); terminal segment must be to-many.
Documentation
- Updated guidance, examples and notes on nested paths, distinct counting and projection behaviour; minor reformatting and a duplicated example section addressed.
Demos
- Added two ORM demos showing basic and multi‑segment with_count() usage.
Tests
- Added multi‑segment, error‑condition and join‑reuse tests.
Bug Fixes
- Clarified error messages for invalid relationship paths and unresolved M2M metadata.

coderabbitai · 2026-02-22T12:55:14Z

No actionable comments were generated in the recent review. 🎉

📝 Walkthrough

Walkthrough

This PR extends with_count() to support nested relationship paths (e.g., "author__books" or "articles__tags") requiring the terminal segment to be to-many; adds a frozen dataclass WithCountJoinNode, a resolver for path segments, forward/reverse/M2M join builders, and a cache for join metadata per path.

Changes

Cohort / File(s)	Summary
Documentation `docs/api-reference/query-builder.md`, `docs/guide/aggregates.md`, `docs/tui-demo/index.md`, `docs/tui-demo/orm.md`	Documented nested `with_count()` paths and terminal-to-many requirement; added multi-segment examples, distinct-count guidance, minor reformatting/wording standardisation and a duplicated demo section.
Query builder core `sqliter/query/query.py`	Added `WithCountJoinNode` dataclass and `_with_count_join_nodes` cache; added `_resolve_with_count_segment()` and `_resolve_with_count_path()` resolvers; implemented `_build_forward_fk_with_count_join()`, `_build_reverse_fk_with_count_join()`, `_build_m2m_with_count_join()`, and `_build_with_count_target_sql()`; removed legacy `_build_reverse_fk_count_target()` and `_build_m2m_count_target()`; resolver-driven join construction and caching implemented.
TUI demos `sqliter/tui/demos/orm.py`	Added demo runners `_run_with_count_basic()` and `_run_with_count_multi_segment()` and registered Demo entries demonstrating single- and multi-segment `with_count()` usage (including distinct counts).
Tests `tests/test_aggregates.py`	Added test model `BookCustomColAgg` (custom FK db_column) and multiple tests covering multi-segment `with_count()` (forward→reverse, M2M distinct semantics, shared-prefix join reuse), custom db_column reverse-FK behavior, and expanded invalid-path/unresolved-metadata error tests.
Notes / TODO `TODO.md`	Removed a planned extension note; added note about supporting `ForeignKey(..., db_column=...)` across ORM runtime and rewriting temporary tests once implemented.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client / API
    participant Resolver as Path Resolver
    participant Cache as Join Cache
    participant FKBuilder as FK Join Builder
    participant M2MBuilder as M2M Join Builder
    participant SQLBuilder as SQL Builder

    Client->>Resolver: resolve path "author__books"
    activate Resolver
    Resolver->>Resolver: split & classify segments
    Resolver-->>Client: [("author", forward_fk, desc), ("books", reverse_fk, desc)]
    deactivate Resolver

    Client->>Cache: check cached join nodes for prefix "author"
    activate Cache
    Cache-->>Client: miss
    deactivate Cache

    Client->>FKBuilder: build forward-FK join for "author"
    activate FKBuilder
    FKBuilder->>Cache: store WithCountJoinNode(alias_A, model_A)
    FKBuilder-->>Client: alias_A
    deactivate FKBuilder

    Client->>FKBuilder: build reverse-FK join for "books" from alias_A
    activate FKBuilder
    FKBuilder->>Cache: store WithCountJoinNode(alias_B, model_B)
    FKBuilder-->>Client: alias_B
    deactivate FKBuilder

    Client->>SQLBuilder: build COUNT target using terminal alias_B
    activate SQLBuilder
    SQLBuilder->>Cache: retrieve join nodes for SQL fragments
    SQLBuilder-->>Client: SQL COUNT(...) + JOIN fragments
    deactivate SQLBuilder

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Add aggregate/group-by projection queries and with_count helper #135 — Refactors and extends QueryBuilder.with_count; closely related to the new resolver-based multi-segment implementation.
Expose read-only ManyToMany SQL metadata #134 — Adds M2M SQL metadata and sql_metadata accessors which the new M2M join-building logic depends on.
feat: add many-to-many relationship support #116 — Implements many-to-many metadata/registry used by with_count path-resolution and M2M join-building.

Suggested labels

documentation

Poem

🐇 Hopping through segments, quick and spry,
I tally books beneath the sky,
From forward FK to many-to-many night,
Joins cached snug, counts crisp and bright,
I nibble bugs and make SQL light.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and accurately summarises the main changes: adding multi-segment with_count() support and new ORM demonstration examples.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/with-count-multisegment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codacy-production · 2026-02-22T12:55:48Z

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation	Diff coverage
✅ +0.00% (target: -1.00%)	✅ 100.00%

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`1f2381d`)	4676	4676	100.00%
Head commit (`4c0c612`)	4792 (+116)	4792 (+116)	100.00% (+0.00%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#137)	137	137	100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings Change summary preferences

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sqliter/query/query.py`:
- Around line 614-632: The reverse-FK JOIN currently hardcodes fk_column =
f"{descriptor.fk_field}_id" in _build_reverse_fk_with_count_join which breaks
when a ForeignKey defines a custom DB column; change it to derive the actual
column name from the FK descriptor (e.g., use descriptor.column or
descriptor.db_column or the descriptor-provided attribute that holds the DB
column name) and use that value for fk_column (with a safe fallback to
f"{descriptor.fk_field}_id" if the descriptor lacks the attribute), then build
join_clause using that derived fk_column instead of the hardcoded pattern.

sqliter/query/query.py

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

TODO.md (1)
31-34: ⚠️ Potential issue | 🟡 Minor

Stale TODO — this item has been implemented by this PR.

The :fire: item describing multi-segment with_count() support (mixed reverse FK + M2M, distinct semantics, LEFT JOIN zero-row preservation, join plan reuse) is exactly what this PR delivers. It should be removed to keep the backlog accurate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@TODO.md` around lines 31 - 34, Remove the stale TODO bullet about extending
with_count() from TODO.md since this PR implements it; specifically delete the
line starting with "🔥 Extend `with_count()` to support full multi-segment
relationship paths..." so the backlog no longer lists the already-implemented
with_count() multi-segment/reverse FK/M2M distinct/LEFT JOIN/join-planning item.

🧹 Nitpick comments (2)

sqliter/query/query.py (2)
699-701: Redundant terminal-alias lookup — parent_alias already holds the value.

After the for loop, parent_alias is assigned the last node's .alias. The two-step derivation via resolved_path[-1][0] → _with_count_join_nodes[terminal_path] repeats that same value needlessly.
♻️ Proposed simplification
-    terminal_path = resolved_path[-1][0]
-    terminal_alias = self._with_count_join_nodes[terminal_path].alias
-    return f'{terminal_alias}."pk"'
+    return f'{parent_alias}."pk"'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 699 - 701, The terminal-alias lookup is
redundant: use the already-assigned parent_alias instead of re-deriving
terminal_alias via resolved_path[-1][0] and self._with_count_join_nodes; update
the return expression in the method that computes the terminal pk (the block
using terminal_path/terminal_alias/resolved_path and parent_alias) to return
parent_alias."pk" (or equivalent string assembly using parent_alias) so you
avoid the duplicate lookup through _with_count_join_nodes.
123-129: Remove the unused is_to_many field from WithCountJoinNode.

All three join builders (_build_forward_fk_with_count_join, _build_reverse_fk_with_count_join, _build_m2m_with_count_join) populate this field, but it is never read anywhere in the codebase. The relationship type information is already conveyed by the kind field returned from _resolve_with_count_path, making this field redundant.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 123 - 129, Remove the redundant
is_to_many attribute from the WithCountJoinNode dataclass and eliminate any
assignments to it in the join builders; specifically update class
WithCountJoinNode to only include alias and model_class, and modify
_build_forward_fk_with_count_join, _build_reverse_fk_with_count_join, and
_build_m2m_with_count_join to stop populating is_to_many (use the existing kind
value from _resolve_with_count_path where relationship type is needed). Ensure
any construction sites of WithCountJoinNode are updated to the new two-argument
signature and that no other code references is_to_many.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@TODO.md`:
- Around line 31-34: Remove the stale TODO bullet about extending with_count()
from TODO.md since this PR implements it; specifically delete the line starting
with "🔥 Extend `with_count()` to support full multi-segment relationship
paths..." so the backlog no longer lists the already-implemented with_count()
multi-segment/reverse FK/M2M distinct/LEFT JOIN/join-planning item.

---

Nitpick comments:
In `@sqliter/query/query.py`:
- Around line 699-701: The terminal-alias lookup is redundant: use the
already-assigned parent_alias instead of re-deriving terminal_alias via
resolved_path[-1][0] and self._with_count_join_nodes; update the return
expression in the method that computes the terminal pk (the block using
terminal_path/terminal_alias/resolved_path and parent_alias) to return
parent_alias."pk" (or equivalent string assembly using parent_alias) so you
avoid the duplicate lookup through _with_count_join_nodes.
- Around line 123-129: Remove the redundant is_to_many attribute from the
WithCountJoinNode dataclass and eliminate any assignments to it in the join
builders; specifically update class WithCountJoinNode to only include alias and
model_class, and modify _build_forward_fk_with_count_join,
_build_reverse_fk_with_count_join, and _build_m2m_with_count_join to stop
populating is_to_many (use the existing kind value from _resolve_with_count_path
where relationship type is needed). Ensure any construction sites of
WithCountJoinNode are updated to the new two-argument signature and that no
other code references is_to_many.

coderabbitai

🧹 Nitpick comments (1)

sqliter/query/query.py (1)

502-588: Add a guard for unresolved forward-FK to_model string references.

The M2M branches (lines 561–577) explicitly raise InvalidProjectionError when to_model is still a string, but the forward_fk branch (lines 556–558) silently does a bare cast. If a FK descriptor's to_model hasn't been resolved yet, subsequent calls to _resolve_with_count_segment will call current_model.__name__ on a plain str object and raise AttributeError rather than the expected InvalidProjectionError.

🛡️ Suggested guard (mirrors the existing M2M pattern)

         if kind == "forward_fk":
+            if isinstance(descriptor.to_model, str):
+                msg = (
+                    "Cannot resolve model class for forward FK relationship "
+                    f"'{segment}' in path '{path}'."
+                )
+                raise InvalidProjectionError(msg)
             current_model = cast("type[BaseDBModel]", descriptor.to_model)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 502 - 588, The forward-FK branch in
_resolve_with_count_path uses cast on a ForeignKey descriptor.to_model without
guarding against unresolved string references; add a check mirroring the M2M
branches: if kind == "forward_fk" and isinstance(descriptor.to_model, str) then
raise InvalidProjectionError with a message like "Cannot resolve SQL metadata
for relationship '<path>'" (use the same path variable or similar context),
otherwise set current_model = cast("type[BaseDBModel]", descriptor.to_model);
this prevents AttributeError when descriptor.to_model is still a string.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@sqliter/query/query.py`:
- Around line 612-636: The JOIN clause may produce unquoted aliases/identifiers
which can break SQL when aliases or column names need quoting; in
_build_reverse_fk_with_count_join ensure the join alias and parent alias are
quoted consistently and that fk_column (resolved in the fk_descriptor lookup) is
safely used as a quoted identifier in the join_clause construction (refer to
join_clause, fk_column, _next_projection_alias, and WithCountJoinNode) so the
generated SQL always quotes table aliases and column names.

---

Nitpick comments:
In `@sqliter/query/query.py`:
- Around line 502-588: The forward-FK branch in _resolve_with_count_path uses
cast on a ForeignKey descriptor.to_model without guarding against unresolved
string references; add a check mirroring the M2M branches: if kind ==
"forward_fk" and isinstance(descriptor.to_model, str) then raise
InvalidProjectionError with a message like "Cannot resolve SQL metadata for
relationship '<path>'" (use the same path variable or similar context),
otherwise set current_model = cast("type[BaseDBModel]", descriptor.to_model);
this prevents AttributeError when descriptor.to_model is still a string.

coderabbitai

🧹 Nitpick comments (1)

tests/test_aggregates.py (1)

590-603: Redundant manual restore on Line 603.

monkeypatch automatically undoes setattr calls at test teardown, so the explicit restore on line 603 is unnecessary. It's harmless but adds noise.

Suggested removal

     with pytest.raises(
         InvalidProjectionError, match="Cannot resolve SQL metadata"
     ):
         relation_db.select(BookAgg).with_count("author__books")
-
-    monkeypatch.setattr(BookAgg.author, "to_model", original_to_model)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/test_aggregates.py` around lines 590 - 603, The test
test_with_count_rejects_unresolved_forward_fk_target uses monkeypatch.setattr on
BookAgg.author to simulate a missing to_model; remove the redundant manual
restore at the end (the monkeypatch.setattr(BookAgg.author, "to_model",
original_to_model) call) because pytest's monkeypatch fixture automatically
reverts changes at teardown—just keep the monkeypatch.setattr call and the
assertion block, and delete the explicit restoration line.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/test_aggregates.py`:
- Around line 590-603: The test
test_with_count_rejects_unresolved_forward_fk_target uses monkeypatch.setattr on
BookAgg.author to simulate a missing to_model; remove the redundant manual
restore at the end (the monkeypatch.setattr(BookAgg.author, "to_model",
original_to_model) call) because pytest's monkeypatch fixture automatically
reverts changes at teardown—just keep the monkeypatch.setattr call and the
assertion block, and delete the explicit restoration line.

seapagan · 2026-02-22T16:38:46Z

@coderabbitai review

coderabbitai · 2026-02-22T16:38:56Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/test_aggregates.py (1)
629-643: Consider reaching this error path through the public API instead of calling _build_m2m_with_count_join directly.

The test exercises an internal method directly, coupling it to the method's name and exact signature. If the method is renamed or merged as part of a future refactor, this test will break with no public-API regression to catch the gap.

The same error path (missing M2M metadata) could be triggered via with_count using a monkeypatched descriptor on a proper model field, consistent with how test_with_count_rejects_unresolved_forward_fk_target tests a similar code path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_aggregates.py` around lines 629 - 643, Replace the direct call to
the internal _build_m2m_with_count_join in
test_build_m2m_with_count_join_rejects_missing_metadata with a public-API
invocation: create the same ManyToMany descriptor (ManyToMany(TagAgg)),
attach/monkeypatch it onto the appropriate model field on AuthorAgg (or the
model instance used by relation_db.select(AuthorAgg)), then call
query.with_count("tags") (or the public with_count entrypoint used elsewhere,
e.g. as in test_with_count_rejects_unresolved_forward_fk_target) and assert it
raises InvalidProjectionError with "Cannot resolve SQL metadata"; this keeps the
test exercising the same missing-M2M-metadata path but via the public API rather
than calling _build_m2m_with_count_join directly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_aggregates.py`:
- Around line 217-240: Add an inline comment before the raw-SQL insert that
explains why we bypass the ORM write path (we need to pin author_ref directly so
the test validates only the read/join side of BookCustomColAgg.with_count on
AuthorAgg), and wrap the DB use in a try/finally so db.close() is always called:
ensure the block that opens conn/cursor, executes the raw INSERTs using
BookCustomColAgg.get_table_name(), commits, runs the
db.select(...).with_count("custom_books",
alias="usage").order("name").fetch_dicts() assertions, and then calls db.close()
in a finally clause to guarantee cleanup even if the assertions fail.

---

Nitpick comments:
In `@tests/test_aggregates.py`:
- Around line 629-643: Replace the direct call to the internal
_build_m2m_with_count_join in
test_build_m2m_with_count_join_rejects_missing_metadata with a public-API
invocation: create the same ManyToMany descriptor (ManyToMany(TagAgg)),
attach/monkeypatch it onto the appropriate model field on AuthorAgg (or the
model instance used by relation_db.select(AuthorAgg)), then call
query.with_count("tags") (or the public with_count entrypoint used elsewhere,
e.g. as in test_with_count_rejects_unresolved_forward_fk_target) and assert it
raises InvalidProjectionError with "Cannot resolve SQL metadata"; this keeps the
test exercising the same missing-M2M-metadata path but via the public API rather
than calling _build_m2m_with_count_join directly.

tests/test_aggregates.py

seapagan · 2026-02-22T16:57:58Z

@coderabbitai resume

coderabbitai · 2026-02-22T16:58:07Z

✅ Actions performed

Reviews resumed.

seapagan added 3 commits February 22, 2026 12:33

Add multi-segment with_count join planning and tests

1854a4a

Document multi-segment with_count support

d34212c

Add with_count multi-segment ORM demo and docs

82bf402

seapagan self-assigned this Feb 22, 2026

seapagan added the enhancement New feature or request label Feb 22, 2026

seapagan marked this pull request as ready for review February 22, 2026 12:59