Skip to content

Add aggregate/group-by projection queries and with_count helper#135

Merged
seapagan merged 21 commits intomainfrom
feature/issue-132-aggregate-groupby
Feb 22, 2026
Merged

Add aggregate/group-by projection queries and with_count helper#135
seapagan merged 21 commits intomainfrom
feature/issue-132-aggregate-groupby

Conversation

@seapagan
Copy link
Owner

@seapagan seapagan commented Feb 21, 2026

Summary

  • Add aggregate/group-by projection query support to QueryBuilder
  • Add with_count() relation counting helper (reverse FK + forward/reverse M2M)
  • Add fetch_dicts() projection result API and func aggregate namespace
  • Add full test coverage for new projection/aggregate paths
  • Update guide, API docs, TUI demo docs, and feature summaries

Key API Additions

  • group_by(*fields)
  • annotate(**aggregates)
  • having(**conditions)
  • with_count(path, alias="count", distinct=False)
  • fetch_dicts()
  • sqliter.query.func + AggregateSpec

Behavior Notes

  • Projection mode requires fetch_dicts()
  • with_count() currently supports single-segment relationship paths
  • Added TODO item for future full multi-segment with_count() support

Validation

  • poe test => 975 passed, 1 skipped
  • Coverage reaches 100%
  • poe docs:build passes

Closes #132

Summary by CodeRabbit

  • New Features

    • ORM mode with lazy loading, reverse relations, many-to-many and eager-loading; projection & aggregation support (grouping, aggregate helpers, dictionary results), with_count and projection-mode guards (InvalidProjectionError); optional raw SQL debug logging.
  • Documentation

    • Extensive guides and API reference for aggregates, grouping, HAVING, with_count and fetch_dicts with examples and navigation updates.
  • Tests

    • Comprehensive test coverage for aggregates, grouping, with_count, projection behaviours and error cases.
  • Chores

    • README polish and public export surface updated.

@coderabbitai
Copy link

coderabbitai bot commented Feb 21, 2026

📝 Walkthrough

Walkthrough

Adds projection-based grouping and aggregation to QueryBuilder: new AggregateSpec and func helpers, group_by/annotate/having/with_count/fetch_dicts methods, InvalidProjectionError, projection-aware SQL generation and caching, accompanying docs, demos and extensive tests.

Changes

Cohort / File(s) Summary
Aggregates API
sqliter/query/aggregates.py, sqliter/query/__init__.py
New frozen AggregateSpec, AggregateFunction type and _FunctionNamespace; module-level func factory; AggregateSpec and func re-exported in package __all__.
QueryBuilder / Projection Logic
sqliter/query/query.py
Large feature addition: group_by, annotate, having, with_count, fetch_dicts, projection state and validation, SQL builders for aggregates/HAVING/joins, projection-aware cache keys, and guards that reject non-projection fetch/update/delete in projection mode.
Exceptions & API docs
sqliter/exceptions.py, docs/api-reference/exceptions.md
New InvalidProjectionError class and documentation; projection-related raises documented across projection APIs.
Tests & TUI demos
tests/test_aggregates.py, tests/tui/unit/test_demos_results.py, sqliter/tui/demos/results.py
Comprehensive new tests covering grouping, aggregates, with_count variants, alias/validation and cache behaviour; two new demos added and demo-count assertion tightened.
Docs & Guides
docs/guide/aggregates.md, docs/api-reference/query-builder.md, docs/tui-demo/results.md, docs/index.md, docs/guide/*, README.md, mkdocs.yml
Extensive documentation: new Aggregates & Grouping guide, API reference updates for projection APIs and func helpers, examples, navigation entry and index feature-list updates.
Meta / TODO / Manifest
TODO.md, manifest/docs minor edits
Added TODOs for extending with_count and refactoring filter builders; small manifest/document punctuation edits.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant QB as QueryBuilder
    participant SQL as SQLBuilder
    participant DB as Database
    participant Conv as RowConverter

    User->>QB: group_by("field"), annotate(total=func.sum("amount"))
    QB->>QB: record projection state (group_by, aggregates)
    User->>QB: having(total__gt=100)
    QB->>QB: validate HAVING against grouped/aggregate aliases

    User->>QB: fetch_dicts()
    QB->>SQL: _build_projection_sql() (SELECT, JOINs, GROUP BY, HAVING, ORDER BY, LIMIT)
    SQL->>DB: execute(query, params)
    DB-->>SQL: rows
    SQL->>Conv: _convert_projection_row_to_dict(rows)
    Conv-->>User: list[dict[string, any]]
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐇 I nibble at rows and group them with care,
SUMs and COUNTs hop lightly through the air,
Aliases bloom where HAVING trims the throng,
fetch_dicts() returns each tidy song,
Hooray — no raw SQL to carry me along!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main changes: adding aggregate/group-by projection queries and the with_count helper functionality.
Linked Issues check ✅ Passed The PR fully implements all coding requirements from issue #132: group_by, annotate, having, with_count, func helpers, AggregateSpec, fetch_dicts, and comprehensive validation for projection mode.
Out of Scope Changes check ✅ Passed All changes are within scope: new aggregate/projection APIs, documentation updates, test coverage, and demo enhancements directly support issue #132 objectives. No unrelated changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 96.43% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/issue-132-aggregate-groupby

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link

codacy-production bot commented Feb 21, 2026

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
+0.00% (target: -1.00%) 100.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (21f6852) 4309 4309 100.00%
Head commit (bf16988) 4676 (+367) 4676 (+367) 100.00% (+0.00%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#135) 379 379 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

@seapagan seapagan self-assigned this Feb 21, 2026
@seapagan seapagan added the enhancement New feature or request label Feb 21, 2026
@seapagan seapagan marked this pull request as ready for review February 21, 2026 18:07
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
docs/api-reference/index.md (1)

25-25: Consider documenting AggregateSpec in the module overview table.

AggregateSpec is also exported from sqliter.query (per __all__ in __init__.py) and is referenced in the annotate() type signature. Users who want type annotations for their aggregates will look here first.

Proposed addition
-| `sqliter.query`      | `from sqliter.query import func`        | QueryBuilder aggregate helpers            |
+| `sqliter.query`      | `from sqliter.query import func`        | QueryBuilder aggregate helpers            |
+| `sqliter.query`      | `from sqliter.query import AggregateSpec` | Aggregate specification type             |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/api-reference/index.md` at line 25, Add an entry for AggregateSpec to
the module overview table so users can find the type when scanning
sqliter.query; specifically, document that AggregateSpec is exported from
sqliter.query (per __all__ in __init__.py) and used by the annotate() type
signature by adding a row similar to the others that names AggregateSpec and
notes it as the type helper for aggregate annotations from sqliter.query.
docs/api-reference/query-builder.md (2)

769-773: Minor: len(fetch_dicts()) as a substitute for count() may be misleading.

In non-projection mode, count() issues a SELECT COUNT(*) without fetching rows. Suggesting len(fetch_dicts()) as the replacement implies fetching all result rows into memory first. If this is the intended (and only) option in projection mode, the docs are correct — but it might be worth a brief note that this loads all rows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/api-reference/query-builder.md` around lines 769 - 773, Update the
documentation for InvalidProjectionError to clarify that using
len(fetch_dicts()) as an alternative to count() in projection mode will load all
result rows into memory; mention that count() issues SELECT COUNT(*) in
non-projection mode and cannot be used when projection mode is active, and if
there are memory concerns recommend iterating with a streaming iterator or using
a server-side count if supported. Reference the symbols InvalidProjectionError,
count(), fetch_dicts(), and "projection mode" so readers can locate the relevant
behavior.

790-794: exists() projection-mode error has no suggested alternative.

Other fetch methods note "use fetch_dicts() instead" or similar, but exists() just says "If projection mode is active." Consider adding a brief hint (e.g., "use len(fetch_dicts()) > 0 instead") for consistency, or note that existence checks are not supported in projection mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/api-reference/query-builder.md` around lines 790 - 794, Update the
docstring for the raises section of exists() to include a suggested alternative
when projection mode is active: mention that InvalidProjectionError is raised
and add a short hint such as "use len(fetch_dicts()) > 0" or "perform a
non-projection fetch (e.g., fetch_dicts()) to check existence" so readers know
how to perform existence checks in projection mode; reference the exists()
method and the fetch_dicts() helper in the wording to keep it consistent with
other fetch method docs.
tests/test_aggregates.py (1)

65-88: Minor fixture duplication between sales_db and sales_db_cached.

Both fixtures insert identical data; the only difference is cache_enabled=True. Consider extracting the common insert logic into a helper to reduce duplication. That said, this is a test file and the duplication is small, so it's entirely optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_aggregates.py` around lines 65 - 88, Extract the repeated setup
into a small helper used by both fixtures: create a helper function (e.g.,
populate_sales_db or make_sales_db) that accepts a SqliterDB instance or a
cache_enabled flag and performs db.create_table(Sale) and the five
db.insert(...) calls, then refactor the sales_db and sales_db_cached fixtures to
call this helper (instantiating SqliterDB(":memory:") and SqliterDB(":memory:",
cache_enabled=True) respectively) and return the populated DB; update references
to the helper name in the tests as needed.
sqliter/query/query.py (1)

369-384: Consider validating alias characters to prevent confusing SQL errors.

The alias is interpolated into SQL with double-quote quoting (e.g., AS "{alias}"). If a user passes an alias containing a " character, the generated SQL will be malformed, producing a cryptic sqlite3.OperationalError rather than a clear validation message.

A simple check could prevent this:

🔧 Optional: reject aliases with SQL-unsafe characters
     def _validate_projection_alias(self, alias: str) -> str:
         """Validate and normalize an aggregate projection alias."""
         alias_name = alias.strip()
         if not alias_name:
             msg = "Aggregate alias cannot be empty."
             raise InvalidProjectionError(msg)
+        if not alias_name.isidentifier():
+            msg = (
+                f"Aggregate alias '{alias_name}' is not a valid identifier."
+            )
+            raise InvalidProjectionError(msg)
         if alias_name in self.model_class.model_fields:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 369 - 384, The
_validate_projection_alias method currently allows alias values containing
double-quotes which will break the generated SQL; add a validation after
alias_name = alias.strip() to reject aliases containing the double-quote
character (and optionally other SQL-unsafe characters like null or control
chars) and raise InvalidProjectionError with a clear message; update the check
in _validate_projection_alias (referencing self.model_class.model_fields and
self._aggregates) so any alias with '"' (or disallowed chars) is rejected before
further conflict checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sqliter/query/query.py`:
- Around line 1572-1596: The cache key used in _fetch_projection_result is
non-deterministic because _projection_columns (and related derived state) is
populated as a side-effect of _execute_projection_query, causing the initial
cache lookup in _fetch_projection_result to always miss; update _make_cache_key
to remove _projection_columns and _projection_join_clauses from the generated
key (they are derivable from _group_by, _aggregates, and
_aggregate_sql_expressions which remain in the key) so keys are deterministic
across QueryBuilder instances, ensuring the first lookup can hit and preventing
cross-object cache misses.

---

Nitpick comments:
In `@docs/api-reference/index.md`:
- Line 25: Add an entry for AggregateSpec to the module overview table so users
can find the type when scanning sqliter.query; specifically, document that
AggregateSpec is exported from sqliter.query (per __all__ in __init__.py) and
used by the annotate() type signature by adding a row similar to the others that
names AggregateSpec and notes it as the type helper for aggregate annotations
from sqliter.query.

In `@docs/api-reference/query-builder.md`:
- Around line 769-773: Update the documentation for InvalidProjectionError to
clarify that using len(fetch_dicts()) as an alternative to count() in projection
mode will load all result rows into memory; mention that count() issues SELECT
COUNT(*) in non-projection mode and cannot be used when projection mode is
active, and if there are memory concerns recommend iterating with a streaming
iterator or using a server-side count if supported. Reference the symbols
InvalidProjectionError, count(), fetch_dicts(), and "projection mode" so readers
can locate the relevant behavior.
- Around line 790-794: Update the docstring for the raises section of exists()
to include a suggested alternative when projection mode is active: mention that
InvalidProjectionError is raised and add a short hint such as "use
len(fetch_dicts()) > 0" or "perform a non-projection fetch (e.g., fetch_dicts())
to check existence" so readers know how to perform existence checks in
projection mode; reference the exists() method and the fetch_dicts() helper in
the wording to keep it consistent with other fetch method docs.

In `@sqliter/query/query.py`:
- Around line 369-384: The _validate_projection_alias method currently allows
alias values containing double-quotes which will break the generated SQL; add a
validation after alias_name = alias.strip() to reject aliases containing the
double-quote character (and optionally other SQL-unsafe characters like null or
control chars) and raise InvalidProjectionError with a clear message; update the
check in _validate_projection_alias (referencing self.model_class.model_fields
and self._aggregates) so any alias with '"' (or disallowed chars) is rejected
before further conflict checks.

In `@tests/test_aggregates.py`:
- Around line 65-88: Extract the repeated setup into a small helper used by both
fixtures: create a helper function (e.g., populate_sales_db or make_sales_db)
that accepts a SqliterDB instance or a cache_enabled flag and performs
db.create_table(Sale) and the five db.insert(...) calls, then refactor the
sales_db and sales_db_cached fixtures to call this helper (instantiating
SqliterDB(":memory:") and SqliterDB(":memory:", cache_enabled=True)
respectively) and return the populated DB; update references to the helper name
in the tests as needed.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sqliter/query/query.py (1)

2429-2435: ⚠️ Potential issue | 🟡 Minor

exists() lacks a direct projection guard — error message names the wrong method.

Calling exists() in projection mode raises InvalidProjectionError via count(), but the message reads "count() is unavailable for projection queries. Use len(fetch_dicts()) instead." A user who called exists() will find this confusing.

🐛 Proposed fix
 def exists(self) -> bool:
     """Check if any results exist for the current query."""
+    if self._projection_mode:
+        msg = (
+            "exists() is unavailable for projection queries. "
+            "Use len(fetch_dicts()) > 0 instead."
+        )
+        raise InvalidProjectionError(msg)
     return self.count() > 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 2429 - 2435, The exists() method
currently calls count() and so surfaces an InvalidProjectionError whose message
references count() instead of the caller; add a projection-mode guard at the
start of exists() that mirrors the check used in count() (detect projection mode
via the same internal flag used by count(), e.g., the projection indicator used
in the query class) and raise InvalidProjectionError with a message that
correctly references exists() (suggesting an alternative like len(fetch_dicts())
or using fetch_one()). This keeps count() unchanged but ensures exists() fails
with an accurate, helpful error message when in projection mode.
🧹 Nitpick comments (2)
tests/test_aggregates.py (1)

499-515: Consider adding "fetch_all" to the parametrised list for completeness.

fetch_all is currently tested in a separate non-parametrised test (test_projection_mode_rejects_model_fetch_methods). Including it in the parametrised tuple would consolidate all five projection-rejected methods in one place.

♻️ Proposed refactor
 `@pytest.mark.parametrize`(
     "method_name",
-    ["fetch_one", "fetch_first", "fetch_last", "count"],
+    ["fetch_all", "fetch_one", "fetch_first", "fetch_last", "count"],
 )

The standalone test_projection_mode_rejects_model_fetch_methods test could then be removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_aggregates.py` around lines 499 - 515, Add "fetch_all" to the
parametrized "method_name" list in
test_projection_mode_rejects_other_model_fetch_methods so it covers all five
model-oriented fetch/count methods (fetch_one, fetch_first, fetch_last,
fetch_all, count), then delete the now-redundant standalone
test_projection_mode_rejects_model_fetch_methods; keep the assertion logic and
InvalidProjectionError expectation unchanged in
test_projection_mode_rejects_other_model_fetch_methods.
sqliter/query/query.py (1)

411-438: _projection_mode is set after the loop; partial _aggregates state on mid-loop exception.

If annotate() is called with a mix of valid and invalid arguments (e.g. annotate(good=func.sum("amount"), bad="not-spec")), earlier aliases are written to _aggregates but _projection_mode remains False because the assignment is only reached after the loop completes. If the caller catches the exception and reuses the query, they encounter duplicate-alias errors on the next annotate() call and a silent state where fetch_all() succeeds without applying the partial aggregates.

Moving self._projection_mode = True to before the loop is the minimal fix.

♻️ Proposed fix
 def annotate(self, **aggregates: AggregateSpec) -> Self:
     """Add aggregate projections to the query."""
     if not aggregates:
         return self
 
+    self._projection_mode = True
     for raw_alias, aggregate_spec in aggregates.items():
         alias = self._validate_projection_alias(raw_alias)
         ...
         self._aggregates[alias] = aggregate_spec
 
-    self._projection_mode = True
     return self
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 411 - 438, The annotate method can leave
_aggregates partially populated if an exception occurs mid-loop because
self._projection_mode is only set after the loop; move the assignment
self._projection_mode = True to immediately before the for-loop in annotate so
projection mode is enabled atomically with starting aggregate population,
validate aliases and AggregateSpec instances as before (use
_validate_projection_alias and isinstance(..., AggregateSpec)), and keep raising
InvalidProjectionError for invalid field names; this ensures either the query is
fully in projection mode with all aggregates or nothing is applied on exception.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/api-reference/query-builder.md`:
- Around line 489-493: The documentation sentence under with_count() is missing
a comma after the parenthetical phrase "for example"; update the text that reads
'(for example "books")' to '(for example, "books")' in the docs entry for
with_count() so the example is punctuated correctly and reads naturally.

---

Outside diff comments:
In `@sqliter/query/query.py`:
- Around line 2429-2435: The exists() method currently calls count() and so
surfaces an InvalidProjectionError whose message references count() instead of
the caller; add a projection-mode guard at the start of exists() that mirrors
the check used in count() (detect projection mode via the same internal flag
used by count(), e.g., the projection indicator used in the query class) and
raise InvalidProjectionError with a message that correctly references exists()
(suggesting an alternative like len(fetch_dicts()) or using fetch_one()). This
keeps count() unchanged but ensures exists() fails with an accurate, helpful
error message when in projection mode.

---

Nitpick comments:
In `@sqliter/query/query.py`:
- Around line 411-438: The annotate method can leave _aggregates partially
populated if an exception occurs mid-loop because self._projection_mode is only
set after the loop; move the assignment self._projection_mode = True to
immediately before the for-loop in annotate so projection mode is enabled
atomically with starting aggregate population, validate aliases and
AggregateSpec instances as before (use _validate_projection_alias and
isinstance(..., AggregateSpec)), and keep raising InvalidProjectionError for
invalid field names; this ensures either the query is fully in projection mode
with all aggregates or nothing is applied on exception.

In `@tests/test_aggregates.py`:
- Around line 499-515: Add "fetch_all" to the parametrized "method_name" list in
test_projection_mode_rejects_other_model_fetch_methods so it covers all five
model-oriented fetch/count methods (fetch_one, fetch_first, fetch_last,
fetch_all, count), then delete the now-redundant standalone
test_projection_mode_rejects_model_fetch_methods; keep the assertion logic and
InvalidProjectionError expectation unchanged in
test_projection_mode_rejects_other_model_fetch_methods.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sqliter/query/query.py (1)

2454-2484: ⚠️ Potential issue | 🟡 Minor

delete() and update() silently ignore projection mode.

When a user chains group_by(), annotate(), or with_count() before delete() or update(), the projection state is silently discarded and the operations run against the base table using only the WHERE clause. This contradicts the pattern established by fetch methods (fetch_all, fetch_one, fetch_first, fetch_last, count, exists), which all raise InvalidProjectionError when called in projection mode. Add a projection-mode guard to maintain API consistency.

Proposed guard (example for delete)
 def delete(self) -> int:
+    if self._projection_mode:
+        msg = (
+            "delete() is unavailable for projection queries."
+        )
+        raise InvalidProjectionError(msg)
     sql = f'DELETE FROM "{self.table_name}"'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 2454 - 2484, The delete() (and similarly
update()) method currently ignores projection mode (e.g., after group_by(),
annotate(), with_count()) and executes against the base table; add a guard at
the start of Query.delete() and Query.update() that checks the projection state
(e.g., self._projection_mode or whatever flag the query uses for projection) and
raises InvalidProjectionError(self.table_name) when projection mode is active,
matching the behavior of
fetch_all/fetch_one/fetch_first/fetch_last/count/exists; place this check before
building the SQL (immediately after the docstring/start of the method) so
projection queries cannot be executed as plain deletes/updates.
🧹 Nitpick comments (1)
sqliter/query/query.py (1)

1362-1444: Consider unifying condition builders to reduce duplication.

The new _build_sql_condition / _build_in_condition / _build_like_condition methods duplicate the logic already present in _handle_equality, _handle_in, _handle_like, and _handle_comparison. The existing handlers could be refactored to delegate to these new return-value-based builders, centralising the operator/type validation in one place.

Not urgent for this PR, but worth tracking to prevent the two code paths from drifting apart over time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 1362 - 1444, The new return-value-based
builders (_build_sql_condition, _build_in_condition, _build_like_condition)
duplicate logic in the existing handlers (_handle_equality, _handle_in,
_handle_like, _handle_comparison); refactor those handlers to delegate to these
builders: replace their inline validation/SQL construction with calls to the
appropriate builder, accept the (clause, params) tuple returned, and reuse that
for assembling final query fragments; ensure you preserve exact error
types/messages, parameter ordering, and COLLATE/NULL behavior when moving logic,
and remove the now-duplicated code in the handlers so all operator/type
validation is centralized in the _build_* functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@sqliter/query/query.py`:
- Around line 2454-2484: The delete() (and similarly update()) method currently
ignores projection mode (e.g., after group_by(), annotate(), with_count()) and
executes against the base table; add a guard at the start of Query.delete() and
Query.update() that checks the projection state (e.g., self._projection_mode or
whatever flag the query uses for projection) and raises
InvalidProjectionError(self.table_name) when projection mode is active, matching
the behavior of fetch_all/fetch_one/fetch_first/fetch_last/count/exists; place
this check before building the SQL (immediately after the docstring/start of the
method) so projection queries cannot be executed as plain deletes/updates.

---

Duplicate comments:
In `@docs/api-reference/query-builder.md`:
- Around line 489-493: The doc sentence in the with_count() note needs a comma
after the phrase "for example" to read "for example, `\"books\"`"; update the
text in the note that references with_count() and group_by() so it reads "for
example, `\"books\"`" (ensure the comma is placed immediately after "for
example") to improve readability.

---

Nitpick comments:
In `@sqliter/query/query.py`:
- Around line 1362-1444: The new return-value-based builders
(_build_sql_condition, _build_in_condition, _build_like_condition) duplicate
logic in the existing handlers (_handle_equality, _handle_in, _handle_like,
_handle_comparison); refactor those handlers to delegate to these builders:
replace their inline validation/SQL construction with calls to the appropriate
builder, accept the (clause, params) tuple returned, and reuse that for
assembling final query fragments; ensure you preserve exact error
types/messages, parameter ordering, and COLLATE/NULL behavior when moving logic,
and remove the now-duplicated code in the handlers so all operator/type
validation is centralized in the _build_* functions.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
sqliter/query/query.py (2)

370-391: _validate_projection_alias – minor type annotation nit on _build_null_condition.

_build_null_condition (line 1363) accepts value: FilterValue but is always called with a pre-computed bool from _build_sql_condition (line 1419). Renaming the parameter to value: bool would make the intent explicit.

🔧 Suggested annotation tightening
 `@staticmethod`
-def _build_null_condition(field_sql: str, value: FilterValue) -> str:
+def _build_null_condition(field_sql: str, value: bool) -> str:
     """Build an IS NULL/IS NOT NULL clause for null operators."""
     is_null = bool(value)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 370 - 391, The parameter type for
_build_null_condition should be tightened to bool instead of FilterValue because
callers like _build_sql_condition always pass a boolean; update the signature of
_build_null_condition to accept value: bool and adjust any related type
imports/annotations accordingly, then run type checks to ensure
_build_sql_condition (and any other callers) still match the new signature (no
runtime changes required).

1730-1738: order() error message mentions aggregate aliases unconditionally, even outside projection mode.

When _projection_mode is False, is_projection_alias is always False and the error message "or aggregate aliases" is noise for users who have never used the aggregation API.

♻️ Suggested context-aware message
-        err = (
-            f"'{order_by_field}' does not exist in model fields"
-            " or aggregate aliases."
-        )
+        err = (
+            f"'{order_by_field}' does not exist in model fields"
+            + (
+                " or aggregate aliases."
+                if self._projection_mode
+                else "."
+            )
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 1730 - 1738, The error message for
unknown order_by_field always mentions "aggregate aliases" even when
self._projection_mode is False; update the logic around where err is constructed
in the order()/ordering path so the message is context-aware: check
self._projection_mode and build the err string to include "or aggregate aliases"
only when self._projection_mode is True (and order_by_field not in
self._aggregates); otherwise produce a simpler message that only says the field
does not exist in model fields. Use the existing symbols order_by_field,
self._projection_mode, self._aggregates and self.model_class.model_fields to
determine which variant to log.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sqliter/query/query.py`:
- Around line 1520-1537: _build_projection_sql currently omits regular joins
stored in _join_info which causes filters that traverse relationships to
reference non-joined aliases; to fix, add an explicit guard in
_handle_relationship_filter that detects when projection mode is active (i.e.
when building a projection via _build_projection_sql / when
_projection_join_clauses or projection state is set) and raise a clear error
(e.g., InvalidQueryError or similar) rejecting relationship traversal filters in
projection mode, with a message guiding the user to remove projection or move
the filter to a non-projection query; update any tests/examples for
filter(related__field=...) and mention select_related()/with_count()
interactions.

---

Duplicate comments:
In `@sqliter/query/query.py`:
- Around line 1589-1613: In _fetch_projection_result, avoid recomputing the same
cache key: reuse the cache_key computed before the lookup (from
self._make_cache_key(fetch_one=False)) when calling self.db._cache_set instead
of calling _make_cache_key again; update the block around the lookup and store
so the single cache_key variable (from _make_cache_key) is used for both
db._cache_get and db._cache_set to eliminate the redundant json.dumps/sha256
recomputation.

---

Nitpick comments:
In `@sqliter/query/query.py`:
- Around line 370-391: The parameter type for _build_null_condition should be
tightened to bool instead of FilterValue because callers like
_build_sql_condition always pass a boolean; update the signature of
_build_null_condition to accept value: bool and adjust any related type
imports/annotations accordingly, then run type checks to ensure
_build_sql_condition (and any other callers) still match the new signature (no
runtime changes required).
- Around line 1730-1738: The error message for unknown order_by_field always
mentions "aggregate aliases" even when self._projection_mode is False; update
the logic around where err is constructed in the order()/ordering path so the
message is context-aware: check self._projection_mode and build the err string
to include "or aggregate aliases" only when self._projection_mode is True (and
order_by_field not in self._aggregates); otherwise produce a simpler message
that only says the field does not exist in model fields. Use the existing
symbols order_by_field, self._projection_mode, self._aggregates and
self.model_class.model_fields to determine which variant to log.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/test_aggregates.py (1)

375-385: Consider adding a test for having() called before annotate() with a later-defined aggregate alias.

The existing test covers fields that are never defined as grouped or aggregate. A complementary case — calling having(total__gt=10) before annotate(total=...) — would hit the same error path but with a valid alias that simply hasn't been declared yet. Adding this test would document the ordering constraint and provide a regression anchor if the error message is improved.

💡 Suggested test
def test_having_rejects_aggregate_alias_defined_after_having_call(
    sales_db: SqliterDB,
) -> None:
    """having() must be called after annotate() for aggregate aliases."""
    with pytest.raises(
        InvalidProjectionError, match="must be a grouped field or an aggregate alias"
    ):
        (
            sales_db.select(Sale)
            .group_by("category")
            .having(total__gt=10)   # 'total' not yet defined
            .annotate(total=func.sum("amount"))
        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_aggregates.py` around lines 375 - 385, Add a new test function
(e.g., test_having_rejects_aggregate_alias_defined_after_having_call) that
asserts InvalidProjectionError is raised when having() is called before
annotate() using an aggregate alias defined later; replicate the pattern in the
existing test by calling
sales_db.select(Sale).group_by("category").having(total__gt=10).annotate(total=func.sum("amount"))
and expect the error message to match "must be a grouped field or an aggregate
alias" to document the ordering constraint between having() and annotate().
sqliter/query/query.py (1)

454-479: having() field validation has an undocumented ordering dependency on annotate() / with_count().

having() validates field names against self._group_by and self._aggregates eagerly at call time. This means calling having() before annotate() or with_count() raises a misleading error for valid aggregate aliases:

# Raises InvalidProjectionError("HAVING field 'total' must be a grouped field or an aggregate alias.")
# even though 'total' will be a valid alias once annotate() is called.
db.select(Sale).group_by("category").having(total__gt=10).annotate(total=func.sum("amount"))

The error message doesn't communicate the ordering requirement. This is an optional UX improvement — either defer field-name validation to _build_projection_sql (consistent with how _join_info checks are deferred), or update the error message to indicate the alias must be defined via annotate()/with_count() before having().

💡 Proposed: clearer error message (minimal change)
-                msg = (
-                    f"HAVING field '{field_name}' must be a grouped field or "
-                    "an aggregate alias."
-                )
+                msg = (
+                    f"HAVING field '{field_name}' must be a grouped field or "
+                    "an aggregate alias. Call annotate()/with_count() before "
+                    "having() when referencing aggregate aliases."
+                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqliter/query/query.py` around lines 454 - 479, The current having() eagerly
validates field names against self._group_by and self._aggregates and raises an
unclear InvalidProjectionError if an aggregate alias hasn't yet been defined;
update having() to either defer this validation to _build_projection_sql (so
annotate()/with_count() can add aliases before validation) or, for a minimal
change, improve the error message to mention the ordering requirement explicitly
(e.g., state that aggregate aliases must be defined via annotate()/with_count()
before calling having()); reference the having method, the _group_by and
_aggregates sets, and consider moving validation into _build_projection_sql or
changing the raised message text to guide users to call annotate()/with_count()
first.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@sqliter/query/query.py`:
- Around line 454-479: The current having() eagerly validates field names
against self._group_by and self._aggregates and raises an unclear
InvalidProjectionError if an aggregate alias hasn't yet been defined; update
having() to either defer this validation to _build_projection_sql (so
annotate()/with_count() can add aliases before validation) or, for a minimal
change, improve the error message to mention the ordering requirement explicitly
(e.g., state that aggregate aliases must be defined via annotate()/with_count()
before calling having()); reference the having method, the _group_by and
_aggregates sets, and consider moving validation into _build_projection_sql or
changing the raised message text to guide users to call annotate()/with_count()
first.

In `@tests/test_aggregates.py`:
- Around line 375-385: Add a new test function (e.g.,
test_having_rejects_aggregate_alias_defined_after_having_call) that asserts
InvalidProjectionError is raised when having() is called before annotate() using
an aggregate alias defined later; replicate the pattern in the existing test by
calling
sales_db.select(Sale).group_by("category").having(total__gt=10).annotate(total=func.sum("amount"))
and expect the error message to match "must be a grouped field or an aggregate
alias" to document the ordering constraint between having() and annotate().

@seapagan
Copy link
Owner Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Feb 22, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds comprehensive aggregate and grouping query support to sqliter, addressing issue #132. The implementation adds SQL-style projection queries with GROUP BY, aggregate functions, HAVING clauses, and relationship counting via LEFT JOIN.

Changes:

  • Added projection query API with group_by(), annotate(), having(), and fetch_dicts() methods
  • Added with_count() helper for counting related records with zero-count preservation via LEFT JOIN
  • Added func namespace with aggregate helpers (count, sum, avg, min, max) and AggregateSpec type
  • Added comprehensive test suite (691 lines) with 100% coverage of new projection paths
  • Updated documentation (guides, API reference, TUI demos) with examples and usage patterns

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/test_aggregates.py Comprehensive test suite for aggregate/projection features with 691 lines covering happy paths, error handling, edge cases, caching, and validation
sqliter/query/query.py Core implementation of projection mode, SQL generation for GROUP BY/HAVING/aggregates, LEFT JOIN building for with_count(), and projection result fetching
sqliter/query/aggregates.py New module defining AggregateSpec dataclass and func namespace with type-safe aggregate helpers
sqliter/query/init.py Exports AggregateSpec and func for public API
sqliter/exceptions.py Added InvalidProjectionError exception for projection-specific validation errors
sqliter/tui/demos/results.py Updated aggregates demo to showcase new projection API, added group-by-only and projection guard demos
tests/tui/unit/test_demos_results.py Updated test to expect 8 demos (was 6, added 2 new ones)
docs/guide/aggregates.md New comprehensive guide covering projection queries, aggregate helpers, HAVING filters, and with_count()
docs/guide/guide.md Added aggregates/grouping section to main guide
docs/guide/filtering.md Added note about HAVING reusing filter operators
docs/guide/data-operations.md Added cross-reference to aggregates guide
docs/tui-demo/results.md Added extensive documentation for new aggregate/projection demos with code examples
docs/api-reference/query-builder.md Added API documentation for projection methods, types, and fetch_dicts()
docs/api-reference/index.md Updated module overview table to include query exports
docs/api-reference/exceptions.md Documented InvalidProjectionError with usage examples
docs/index.md Updated feature list to highlight projection/aggregation capabilities
README.md Updated feature list with new projection/aggregation support
mkdocs.yml Added aggregates guide to navigation
TODO.md Added TODO for extending with_count() to multi-segment paths and noted filter refactoring need

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@seapagan seapagan merged commit bc16692 into main Feb 22, 2026
19 checks passed
@seapagan seapagan deleted the feature/issue-132-aggregate-groupby branch February 22, 2026 08:06
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/test_aggregates.py (1)

260-263: Consider adding a test for AggregateSpec(func="SUM", field="*").

This test covers AggregateSpec(func="SUM") (no field), but not AggregateSpec(func="SUM", field="*"). If AggregateSpec only validates that non-COUNT functions must provide a non-None field (rather than a non-None and non-"*" field), SUM(*) would slip through annotate() undetected.

🧪 Suggested additional case
 def test_aggregate_spec_requires_field_for_non_count() -> None:
     """SUM/MIN/MAX/AVG must specify a concrete source field."""
     with pytest.raises(ValueError, match="requires a concrete field"):
         AggregateSpec(func="SUM")
+
+    with pytest.raises(ValueError, match="requires a concrete field"):
+        AggregateSpec(func="SUM", field="*")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_aggregates.py` around lines 260 - 263, Add a case to
test_aggregate_spec_requires_field_for_non_count to assert that passing a
wildcard field is rejected: instantiate AggregateSpec(func="SUM", field="*")
inside the same pytest.raises(ValueError, match="requires a concrete field")
block (or an additional similar block) so that SUM with field="*" raises the
same ValueError; reference AggregateSpec and the existing
test_aggregate_spec_requires_field_for_non_count to locate where to add this
assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/test_aggregates.py`:
- Around line 260-263: Add a case to
test_aggregate_spec_requires_field_for_non_count to assert that passing a
wildcard field is rejected: instantiate AggregateSpec(func="SUM", field="*")
inside the same pytest.raises(ValueError, match="requires a concrete field")
block (or an additional similar block) so that SUM with field="*" raises the
same ValueError; reference AggregateSpec and the existing
test_aggregate_spec_requires_field_for_non_count to locate where to add this
assertion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add aggregate/group-by query support to avoid raw SQL for common reporting

2 participants