feat(llma): add evaluation report models and API #54363
andrewm4894 wants to merge 5 commits into master from
Conversation
Hey @andrewm4894! 👋
This pull request seems to contain no description. Please add useful context, rationale, and/or any other information that will help make sense of this change now and in the distant Mars-based future.
This is a comment left during a code review.
Path: products/llm_analytics/backend/api/evaluation_reports.py
Line: 80-93
Comment:
**Empty `delivery_targets` list passes validation**
`validate_delivery_targets` only validates the structure of each item but never checks that at least one target is present. An empty list silently returns, causing `test_create_requires_delivery_targets` to receive a `201` instead of the expected `400`.
```suggestion
    def validate_delivery_targets(self, value: list) -> list:
        if not isinstance(value, list):
            raise serializers.ValidationError("Delivery targets must be a list.")
        if not value:
            raise serializers.ValidationError("At least one delivery target is required.")
        for target in value:
            if not isinstance(target, dict):
                raise serializers.ValidationError("Each delivery target must be an object.")
            target_type = target.get("type")
            if target_type not in ("email", "slack"):
                raise serializers.ValidationError(f"Invalid delivery target type: {target_type}")
            if target_type == "email" and not target.get("value"):
                raise serializers.ValidationError("Email delivery target must include a 'value' field.")
            if target_type == "slack" and (not target.get("integration_id") or not target.get("channel")):
                raise serializers.ValidationError("Slack delivery target must include 'integration_id' and 'channel'.")
        return value
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: products/llm_analytics/backend/models/evaluation_reports.py
Line: 103-115
Comment:
**`rrule` raises `KeyError` for `EVERY_N` frequency**
`freq_map` covers only `HOURLY`, `DAILY`, and `WEEKLY`. Accessing `freq_map[self.frequency]` on an `EVERY_N` report raises `KeyError`. `set_next_delivery_date` guards against this call, but the property itself has no guard — any future caller (scheduler code, admin tooling) would get an unhandled exception.
```suggestion
    @property
    def rrule(self):
        if self.is_count_triggered:
            raise ValueError("rrule is not available for count-triggered reports (frequency='every_n').")
        freq_map: dict[str, int] = {
            self.Frequency.HOURLY: HOURLY,
            self.Frequency.DAILY: DAILY,
            self.Frequency.WEEKLY: WEEKLY,
        }
        freq = cast(Literal[0, 1, 2, 3, 4, 5, 6], freq_map[self.frequency])
        return rrule(
            freq=freq,
            dtstart=self.start_date,
            byweekday=_to_rrule_weekdays(self.byweekday) if self.byweekday else None,
        )
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: products/llm_analytics/backend/api/evaluation_reports.py
Line: 179-187
Comment:
**`asyncio.run()` inside a sync view fails under ASGI**
`asyncio.run()` creates a new event loop, but if this view is dispatched under an ASGI server (uvicorn/daphne) there is already a running event loop and the call raises `RuntimeError: This event loop is already running`. The same Temporal workflow kick-off pattern elsewhere in PostHog uses a thread-based workaround or `sync_to_async`. Consider checking how other `sync_connect` callers in the codebase handle this.
How can I resolve this? If you propose a fix, please make it concise.
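A minimal stdlib-only sketch of the thread-based workaround mentioned above (the helper and coroutine names are hypothetical, not PostHog's actual `sync_connect` code): run the coroutine via `asyncio.run()` on a worker thread, which always gets a fresh event loop regardless of the calling thread's state.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def start_report_workflow(report_id: str) -> str:
    # Hypothetical stand-in for the real Temporal workflow kick-off coroutine.
    await asyncio.sleep(0)
    return f"started:{report_id}"


def run_coro_in_worker_thread(coro):
    # asyncio.run() on a worker thread creates and owns a fresh event loop,
    # so this is safe even when the calling thread already has a running
    # loop (as it does when the view is dispatched under uvicorn/daphne).
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```

The sync DRF action would then call something like `run_coro_in_worker_thread(start_report_workflow(str(report.id)))` instead of `asyncio.run(...)`. Alternatively, `asgiref.sync.async_to_sync` handles the same loop bookkeeping.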
---
This is a comment left during a code review.
Path: products/llm_analytics/backend/migrations/0026_alter_evaluationreport_frequency_and_more.py
Line: 1-36
Comment:
**Five separate migrations for a single feature**
Migrations 0023–0027 all relate to the same `EvaluationReport` table introduced in this PR. Before merging, squash them into a single migration so the history is clean and the table is created in its final state in one shot. This avoids unnecessary intermediate `AlterField` operations on a table that has no rows yet.
How can I resolve this? If you propose a fix, please make it concise.
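One way to squash with Django's built-in tooling (app label assumed here to be `llm_analytics`; adjust to the real one):

```shell
# Squash migrations 0023 through 0027 into a single replacement migration
python manage.py squashmigrations llm_analytics 0023 0027

# Or, since the tables are new and have no rows, delete the five migration
# files and regenerate one fresh migration from the current model state:
#   rm products/llm_analytics/backend/migrations/002[3-7]_*.py
#   python manage.py makemigrations llm_analytics
```

The delete-and-regenerate route is only safe because the PR is unmerged, so no environment has applied the intermediate migrations yet.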
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e80736bfb9
Add EvaluationReport and EvaluationReportRun Django models with migrations, a DRF viewset with CRUD + manual generation trigger, and comprehensive tests. This provides the data layer for scheduled evaluation reports that will be wired to a Temporal workflow in a follow-up PR.
- Add guard for rrule property on count-triggered reports to prevent KeyError when frequency is 'every_n'
- Replace asyncio.run() with async_to_sync() for Temporal workflow start to avoid RuntimeError under ASGI servers
- Update test to reflect design intent: empty delivery_targets are allowed (reports are viewable in UI without delivery configured)
Truthiness check treated 0 as absent, falling back to the instance value and bypassing min-threshold validation on PATCH.
Consolidate migrations 0023-0027 into a single 0023 that creates both tables in their final state. No intermediate AlterField ops needed since the tables have no existing rows.
Same bug as trigger_threshold — attrs.get with default fallback ignored the instance's current value during PATCH, validating against the default instead.
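A minimal sketch of the pattern behind those two PATCH-validation fixes (names hypothetical; the real code lives in the DRF serializer's `validate()`): `attrs.get(name) or fallback` treats an explicit `0` as absent, and `attrs.get(name, default)` validates against the field default instead of the instance's saved value. Checking key membership and falling back to the instance avoids both bugs:

```python
class Report:
    # Minimal stand-in for the model instance being PATCHed (hypothetical).
    def __init__(self, trigger_threshold: int):
        self.trigger_threshold = trigger_threshold


def effective_value(attrs: dict, instance, name: str, default):
    # An explicit value in the PATCH payload always wins, even a falsy 0,
    # so min-threshold validation still runs against it.
    if name in attrs:
        return attrs[name]
    # Otherwise validate against the instance's current value, not the
    # field default -- the default would mask what the user already set.
    if instance is not None:
        return getattr(instance, name)
    return default
```

With this helper, `effective_value({"trigger_threshold": 0}, report, "trigger_threshold", 100)` yields `0` and gets validated, rather than silently falling back.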

Problem
LLM Analytics needs scheduled evaluation reports — an LLM agent periodically analyzes evaluation results and delivers a detailed, example-grounded report. This PR adds the data layer foundation.
Part 1 of 5 in a stacked PR series. See reference PR #53400 for full context.
Stack:
Changes
EvaluationReport and EvaluationReportRun Django models with 5 migrations
How did you test this code?
No
Docs update
skip-inkeep-docs
🤖 LLM context
Agent-assisted PR stack creation from a single reference branch. Code was written iteratively on andy/llma-eval-reports and split into this stack for review.