feat(llma): wire up auto-create hook, generated types, and MCP tools by andrewm4894 · Pull Request #54367 · PostHog/posthog

andrewm4894 · 2026-04-13T22:01:32Z

Problem

With the backend pipeline and frontend in place, new evaluations should automatically get a default report configuration, and MCP tools should expose report operations.

Part 5 of 5 in a stacked PR series. Depends on #54366.

Stack:

Models + API (feat(llma): add evaluation report models and API #54363)
Report Agent (feat(llma): add evaluation report agent #54364)
Delivery + Workflows (feat(llma): add evaluation report delivery and temporal workflows #54365)
Frontend (feat(llma): add evaluation reports frontend #54366)
Wiring + Generated (this PR)

Changes

perform_create hook in EvaluationViewSet auto-creates an EvaluationReport with frequency=every_n, trigger_threshold=100, enabled=True, empty delivery targets
Product-scoped generated API types (api.schemas.ts, api.ts)
MCP tool definitions for evaluation report CRUD and generation

Note: Global generated files (schema.json, schema.py, snapshots, MCP codegen) should be regenerated via hogli build:openapi after merge.
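
For readers skimming the stack, the defaults listed under Changes can be sketched in plain Python. This is only an illustration, not the actual Django model: `ReportDefaults` and `build_default_report` are hypothetical names, and the only values taken from this PR are the field defaults and the use of the evaluation's `created_at` as `start_date`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ReportDefaults:
    """Hypothetical stand-in for the auto-created report configuration."""

    evaluation_id: str
    start_date: datetime
    frequency: str = "every_n"       # default named in the PR description
    trigger_threshold: int = 100     # default named in the PR description
    enabled: bool = True
    # default_factory gives each report its own empty list of targets
    delivery_targets: list = field(default_factory=list)


def build_default_report(evaluation_id: str, created_at: datetime) -> ReportDefaults:
    """Mirror the hook: the report starts when the evaluation was created."""
    return ReportDefaults(evaluation_id=evaluation_id, start_date=created_at)


report = build_default_report("eval-1", datetime(2026, 4, 13, tzinfo=timezone.utc))
```

Delivery targets start empty, so under these defaults a new evaluation gets a report configuration without sending notifications anywhere.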

How did you test this code?

The auto-create hook is covered by existing API tests. Generated types are mechanical output from serializer definitions.

No

Docs update

skip-inkeep-docs

🤖 LLM context

Agent-assisted PR stack creation from a single reference branch.

github-actions · 2026-04-13T22:01:41Z

Hey @andrewm4894! 👋 This pull request seems to contain no description. Please add useful context, rationale, and/or any other information that will help make sense of this change now and in the distant Mars-based future.

andrewm4894 · 2026-04-13T22:01:42Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

github-actions · 2026-04-13T22:03:06Z

MCP UI Apps size report

App	JS	CSS
debug	545.4 KB	23.6 KB
action	530.1 KB	23.6 KB
action-list	536.2 KB	23.6 KB
cohort	529.2 KB	23.6 KB
cohort-list	535.2 KB	23.6 KB
error-details	538.1 KB	23.6 KB
error-issue	529.8 KB	23.6 KB
error-issue-list	536.1 KB	23.6 KB
experiment	533.5 KB	23.6 KB
experiment-list	536.9 KB	23.6 KB
experiment-results	533.1 KB	23.6 KB
feature-flag	534.0 KB	23.6 KB
feature-flag-list	540.6 KB	23.6 KB
llm-costs	534.9 KB	23.6 KB
survey	530.8 KB	23.6 KB
survey-global-stats	532.4 KB	23.6 KB
survey-list	536.9 KB	23.6 KB
survey-stats	532.4 KB	23.6 KB
workflow	529.6 KB	23.6 KB
workflow-list	535.6 KB	23.6 KB
query-results	543.6 KB	23.6 KB

github-actions · 2026-04-13T22:04:26Z

Size Change: 0 B

Total Size: 129 MB

ℹ️ View Unchanged

Filename	Size
`frontend/dist/368Hedgehogs`	5.26 kB
`frontend/dist/abap`	14.2 kB
`frontend/dist/AccountSocialConnected`	1.62 kB
`frontend/dist/Action`	23.2 kB
`frontend/dist/Actions`	1.02 kB
`frontend/dist/AdvancedActivityLogsScene`	34 kB
`frontend/dist/AgenticAuthorize`	5.25 kB
`frontend/dist/apex`	3.95 kB
`frontend/dist/ApprovalDetail`	16.2 kB
`frontend/dist/array.full.es5.js`	332 kB
`frontend/dist/array.full.js`	427 kB
`frontend/dist/array.js`	183 kB
`frontend/dist/AsyncMigrations`	13.1 kB
`frontend/dist/AuthorizationStatus`	716 B
`frontend/dist/azcli`	846 B
`frontend/dist/bat`	1.84 kB
`frontend/dist/BatchExportScene`	60.3 kB
`frontend/dist/bicep`	2.55 kB
`frontend/dist/Billing`	493 B
`frontend/dist/BillingSection`	20.8 kB
`frontend/dist/BoxPlot`	5.04 kB
`frontend/dist/browserAll-0QZMN1W2`	37.4 kB
`frontend/dist/ButtonPrimitives`	562 B
`frontend/dist/CalendarHeatMap`	4.79 kB
`frontend/dist/cameligo`	2.18 kB
`frontend/dist/changeRequestsLogic`	544 B
`frontend/dist/CLIAuthorize`	11.3 kB
`frontend/dist/CLILive`	3.97 kB
`frontend/dist/clojure`	9.64 kB
`frontend/dist/coffee`	3.59 kB
`frontend/dist/Cohort`	23.2 kB
`frontend/dist/CohortCalculationHistory`	6.22 kB
`frontend/dist/Cohorts`	9.39 kB
`frontend/dist/ConfirmOrganization`	4.48 kB
`frontend/dist/conversations.js`	65.8 kB
`frontend/dist/Coupons`	720 B
`frontend/dist/cpp`	5.3 kB
`frontend/dist/Create`	829 B
`frontend/dist/crisp-chat-integration.js`	1.88 kB
`frontend/dist/csharp`	4.52 kB
`frontend/dist/csp`	1.42 kB
`frontend/dist/css`	4.51 kB
`frontend/dist/cssMode`	4.15 kB
`frontend/dist/CustomCssScene`	3.55 kB
`frontend/dist/CustomerAnalyticsConfigurationScene`	1.99 kB
`frontend/dist/CustomerAnalyticsScene`	26.4 kB
`frontend/dist/CustomerJourneyBuilderScene`	1.69 kB
`frontend/dist/CustomerJourneyTemplatesScene`	7.39 kB
`frontend/dist/customizations.full.js`	17.9 kB
`frontend/dist/CyclotronJobInputAssignee`	1.32 kB
`frontend/dist/CyclotronJobInputTicketTags`	711 B
`frontend/dist/cypher`	3.38 kB
`frontend/dist/dart`	4.25 kB
`frontend/dist/Dashboard`	1.11 kB
`frontend/dist/Dashboards`	23.1 kB
`frontend/dist/DataManagementScene`	646 B
`frontend/dist/DataPipelinesNewScene`	2.28 kB
`frontend/dist/DataWarehouseScene`	1.26 kB
`frontend/dist/DataWarehouseSourceScene`	634 B
`frontend/dist/Deactivated`	1.13 kB
`frontend/dist/dead-clicks-autocapture.js`	13.1 kB
`frontend/dist/DeadLetterQueue`	5.38 kB
`frontend/dist/DebugScene`	20 kB
`frontend/dist/decompressionWorker`	2.85 kB
`frontend/dist/decompressionWorker.js`	2.85 kB
`frontend/dist/DefinitionEdit`	7.11 kB
`frontend/dist/DefinitionView`	22.7 kB
`frontend/dist/DestinationsScene`	2.67 kB
`frontend/dist/dist`	575 B
`frontend/dist/dockerfile`	1.87 kB
`frontend/dist/EarlyAccessFeature`	753 B
`frontend/dist/EarlyAccessFeatures`	2.84 kB
`frontend/dist/ecl`	5.33 kB
`frontend/dist/EditorScene`	896 B
`frontend/dist/elixir`	10.3 kB
`frontend/dist/elk.bundled`	1.44 MB
`frontend/dist/EmailMFAVerify`	2.98 kB
`frontend/dist/EndpointScene`	37.5 kB
`frontend/dist/EndpointsScene`	22.1 kB
`frontend/dist/ErrorTrackingConfigurationScene`	2.2 kB
`frontend/dist/ErrorTrackingIssueFingerprintsScene`	6.98 kB
`frontend/dist/ErrorTrackingIssueScene`	81.7 kB
`frontend/dist/ErrorTrackingScene`	12.9 kB
`frontend/dist/EvaluationTemplates`	575 B
`frontend/dist/EventsScene`	2.46 kB
`frontend/dist/exception-autocapture.js`	11.8 kB
`frontend/dist/Experiment`	210 kB
`frontend/dist/Experiments`	17.7 kB
`frontend/dist/exporter`	20.9 MB
`frontend/dist/exporter.js`	20.9 MB
`frontend/dist/ExportsScene`	3.86 kB
`frontend/dist/FeatureFlag`	128 kB
`frontend/dist/FeatureFlags`	606 B
`frontend/dist/FeatureFlagTemplatesScene`	7.03 kB
`frontend/dist/FlappyHog`	5.78 kB
`frontend/dist/flow9`	1.8 kB
`frontend/dist/freemarker2`	16.7 kB
`frontend/dist/fsharp`	2.98 kB
`frontend/dist/go`	2.65 kB
`frontend/dist/graphql`	2.26 kB
`frontend/dist/Group`	14.4 kB
`frontend/dist/Groups`	3.91 kB
`frontend/dist/GroupsNew`	7.34 kB
`frontend/dist/handlebars`	7.34 kB
`frontend/dist/hcl`	3.59 kB
`frontend/dist/HealthCategoryDetailScene`	7.23 kB
`frontend/dist/HealthScene`	10.3 kB
`frontend/dist/HeatmapNewScene`	4.16 kB
`frontend/dist/HeatmapRecordingScene`	3.92 kB
`frontend/dist/HeatmapScene`	5.88 kB
`frontend/dist/HeatmapsScene`	3.88 kB
`frontend/dist/hls`	394 kB
`frontend/dist/HogFunctionScene`	58.7 kB
`frontend/dist/HogRepl`	7.37 kB
`frontend/dist/html`	5.58 kB
`frontend/dist/htmlMode`	4.62 kB
`frontend/dist/image-blob-reduce.esm`	49.4 kB
`frontend/dist/InboxScene`	59.7 kB
`frontend/dist/index`	311 kB
`frontend/dist/index.js`	311 kB
`frontend/dist/ini`	1.1 kB
`frontend/dist/InsightOptions`	5.41 kB
`frontend/dist/InsightScene`	28.9 kB
`frontend/dist/IntegrationsRedirect`	733 B
`frontend/dist/intercom-integration.js`	1.93 kB
`frontend/dist/InviteSignup`	14.4 kB
`frontend/dist/java`	3.22 kB
`frontend/dist/javascript`	985 B
`frontend/dist/jsonMode`	13.9 kB
`frontend/dist/julia`	7.22 kB
`frontend/dist/kotlin`	3.4 kB
`frontend/dist/lazy`	150 kB
`frontend/dist/LegacyPluginScene`	26.6 kB
`frontend/dist/LemonTextAreaMarkdown`	502 B
`frontend/dist/less`	3.9 kB
`frontend/dist/lexon`	2.44 kB
`frontend/dist/lib`	2.22 kB
`frontend/dist/Link`	468 B
`frontend/dist/LinkScene`	24.8 kB
`frontend/dist/LinksScene`	4.19 kB
`frontend/dist/liquid`	4.53 kB
`frontend/dist/LiveDebugger`	19.1 kB
`frontend/dist/LiveEventsTable`	2.98 kB
`frontend/dist/LLMAnalyticsClusterScene`	15.7 kB
`frontend/dist/LLMAnalyticsClustersScene`	43.1 kB
`frontend/dist/LLMAnalyticsDatasetScene`	19.7 kB
`frontend/dist/LLMAnalyticsDatasetsScene`	3.28 kB
`frontend/dist/LLMAnalyticsEvaluation`	58.7 kB
`frontend/dist/LLMAnalyticsEvaluationsScene`	29.5 kB
`frontend/dist/LLMAnalyticsPlaygroundScene`	36.3 kB
`frontend/dist/LLMAnalyticsScene`	117 kB
`frontend/dist/LLMAnalyticsSessionScene`	13.4 kB
`frontend/dist/LLMAnalyticsTraceScene`	127 kB
`frontend/dist/LLMAnalyticsUsers`	526 B
`frontend/dist/LLMASessionFeedbackDisplay`	4.83 kB
`frontend/dist/LLMPromptScene`	20.6 kB
`frontend/dist/LLMPromptsScene`	4.21 kB
`frontend/dist/Login`	8.57 kB
`frontend/dist/Login2FA`	4.2 kB
`frontend/dist/logs.js`	38.5 kB
`frontend/dist/LogsScene`	11.3 kB
`frontend/dist/lua`	2.11 kB
`frontend/dist/m3`	2.81 kB
`frontend/dist/main`	819 kB
`frontend/dist/ManagedMigration`	14 kB
`frontend/dist/markdown`	3.79 kB
`frontend/dist/MarketingAnalyticsScene`	39.7 kB
`frontend/dist/MaterializedColumns`	10.2 kB
`frontend/dist/Max`	835 B
`frontend/dist/mdx`	5.39 kB
`frontend/dist/memlens.lib.bundle`	27.8 kB
`frontend/dist/MessageTemplate`	16.3 kB
`frontend/dist/MetricsScene`	828 B
`frontend/dist/mips`	2.58 kB
`frontend/dist/ModelsScene`	13.6 kB
`frontend/dist/MonacoDiffEditor`	403 B
`frontend/dist/monacoEditorWorker`	288 kB
`frontend/dist/monacoEditorWorker.js`	288 kB
`frontend/dist/monacoJsonWorker`	419 kB
`frontend/dist/monacoJsonWorker.js`	419 kB
`frontend/dist/monacoTsWorker`	7.02 MB
`frontend/dist/monacoTsWorker.js`	7.02 MB
`frontend/dist/MoveToPostHogCloud`	4.46 kB
`frontend/dist/msdax`	4.91 kB
`frontend/dist/mysql`	11.3 kB
`frontend/dist/NavTabChat`	4.68 kB
`frontend/dist/NewSourceWizard`	724 B
`frontend/dist/NewTabScene`	681 B
`frontend/dist/NodeDetailScene`	16.3 kB
`frontend/dist/NotebookCanvasScene`	3.2 kB
`frontend/dist/NotebookPanel`	5.21 kB
`frontend/dist/NotebookScene`	8.21 kB
`frontend/dist/NotebooksScene`	7.58 kB
`frontend/dist/OAuthAuthorize`	573 B
`frontend/dist/objective-c`	2.41 kB
`frontend/dist/Onboarding`	699 kB
`frontend/dist/OnboardingCouponRedemption`	1.2 kB
`frontend/dist/pascal`	2.99 kB
`frontend/dist/pascaligo`	2 kB
`frontend/dist/passkeyLogic`	484 B
`frontend/dist/PasswordReset`	4.32 kB
`frontend/dist/PasswordResetComplete`	2.94 kB
`frontend/dist/perl`	8.25 kB
`frontend/dist/PersonScene`	16.1 kB
`frontend/dist/PersonsScene`	4.68 kB
`frontend/dist/pgsql`	13.5 kB
`frontend/dist/php`	8.02 kB
`frontend/dist/PipelineStatusScene`	6.22 kB
`frontend/dist/pla`	1.67 kB
`frontend/dist/posthog`	137 kB
`frontend/dist/postiats`	7.86 kB
`frontend/dist/powerquery`	16.9 kB
`frontend/dist/powershell`	3.27 kB
`frontend/dist/PreflightCheck`	5.53 kB
`frontend/dist/product-tours.js`	115 kB
`frontend/dist/ProductTour`	273 kB
`frontend/dist/ProductTours`	4.68 kB
`frontend/dist/ProjectHomepage`	24.7 kB
`frontend/dist/protobuf`	9.05 kB
`frontend/dist/pug`	4.82 kB
`frontend/dist/python`	4.76 kB
`frontend/dist/qsharp`	3.19 kB
`frontend/dist/QueryPerformance`	3.44 kB
`frontend/dist/r`	3.12 kB
`frontend/dist/razor`	9.35 kB
`frontend/dist/recorder-v2.js`	111 kB
`frontend/dist/recorder.js`	111 kB
`frontend/dist/redis`	3.55 kB
`frontend/dist/redshift`	11.8 kB
`frontend/dist/RegionMap`	29.4 kB
`frontend/dist/render-query`	20.6 MB
`frontend/dist/render-query.js`	20.6 MB
`frontend/dist/ResourceTransfer`	9.17 kB
`frontend/dist/restructuredtext`	3.9 kB
`frontend/dist/RevenueAnalyticsScene`	25.6 kB
`frontend/dist/ruby`	8.5 kB
`frontend/dist/rust`	4.16 kB
`frontend/dist/SavedInsights`	664 B
`frontend/dist/sb`	1.82 kB
`frontend/dist/scala`	7.32 kB
`frontend/dist/scheme`	1.76 kB
`frontend/dist/scss`	6.41 kB
`frontend/dist/SdkDoctorScene`	9.4 kB
`frontend/dist/SessionAttributionExplorerScene`	6.62 kB
`frontend/dist/SessionGroupSummariesTable`	4.62 kB
`frontend/dist/SessionGroupSummaryScene`	17 kB
`frontend/dist/SessionProfileScene`	15.8 kB
`frontend/dist/SessionRecordingDetail`	1.73 kB
`frontend/dist/SessionRecordingFilePlaybackScene`	4.46 kB
`frontend/dist/SessionRecordings`	742 B
`frontend/dist/SessionRecordingsKiosk`	8.84 kB
`frontend/dist/SessionRecordingsPlaylistScene`	4.14 kB
`frontend/dist/SessionRecordingsSettingsScene`	1.9 kB
`frontend/dist/SessionsScene`	3.86 kB
`frontend/dist/SettingsScene`	2.98 kB
`frontend/dist/SharedMetric`	4.83 kB
`frontend/dist/SharedMetrics`	549 B
`frontend/dist/shell`	3.07 kB
`frontend/dist/SignupContainer`	24.5 kB
`frontend/dist/Site`	1.18 kB
`frontend/dist/solidity`	18.6 kB
`frontend/dist/sophia`	2.76 kB
`frontend/dist/SourcesScene`	5.96 kB
`frontend/dist/sourceWizardLogic`	662 B
`frontend/dist/sparql`	2.55 kB
`frontend/dist/sql`	10.3 kB
`frontend/dist/SqlVariableEditScene`	7.24 kB
`frontend/dist/st`	7.4 kB
`frontend/dist/StartupProgram`	21.2 kB
`frontend/dist/SubscriptionsScene`	16.4 kB
`frontend/dist/SupportSettingsScene`	1.16 kB
`frontend/dist/SupportTicketScene`	23.4 kB
`frontend/dist/SupportTicketsScene`	733 B
`frontend/dist/Survey`	848 B
`frontend/dist/SurveyFormBuilder`	1.54 kB
`frontend/dist/Surveys`	18.2 kB
`frontend/dist/surveys.js`	90 kB
`frontend/dist/SurveyWizard`	64.2 kB
`frontend/dist/swift`	5.26 kB
`frontend/dist/SystemStatus`	16.8 kB
`frontend/dist/systemverilog`	7.61 kB
`frontend/dist/TaskDetailScene`	21.5 kB
`frontend/dist/TaskTracker`	13.2 kB
`frontend/dist/tcl`	3.57 kB
`frontend/dist/TextCardMarkdownEditor`	11 kB
`frontend/dist/toolbar`	10.7 MB
`frontend/dist/toolbar.js`	10.7 MB
`frontend/dist/ToolbarLaunch`	2.52 kB
`frontend/dist/tracing-headers.js`	1.74 kB
`frontend/dist/TracingScene`	29.4 kB
`frontend/dist/TransformationsScene`	1.91 kB
`frontend/dist/tsMode`	24 kB
`frontend/dist/twig`	5.97 kB
`frontend/dist/TwoFactorReset`	3.98 kB
`frontend/dist/typescript`	240 B
`frontend/dist/typespec`	2.82 kB
`frontend/dist/Unsubscribe`	1.62 kB
`frontend/dist/UserInterview`	4.53 kB
`frontend/dist/UserInterviews`	2.01 kB
`frontend/dist/vb`	5.79 kB
`frontend/dist/VercelConnect`	4.95 kB
`frontend/dist/VercelLinkError`	1.91 kB
`frontend/dist/VerifyEmail`	4.48 kB
`frontend/dist/vimMode`	211 kB
`frontend/dist/VisualReviewRunScene`	18.6 kB
`frontend/dist/VisualReviewRunsScene`	6.16 kB
`frontend/dist/VisualReviewSettingsScene`	10.8 kB
`frontend/dist/web-vitals.js`	6.39 kB
`frontend/dist/WebAnalyticsScene`	5.77 kB
`frontend/dist/WebGLRenderer-DYjOwNoG`	60.3 kB
`frontend/dist/WebGPURenderer-B_wkl_Ja`	36.3 kB
`frontend/dist/WebScriptsScene`	2.54 kB
`frontend/dist/webworkerAll-puPV1rBA`	324 B
`frontend/dist/wgsl`	7.34 kB
`frontend/dist/Wizard`	4.45 kB
`frontend/dist/WorkflowScene`	102 kB
`frontend/dist/WorkflowsScene`	46.9 kB
`frontend/dist/WorldMap`	4.73 kB
`frontend/dist/xml`	2.98 kB
`frontend/dist/yaml`	4.6 kB

compressed-size-action

greptile-apps · 2026-04-13T22:05:16Z

Prompt To Fix All With AI

This is a comment left during a code review.
Path: products/llm_analytics/frontend/generated/api.ts
Line: 548-557

Comment:
**Wrong return type for `runs` endpoint**

The backend's `runs` action returns `EvaluationReportRunSerializer(runs, many=True).data` — a JSON array of run objects — but this generated function types the response as `Promise<EvaluationReportApi>` (a single report config object). Any frontend code calling this will receive an array but TypeScript will treat it as a single `EvaluationReportApi`.

The root cause is that `EvaluationReportViewSet.runs` in `evaluation_reports.py` is missing `@extend_schema(responses=EvaluationReportRunSerializer(many=True))`, so drf-spectacular fell back to the viewset's default serializer class. Adding that decorator and regenerating will produce the correct list type here.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: products/llm_analytics/backend/api/evaluations.py
Line: 235-246

Comment:
**Auto-create and evaluation save are not in the same transaction**

`serializer.save()` commits the evaluation in its own implicit transaction. If `EvaluationReport.objects.create()` then raises (e.g. a DB error or constraint violation), the user receives a 500 but the evaluation row is already persisted. They will likely retry, creating a duplicate evaluation, while the original silently exists without any report.

Wrap both operations in `transaction.atomic()` so a failed report creation rolls back the evaluation too:

```python
from django.db import transaction

def perform_create(self, serializer):
    with transaction.atomic():
        instance = serializer.save()
        from ..models.evaluation_reports import EvaluationReport
        EvaluationReport.objects.create(
            team=self.team,
            evaluation=instance,
            start_date=instance.created_at,
        )
    # tracking/analytics below can remain outside the transaction
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: products/llm_analytics/backend/api/evaluations.py
Line: 240

Comment:
**Deferred import should be at module level**

There's no circular-import risk here — `evaluation_reports.py` only imports from `posthog.models.utils` and standard library modules. Move this to the top of the file alongside the other `..models.*` imports, using the same relative-import style:

```suggestion
        from ..models.evaluation_reports import EvaluationReport
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat(llma): wire up auto-create hook, ge..."

greptile-apps · 2026-04-13T22:05:20Z

products/llm_analytics/frontend/generated/api.ts

+export const llmAnalyticsEvaluationReportsRunsRetrieve = async (
+    projectId: string,
+    id: string,
+    options?: RequestInit
+): Promise<EvaluationReportApi> => {
+    return apiMutator<EvaluationReportApi>(getLlmAnalyticsEvaluationReportsRunsRetrieveUrl(projectId, id), {
+        ...options,
+        method: 'GET',
+    })
+}


Wrong return type for runs endpoint

The backend's runs action returns EvaluationReportRunSerializer(runs, many=True).data — a JSON array of run objects — but this generated function types the response as Promise<EvaluationReportApi> (a single report config object). Any frontend code calling this will receive an array but TypeScript will treat it as a single EvaluationReportApi.

The root cause is that EvaluationReportViewSet.runs in evaluation_reports.py is missing @extend_schema(responses=EvaluationReportRunSerializer(many=True)), so drf-spectacular fell back to the viewset's default serializer class. Adding that decorator and regenerating will produce the correct list type here.

Good catch — the runs action is indeed missing an @extend_schema annotation, so drf-spectacular falls back to the viewset default. Will add @extend_schema(responses=EvaluationReportRunSerializer(many=True)) to the runs action and regenerate the product-scoped types.
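
To make the mismatch concrete, here is a minimal Python sketch of a client that checks the payload shape before using it. It assumes only what the review states, namely that the `runs` action returns a JSON array; `parse_runs_response` and the field names in the sample payload are hypothetical.

```python
import json
from typing import Any


def parse_runs_response(body: str) -> list[dict[str, Any]]:
    """Parse a runs response body, which is expected to be a JSON array."""
    data = json.loads(body)
    if not isinstance(data, list):
        # A single object here would mean the schema and the view disagree.
        raise TypeError(f"expected a JSON array of runs, got {type(data).__name__}")
    return data


# Hypothetical payload; the field names are illustrative only.
body = '[{"id": "run-1", "status": "completed"}, {"id": "run-2", "status": "failed"}]'
runs = parse_runs_response(body)
statuses = [run["status"] for run in runs]
```

A caller typed against a single report object would instead index the array as if it were one record, which is the failure mode the generated `Promise<EvaluationReportApi>` signature invites.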

greptile-apps · 2026-04-13T22:05:21Z

products/llm_analytics/backend/api/evaluations.py

    def perform_create(self, serializer):
        instance = serializer.save()

+        # Auto-create a default report config so reports are generated from the start.
+        # Users can later add email/Slack delivery targets if they want notifications.
+        from products.llm_analytics.backend.models.evaluation_reports import EvaluationReport
+
+        EvaluationReport.objects.create(
+            team=self.team,
+            evaluation=instance,
+            start_date=instance.created_at,
+        )


Auto-create and evaluation save are not in the same transaction

serializer.save() commits the evaluation in its own implicit transaction. If EvaluationReport.objects.create() then raises (e.g. a DB error or constraint violation), the user receives a 500 but the evaluation row is already persisted. They will likely retry, creating a duplicate evaluation, while the original silently exists without any report.

Wrap both operations in transaction.atomic() so a failed report creation rolls back the evaluation too:

from django.db import transaction

def perform_create(self, serializer):
    with transaction.atomic():
        instance = serializer.save()
        from ..models.evaluation_reports import EvaluationReport
        EvaluationReport.objects.create(
            team=self.team,
            evaluation=instance,
            start_date=instance.created_at,
        )
    # tracking/analytics below can remain outside the transaction

Valid point. Wrapping both in transaction.atomic() is the right call — if the report creation fails we shouldn't leave an orphaned evaluation. Will fix.
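
The rollback behavior discussed in this thread can be demonstrated without Django. The sketch below uses the standard-library `sqlite3` module, whose connection context manager commits on success and rolls back on an exception, as a stand-in for `transaction.atomic()`. The table names, `make_db`, `create_evaluation_with_report`, and the `fail_report` flag are hypothetical and exist only to simulate a failed report creation.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE evaluation (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE report (id INTEGER PRIMARY KEY, evaluation_id INTEGER NOT NULL)"
    )
    return conn


def create_evaluation_with_report(conn, name, fail_report=False):
    """Insert an evaluation and its default report as one unit of work."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        cur = conn.execute("INSERT INTO evaluation (name) VALUES (?)", (name,))
        evaluation_id = cur.lastrowid
        if fail_report:
            # Simulate the report insert failing after the evaluation insert.
            raise RuntimeError("simulated report creation failure")
        conn.execute("INSERT INTO report (evaluation_id) VALUES (?)", (evaluation_id,))
    return evaluation_id


def count_rows(conn, table):
    """Count rows in one of the two known tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = make_db()
create_evaluation_with_report(conn, "first")
try:
    create_evaluation_with_report(conn, "second", fail_report=True)
except RuntimeError:
    pass  # the evaluation insert for "second" was rolled back
```

After the failed call, only the first evaluation and its report remain, so a retry would not produce a duplicate evaluation.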

greptile-apps · 2026-04-13T22:05:22Z

products/llm_analytics/backend/api/evaluations.py


+        # Auto-create a default report config so reports are generated from the start.
+        # Users can later add email/Slack delivery targets if they want notifications.
+        from products.llm_analytics.backend.models.evaluation_reports import EvaluationReport


Deferred import should be at module level

There's no circular-import risk here — evaluation_reports.py only imports from posthog.models.utils and standard library modules. Move this to the top of the file alongside the other ..models.* imports, using the same relative-import style:

Suggested change

-        from products.llm_analytics.backend.models.evaluation_reports import EvaluationReport
+        from ..models.evaluation_reports import EvaluationReport

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Agreed, no circular import risk here. Will move to module-level alongside the other ..models.* imports.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5be7f452d

chatgpt-codex-connector · 2026-04-13T22:05:53Z

products/llm_analytics/frontend/generated/api.ts

+): Promise<EvaluationReportApi> => {
+    return apiMutator<EvaluationReportApi>(getLlmAnalyticsEvaluationReportsRunsRetrieveUrl(projectId, id), {


Use run-list response type for runs endpoint

llmAnalyticsEvaluationReportsRunsRetrieve is typed as returning EvaluationReportApi, but the backend runs action actually returns a list of report-run records (EvaluationReportRunSerializer(..., many=True) in products/llm_analytics/backend/api/evaluation_reports.py). This mismatch means frontend callers generated from this SDK will treat an array payload like a single object, which can lead to incorrect UI logic or runtime undefined field access once this endpoint is used.

Same issue as the greptile comment above — missing @extend_schema(responses=EvaluationReportRunSerializer(many=True)) on the runs action. Will fix together with that one.

Auto-create an EvaluationReport when an evaluation is created, via the perform_create hook. Add product-scoped generated API types and MCP tool definitions for evaluation reports. Note: global generated files (schema.json, schema.py, snapshots, MCP codegen) should be regenerated via `hogli build:openapi` after merge rather than cherry-picked.

This was referenced Apr 13, 2026

feat(llma): add evaluation report models and API #54363

Open

feat(llma): add evaluation report agent #54364

Open

andrewm4894 mentioned this pull request Apr 13, 2026

feat(llma): add evaluation report delivery and temporal workflows #54365

Open

andrewm4894 mentioned this pull request Apr 13, 2026

feat(llma): add evaluation reports frontend #54366

Open

andrewm4894 self-assigned this Apr 13, 2026

greptile-apps bot reviewed Apr 13, 2026

chatgpt-codex-connector bot reviewed Apr 13, 2026

andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch from 2eb779d to da19b28 Compare April 13, 2026 22:27

andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch 2 times, most recently from 2f4ca4d to 6243131 Compare April 13, 2026 22:34

andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch from da19b28 to 5859546 Compare April 13, 2026 22:34

andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch from 6243131 to fd3fff7 Compare April 13, 2026 22:39

andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch 2 times, most recently from 409b90c to d8a5ccf Compare April 13, 2026 22:46

andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch from fd3fff7 to 2ee84b8 Compare April 13, 2026 22:46

andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch from 2ee84b8 to ef36efc Compare April 13, 2026 22:48

andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch from d8a5ccf to cb89a1b Compare April 13, 2026 22:48
