
feat(llma): wire up auto-create hook, generated types, and MCP tools #54367

Open
andrewm4894 wants to merge 1 commit into andy/llma-eval-reports-4-frontend from andy/llma-eval-reports-5-wiring-generated

andrewm4894 (Member) commented Apr 13, 2026

Problem

With the backend pipeline and frontend in place, new evaluations should automatically get a default report configuration, and MCP tools should expose report operations.

Part 5 of 5 in a stacked PR series. Depends on #54366.

Stack:

  1. Models + API (feat(llma): add evaluation report models and API #54363)
  2. Report Agent (feat(llma): add evaluation report agent #54364)
  3. Delivery + Workflows (feat(llma): add evaluation report delivery and temporal workflows #54365)
  4. Frontend (feat(llma): add evaluation reports frontend #54366)
  5. Wiring + Generated (this PR)

Changes

  • perform_create hook in EvaluationViewSet auto-creates an EvaluationReport with frequency=every_n, trigger_threshold=100, enabled=True, empty delivery targets
  • Product-scoped generated API types (api.schemas.ts, api.ts)
  • MCP tool definitions for evaluation report CRUD and generation
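
As a rough picture of the defaults listed above, the auto-created configuration could be modelled as a plain dataclass. This is only a sketch: `DefaultReportConfig` and the `delivery_targets` field name are hypothetical, and the real `EvaluationReport` is a Django model whose defaults presumably live on its fields.

```python
from dataclasses import dataclass, field


@dataclass
class DefaultReportConfig:
    """Hypothetical stand-in for the defaults of an auto-created report."""

    frequency: str = "every_n"    # value named in the PR description
    trigger_threshold: int = 100  # value named in the PR description
    enabled: bool = True          # reports are enabled from the start
    # Assumed field name: the PR only says delivery targets start out empty.
    delivery_targets: list = field(default_factory=list)


config = DefaultReportConfig()
print(config)
```

Using `default_factory` gives each instance its own empty list, so adding a delivery target to one report config would not leak into another.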

Note: Global generated files (schema.json, schema.py, snapshots, MCP codegen) should be regenerated via hogli build:openapi after merge.

How did you test this code?

The auto-create hook is covered by existing API tests. Generated types are mechanical output from serializer definitions.

No

Docs update

skip-inkeep-docs

🤖 LLM context

Agent-assisted PR stack creation from a single reference branch.

@github-actions
Contributor

Hey @andrewm4894! 👋
This pull request seems to contain no description. Please add useful context, rationale, and/or any other information that will help make sense of this change now and in the distant Mars-based future.

Member Author

andrewm4894 commented Apr 13, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions
Contributor

MCP UI Apps size report

| App | JS | CSS |
| --- | --- | --- |
| debug | 545.4 KB | 23.6 KB |
| action | 530.1 KB | 23.6 KB |
| action-list | 536.2 KB | 23.6 KB |
| cohort | 529.2 KB | 23.6 KB |
| cohort-list | 535.2 KB | 23.6 KB |
| error-details | 538.1 KB | 23.6 KB |
| error-issue | 529.8 KB | 23.6 KB |
| error-issue-list | 536.1 KB | 23.6 KB |
| experiment | 533.5 KB | 23.6 KB |
| experiment-list | 536.9 KB | 23.6 KB |
| experiment-results | 533.1 KB | 23.6 KB |
| feature-flag | 534.0 KB | 23.6 KB |
| feature-flag-list | 540.6 KB | 23.6 KB |
| llm-costs | 534.9 KB | 23.6 KB |
| survey | 530.8 KB | 23.6 KB |
| survey-global-stats | 532.4 KB | 23.6 KB |
| survey-list | 536.9 KB | 23.6 KB |
| survey-stats | 532.4 KB | 23.6 KB |
| workflow | 529.6 KB | 23.6 KB |
| workflow-list | 535.6 KB | 23.6 KB |
| query-results | 543.6 KB | 23.6 KB |

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

Size Change: 0 B

Total Size: 129 MB

ℹ️ View Unchanged
Filename Size
frontend/dist/368Hedgehogs 5.26 kB
frontend/dist/abap 14.2 kB
frontend/dist/AccountSocialConnected 1.62 kB
frontend/dist/Action 23.2 kB
frontend/dist/Actions 1.02 kB
frontend/dist/AdvancedActivityLogsScene 34 kB
frontend/dist/AgenticAuthorize 5.25 kB
frontend/dist/apex 3.95 kB
frontend/dist/ApprovalDetail 16.2 kB
frontend/dist/array.full.es5.js 332 kB
frontend/dist/array.full.js 427 kB
frontend/dist/array.js 183 kB
frontend/dist/AsyncMigrations 13.1 kB
frontend/dist/AuthorizationStatus 716 B
frontend/dist/azcli 846 B
frontend/dist/bat 1.84 kB
frontend/dist/BatchExportScene 60.3 kB
frontend/dist/bicep 2.55 kB
frontend/dist/Billing 493 B
frontend/dist/BillingSection 20.8 kB
frontend/dist/BoxPlot 5.04 kB
frontend/dist/browserAll-0QZMN1W2 37.4 kB
frontend/dist/ButtonPrimitives 562 B
frontend/dist/CalendarHeatMap 4.79 kB
frontend/dist/cameligo 2.18 kB
frontend/dist/changeRequestsLogic 544 B
frontend/dist/CLIAuthorize 11.3 kB
frontend/dist/CLILive 3.97 kB
frontend/dist/clojure 9.64 kB
frontend/dist/coffee 3.59 kB
frontend/dist/Cohort 23.2 kB
frontend/dist/CohortCalculationHistory 6.22 kB
frontend/dist/Cohorts 9.39 kB
frontend/dist/ConfirmOrganization 4.48 kB
frontend/dist/conversations.js 65.8 kB
frontend/dist/Coupons 720 B
frontend/dist/cpp 5.3 kB
frontend/dist/Create 829 B
frontend/dist/crisp-chat-integration.js 1.88 kB
frontend/dist/csharp 4.52 kB
frontend/dist/csp 1.42 kB
frontend/dist/css 4.51 kB
frontend/dist/cssMode 4.15 kB
frontend/dist/CustomCssScene 3.55 kB
frontend/dist/CustomerAnalyticsConfigurationScene 1.99 kB
frontend/dist/CustomerAnalyticsScene 26.4 kB
frontend/dist/CustomerJourneyBuilderScene 1.69 kB
frontend/dist/CustomerJourneyTemplatesScene 7.39 kB
frontend/dist/customizations.full.js 17.9 kB
frontend/dist/CyclotronJobInputAssignee 1.32 kB
frontend/dist/CyclotronJobInputTicketTags 711 B
frontend/dist/cypher 3.38 kB
frontend/dist/dart 4.25 kB
frontend/dist/Dashboard 1.11 kB
frontend/dist/Dashboards 23.1 kB
frontend/dist/DataManagementScene 646 B
frontend/dist/DataPipelinesNewScene 2.28 kB
frontend/dist/DataWarehouseScene 1.26 kB
frontend/dist/DataWarehouseSourceScene 634 B
frontend/dist/Deactivated 1.13 kB
frontend/dist/dead-clicks-autocapture.js 13.1 kB
frontend/dist/DeadLetterQueue 5.38 kB
frontend/dist/DebugScene 20 kB
frontend/dist/decompressionWorker 2.85 kB
frontend/dist/decompressionWorker.js 2.85 kB
frontend/dist/DefinitionEdit 7.11 kB
frontend/dist/DefinitionView 22.7 kB
frontend/dist/DestinationsScene 2.67 kB
frontend/dist/dist 575 B
frontend/dist/dockerfile 1.87 kB
frontend/dist/EarlyAccessFeature 753 B
frontend/dist/EarlyAccessFeatures 2.84 kB
frontend/dist/ecl 5.33 kB
frontend/dist/EditorScene 896 B
frontend/dist/elixir 10.3 kB
frontend/dist/elk.bundled 1.44 MB
frontend/dist/EmailMFAVerify 2.98 kB
frontend/dist/EndpointScene 37.5 kB
frontend/dist/EndpointsScene 22.1 kB
frontend/dist/ErrorTrackingConfigurationScene 2.2 kB
frontend/dist/ErrorTrackingIssueFingerprintsScene 6.98 kB
frontend/dist/ErrorTrackingIssueScene 81.7 kB
frontend/dist/ErrorTrackingScene 12.9 kB
frontend/dist/EvaluationTemplates 575 B
frontend/dist/EventsScene 2.46 kB
frontend/dist/exception-autocapture.js 11.8 kB
frontend/dist/Experiment 210 kB
frontend/dist/Experiments 17.7 kB
frontend/dist/exporter 20.9 MB
frontend/dist/exporter.js 20.9 MB
frontend/dist/ExportsScene 3.86 kB
frontend/dist/FeatureFlag 128 kB
frontend/dist/FeatureFlags 606 B
frontend/dist/FeatureFlagTemplatesScene 7.03 kB
frontend/dist/FlappyHog 5.78 kB
frontend/dist/flow9 1.8 kB
frontend/dist/freemarker2 16.7 kB
frontend/dist/fsharp 2.98 kB
frontend/dist/go 2.65 kB
frontend/dist/graphql 2.26 kB
frontend/dist/Group 14.4 kB
frontend/dist/Groups 3.91 kB
frontend/dist/GroupsNew 7.34 kB
frontend/dist/handlebars 7.34 kB
frontend/dist/hcl 3.59 kB
frontend/dist/HealthCategoryDetailScene 7.23 kB
frontend/dist/HealthScene 10.3 kB
frontend/dist/HeatmapNewScene 4.16 kB
frontend/dist/HeatmapRecordingScene 3.92 kB
frontend/dist/HeatmapScene 5.88 kB
frontend/dist/HeatmapsScene 3.88 kB
frontend/dist/hls 394 kB
frontend/dist/HogFunctionScene 58.7 kB
frontend/dist/HogRepl 7.37 kB
frontend/dist/html 5.58 kB
frontend/dist/htmlMode 4.62 kB
frontend/dist/image-blob-reduce.esm 49.4 kB
frontend/dist/InboxScene 59.7 kB
frontend/dist/index 311 kB
frontend/dist/index.js 311 kB
frontend/dist/ini 1.1 kB
frontend/dist/InsightOptions 5.41 kB
frontend/dist/InsightScene 28.9 kB
frontend/dist/IntegrationsRedirect 733 B
frontend/dist/intercom-integration.js 1.93 kB
frontend/dist/InviteSignup 14.4 kB
frontend/dist/java 3.22 kB
frontend/dist/javascript 985 B
frontend/dist/jsonMode 13.9 kB
frontend/dist/julia 7.22 kB
frontend/dist/kotlin 3.4 kB
frontend/dist/lazy 150 kB
frontend/dist/LegacyPluginScene 26.6 kB
frontend/dist/LemonTextAreaMarkdown 502 B
frontend/dist/less 3.9 kB
frontend/dist/lexon 2.44 kB
frontend/dist/lib 2.22 kB
frontend/dist/Link 468 B
frontend/dist/LinkScene 24.8 kB
frontend/dist/LinksScene 4.19 kB
frontend/dist/liquid 4.53 kB
frontend/dist/LiveDebugger 19.1 kB
frontend/dist/LiveEventsTable 2.98 kB
frontend/dist/LLMAnalyticsClusterScene 15.7 kB
frontend/dist/LLMAnalyticsClustersScene 43.1 kB
frontend/dist/LLMAnalyticsDatasetScene 19.7 kB
frontend/dist/LLMAnalyticsDatasetsScene 3.28 kB
frontend/dist/LLMAnalyticsEvaluation 58.7 kB
frontend/dist/LLMAnalyticsEvaluationsScene 29.5 kB
frontend/dist/LLMAnalyticsPlaygroundScene 36.3 kB
frontend/dist/LLMAnalyticsScene 117 kB
frontend/dist/LLMAnalyticsSessionScene 13.4 kB
frontend/dist/LLMAnalyticsTraceScene 127 kB
frontend/dist/LLMAnalyticsUsers 526 B
frontend/dist/LLMASessionFeedbackDisplay 4.83 kB
frontend/dist/LLMPromptScene 20.6 kB
frontend/dist/LLMPromptsScene 4.21 kB
frontend/dist/Login 8.57 kB
frontend/dist/Login2FA 4.2 kB
frontend/dist/logs.js 38.5 kB
frontend/dist/LogsScene 11.3 kB
frontend/dist/lua 2.11 kB
frontend/dist/m3 2.81 kB
frontend/dist/main 819 kB
frontend/dist/ManagedMigration 14 kB
frontend/dist/markdown 3.79 kB
frontend/dist/MarketingAnalyticsScene 39.7 kB
frontend/dist/MaterializedColumns 10.2 kB
frontend/dist/Max 835 B
frontend/dist/mdx 5.39 kB
frontend/dist/memlens.lib.bundle 27.8 kB
frontend/dist/MessageTemplate 16.3 kB
frontend/dist/MetricsScene 828 B
frontend/dist/mips 2.58 kB
frontend/dist/ModelsScene 13.6 kB
frontend/dist/MonacoDiffEditor 403 B
frontend/dist/monacoEditorWorker 288 kB
frontend/dist/monacoEditorWorker.js 288 kB
frontend/dist/monacoJsonWorker 419 kB
frontend/dist/monacoJsonWorker.js 419 kB
frontend/dist/monacoTsWorker 7.02 MB
frontend/dist/monacoTsWorker.js 7.02 MB
frontend/dist/MoveToPostHogCloud 4.46 kB
frontend/dist/msdax 4.91 kB
frontend/dist/mysql 11.3 kB
frontend/dist/NavTabChat 4.68 kB
frontend/dist/NewSourceWizard 724 B
frontend/dist/NewTabScene 681 B
frontend/dist/NodeDetailScene 16.3 kB
frontend/dist/NotebookCanvasScene 3.2 kB
frontend/dist/NotebookPanel 5.21 kB
frontend/dist/NotebookScene 8.21 kB
frontend/dist/NotebooksScene 7.58 kB
frontend/dist/OAuthAuthorize 573 B
frontend/dist/objective-c 2.41 kB
frontend/dist/Onboarding 699 kB
frontend/dist/OnboardingCouponRedemption 1.2 kB
frontend/dist/pascal 2.99 kB
frontend/dist/pascaligo 2 kB
frontend/dist/passkeyLogic 484 B
frontend/dist/PasswordReset 4.32 kB
frontend/dist/PasswordResetComplete 2.94 kB
frontend/dist/perl 8.25 kB
frontend/dist/PersonScene 16.1 kB
frontend/dist/PersonsScene 4.68 kB
frontend/dist/pgsql 13.5 kB
frontend/dist/php 8.02 kB
frontend/dist/PipelineStatusScene 6.22 kB
frontend/dist/pla 1.67 kB
frontend/dist/posthog 137 kB
frontend/dist/postiats 7.86 kB
frontend/dist/powerquery 16.9 kB
frontend/dist/powershell 3.27 kB
frontend/dist/PreflightCheck 5.53 kB
frontend/dist/product-tours.js 115 kB
frontend/dist/ProductTour 273 kB
frontend/dist/ProductTours 4.68 kB
frontend/dist/ProjectHomepage 24.7 kB
frontend/dist/protobuf 9.05 kB
frontend/dist/pug 4.82 kB
frontend/dist/python 4.76 kB
frontend/dist/qsharp 3.19 kB
frontend/dist/QueryPerformance 3.44 kB
frontend/dist/r 3.12 kB
frontend/dist/razor 9.35 kB
frontend/dist/recorder-v2.js 111 kB
frontend/dist/recorder.js 111 kB
frontend/dist/redis 3.55 kB
frontend/dist/redshift 11.8 kB
frontend/dist/RegionMap 29.4 kB
frontend/dist/render-query 20.6 MB
frontend/dist/render-query.js 20.6 MB
frontend/dist/ResourceTransfer 9.17 kB
frontend/dist/restructuredtext 3.9 kB
frontend/dist/RevenueAnalyticsScene 25.6 kB
frontend/dist/ruby 8.5 kB
frontend/dist/rust 4.16 kB
frontend/dist/SavedInsights 664 B
frontend/dist/sb 1.82 kB
frontend/dist/scala 7.32 kB
frontend/dist/scheme 1.76 kB
frontend/dist/scss 6.41 kB
frontend/dist/SdkDoctorScene 9.4 kB
frontend/dist/SessionAttributionExplorerScene 6.62 kB
frontend/dist/SessionGroupSummariesTable 4.62 kB
frontend/dist/SessionGroupSummaryScene 17 kB
frontend/dist/SessionProfileScene 15.8 kB
frontend/dist/SessionRecordingDetail 1.73 kB
frontend/dist/SessionRecordingFilePlaybackScene 4.46 kB
frontend/dist/SessionRecordings 742 B
frontend/dist/SessionRecordingsKiosk 8.84 kB
frontend/dist/SessionRecordingsPlaylistScene 4.14 kB
frontend/dist/SessionRecordingsSettingsScene 1.9 kB
frontend/dist/SessionsScene 3.86 kB
frontend/dist/SettingsScene 2.98 kB
frontend/dist/SharedMetric 4.83 kB
frontend/dist/SharedMetrics 549 B
frontend/dist/shell 3.07 kB
frontend/dist/SignupContainer 24.5 kB
frontend/dist/Site 1.18 kB
frontend/dist/solidity 18.6 kB
frontend/dist/sophia 2.76 kB
frontend/dist/SourcesScene 5.96 kB
frontend/dist/sourceWizardLogic 662 B
frontend/dist/sparql 2.55 kB
frontend/dist/sql 10.3 kB
frontend/dist/SqlVariableEditScene 7.24 kB
frontend/dist/st 7.4 kB
frontend/dist/StartupProgram 21.2 kB
frontend/dist/SubscriptionsScene 16.4 kB
frontend/dist/SupportSettingsScene 1.16 kB
frontend/dist/SupportTicketScene 23.4 kB
frontend/dist/SupportTicketsScene 733 B
frontend/dist/Survey 848 B
frontend/dist/SurveyFormBuilder 1.54 kB
frontend/dist/Surveys 18.2 kB
frontend/dist/surveys.js 90 kB
frontend/dist/SurveyWizard 64.2 kB
frontend/dist/swift 5.26 kB
frontend/dist/SystemStatus 16.8 kB
frontend/dist/systemverilog 7.61 kB
frontend/dist/TaskDetailScene 21.5 kB
frontend/dist/TaskTracker 13.2 kB
frontend/dist/tcl 3.57 kB
frontend/dist/TextCardMarkdownEditor 11 kB
frontend/dist/toolbar 10.7 MB
frontend/dist/toolbar.js 10.7 MB
frontend/dist/ToolbarLaunch 2.52 kB
frontend/dist/tracing-headers.js 1.74 kB
frontend/dist/TracingScene 29.4 kB
frontend/dist/TransformationsScene 1.91 kB
frontend/dist/tsMode 24 kB
frontend/dist/twig 5.97 kB
frontend/dist/TwoFactorReset 3.98 kB
frontend/dist/typescript 240 B
frontend/dist/typespec 2.82 kB
frontend/dist/Unsubscribe 1.62 kB
frontend/dist/UserInterview 4.53 kB
frontend/dist/UserInterviews 2.01 kB
frontend/dist/vb 5.79 kB
frontend/dist/VercelConnect 4.95 kB
frontend/dist/VercelLinkError 1.91 kB
frontend/dist/VerifyEmail 4.48 kB
frontend/dist/vimMode 211 kB
frontend/dist/VisualReviewRunScene 18.6 kB
frontend/dist/VisualReviewRunsScene 6.16 kB
frontend/dist/VisualReviewSettingsScene 10.8 kB
frontend/dist/web-vitals.js 6.39 kB
frontend/dist/WebAnalyticsScene 5.77 kB
frontend/dist/WebGLRenderer-DYjOwNoG 60.3 kB
frontend/dist/WebGPURenderer-B_wkl_Ja 36.3 kB
frontend/dist/WebScriptsScene 2.54 kB
frontend/dist/webworkerAll-puPV1rBA 324 B
frontend/dist/wgsl 7.34 kB
frontend/dist/Wizard 4.45 kB
frontend/dist/WorkflowScene 102 kB
frontend/dist/WorkflowsScene 46.9 kB
frontend/dist/WorldMap 4.73 kB
frontend/dist/xml 2.98 kB
frontend/dist/yaml 4.6 kB

compressed-size-action

@greptile-apps
Contributor

greptile-apps bot commented Apr 13, 2026


Reviews (1): Last reviewed commit: "feat(llma): wire up auto-create hook, ge..." | Re-trigger Greptile

Comment on lines +548 to +557
export const llmAnalyticsEvaluationReportsRunsRetrieve = async (
    projectId: string,
    id: string,
    options?: RequestInit
): Promise<EvaluationReportApi> => {
    return apiMutator<EvaluationReportApi>(getLlmAnalyticsEvaluationReportsRunsRetrieveUrl(projectId, id), {
        ...options,
        method: 'GET',
    })
}
Contributor

**P1: Wrong return type for `runs` endpoint**

The backend's `runs` action returns `EvaluationReportRunSerializer(runs, many=True).data` (a JSON array of run objects), but this generated function types the response as `Promise<EvaluationReportApi>` (a single report config object). Any frontend code calling this will receive an array, but TypeScript will treat it as a single `EvaluationReportApi`.

The root cause is that `EvaluationReportViewSet.runs` in `evaluation_reports.py` is missing `@extend_schema(responses=EvaluationReportRunSerializer(many=True))`, so drf-spectacular fell back to the viewset's default serializer class. Adding that decorator and regenerating will produce the correct list type here.

Member Author

Good catch — the runs action is indeed missing an @extend_schema annotation, so drf-spectacular falls back to the viewset default. Will add @extend_schema(responses=EvaluationReportRunSerializer(many=True)) to the runs action and regenerate the product-scoped types.
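
The fallback described in this thread can be pictured with a toy model. This is not drf-spectacular's code: `annotate_response`, `ToyViewSet`, and `documented_schema` are hypothetical names. The sketch only illustrates the idea that an action without an explicit response annotation is documented with the viewset-level default, even when it actually returns a list.

```python
def annotate_response(schema_name):
    """Hypothetical decorator: record an explicit response schema on an action."""
    def decorator(func):
        func.response_schema = schema_name
        return func
    return decorator


class ToyViewSet:
    default_schema = "EvaluationReport"  # stand-in for the viewset's serializer

    def runs_unannotated(self):
        return [{"id": 1}, {"id": 2}]  # really a list of runs

    @annotate_response("list[EvaluationReportRun]")
    def runs_annotated(self):
        return [{"id": 1}, {"id": 2}]


def documented_schema(viewset_cls, action_name):
    """Use the action's explicit schema if present, else the viewset default."""
    action = getattr(viewset_cls, action_name)
    return getattr(action, "response_schema", viewset_cls.default_schema)


print(documented_schema(ToyViewSet, "runs_unannotated"))  # EvaluationReport
print(documented_schema(ToyViewSet, "runs_annotated"))    # list[EvaluationReportRun]
```

Both toy actions return the same payload; only the annotated one is described correctly, which mirrors why the generated TypeScript type was wrong while the endpoint itself worked.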

Comment on lines 235 to +246
def perform_create(self, serializer):
    instance = serializer.save()

    # Auto-create a default report config so reports are generated from the start.
    # Users can later add email/Slack delivery targets if they want notifications.
    from products.llm_analytics.backend.models.evaluation_reports import EvaluationReport

    EvaluationReport.objects.create(
        team=self.team,
        evaluation=instance,
        start_date=instance.created_at,
    )
Contributor

**P1: Auto-create and evaluation save are not in the same transaction**

`serializer.save()` commits the evaluation in its own implicit transaction. If `EvaluationReport.objects.create()` then raises (e.g. a DB error or constraint violation), the user receives a 500 but the evaluation row is already persisted. They will likely retry, creating a duplicate evaluation, while the original silently exists without any report.

Wrap both operations in `transaction.atomic()` so a failed report creation rolls back the evaluation too:

```python
from django.db import transaction

def perform_create(self, serializer):
    with transaction.atomic():
        instance = serializer.save()
        from ..models.evaluation_reports import EvaluationReport
        EvaluationReport.objects.create(
            team=self.team,
            evaluation=instance,
            start_date=instance.created_at,
        )
    # tracking/analytics below can remain outside the transaction
```

Member Author

Valid point. Wrapping both in transaction.atomic() is the right call — if the report creation fails we shouldn't leave an orphaned evaluation. Will fix.
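
The rollback behaviour discussed in this thread can be tried out without Django. The sketch below uses the standard-library `sqlite3` module as a stand-in for `transaction.atomic()`; the table layout and the `create_evaluation_with_report` helper are hypothetical and are not the PR's schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evaluation (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE report (id INTEGER PRIMARY KEY, "
    "evaluation_id INTEGER NOT NULL, frequency TEXT NOT NULL)"
)
conn.commit()


def create_evaluation_with_report(conn, name, frequency):
    """Insert an evaluation and its report as one unit of work."""
    # `with conn` commits on success and rolls back if the block raises.
    with conn:
        cur = conn.execute("INSERT INTO evaluation (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO report (evaluation_id, frequency) VALUES (?, ?)",
            (cur.lastrowid, frequency),
        )


create_evaluation_with_report(conn, "ok", "every_n")

try:
    # frequency=None violates NOT NULL, so the report insert fails...
    create_evaluation_with_report(conn, "broken", None)
except sqlite3.IntegrityError:
    pass

# ...and the "broken" evaluation row is rolled back together with it.
names = [row[0] for row in conn.execute("SELECT name FROM evaluation")]
report_count = conn.execute("SELECT COUNT(*) FROM report").fetchone()[0]
print(names, report_count)
```

Only the successful pair survives, so a retry after the failure would not leave an evaluation without a report behind.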


# Auto-create a default report config so reports are generated from the start.
# Users can later add email/Slack delivery targets if they want notifications.
from products.llm_analytics.backend.models.evaluation_reports import EvaluationReport
Contributor

**P2: Deferred import should be at module level**

There's no circular-import risk here: `evaluation_reports.py` only imports from `posthog.models.utils` and standard library modules. Move this to the top of the file alongside the other `..models.*` imports, using the same relative-import style:

```suggestion
        from ..models.evaluation_reports import EvaluationReport
```

Member Author

Agreed, no circular import risk here. Will move to module-level alongside the other ..models.* imports.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5be7f452d


Comment on lines +552 to +553
): Promise<EvaluationReportApi> => {
    return apiMutator<EvaluationReportApi>(getLlmAnalyticsEvaluationReportsRunsRetrieveUrl(projectId, id), {

P2: Use run-list response type for runs endpoint

llmAnalyticsEvaluationReportsRunsRetrieve is typed as returning EvaluationReportApi, but the backend runs action actually returns a list of report-run records (EvaluationReportRunSerializer(..., many=True) in products/llm_analytics/backend/api/evaluation_reports.py). This mismatch means frontend callers generated from this SDK will treat an array payload like a single object, which can lead to incorrect UI logic or runtime undefined field access once this endpoint is used.

Member Author

Same issue as the greptile comment above — missing @extend_schema(responses=EvaluationReportRunSerializer(many=True)) on the runs action. Will fix together with that one.

@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch from 2eb779d to da19b28 Compare April 13, 2026 22:27
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch 2 times, most recently from 2f4ca4d to 6243131 Compare April 13, 2026 22:34
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch from da19b28 to 5859546 Compare April 13, 2026 22:34
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch from 6243131 to fd3fff7 Compare April 13, 2026 22:39
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch 2 times, most recently from 409b90c to d8a5ccf Compare April 13, 2026 22:46
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch from fd3fff7 to 2ee84b8 Compare April 13, 2026 22:46
Auto-create an EvaluationReport when an evaluation is created via
perform_create hook. Add product-scoped generated API types and
MCP tool definitions for evaluation reports.

Note: global generated files (schema.json, schema.py, snapshots,
MCP codegen) should be regenerated via `hogli build:openapi` after
merge rather than cherry-picked.
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-5-wiring-generated branch from 2ee84b8 to ef36efc Compare April 13, 2026 22:48
@andrewm4894 andrewm4894 force-pushed the andy/llma-eval-reports-4-frontend branch from d8a5ccf to cb89a1b Compare April 13, 2026 22:48