Add conversation summarization warehouse + nightly batch (tiering Phase 3) #103
Merged
Adds the concrete operational design agreed in the 2026-06-29 architecture discussion: the per-turn flow (one synchronous Postgres transaction that persists the turn and enqueues the outbox; fail-open hot state; async off-box backup via the worker) and the handoff flow (the support_case + handoff.requested in the same sync transaction is the consistency gate; async external delivery). Clarifies that Redis only ever enters as a non-authoritative hot-state backend / read cache (levels 1-2), never as the durability anchor. Makes the plan self-contained.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
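As an illustration of the per-turn flow described in this commit, here is a minimal sketch of the "persist the turn and enqueue the outbox in one transaction" step, followed by a fail-open hot-state write. It uses the standard-library sqlite3 module as a stand-in for Postgres. The table names (turns, outbox), the topic string turn.persisted, and the functions open_db and persist_turn are hypothetical and do not come from this PR.

```python
import json
import sqlite3
from typing import Callable, Optional


def open_db() -> sqlite3.Connection:
    # SQLite stands in for Postgres; both tables are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE turns (id INTEGER PRIMARY KEY, "
        "session_hash TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY, "
        "topic TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def persist_turn(
    conn: sqlite3.Connection,
    session_hash: str,
    body: str,
    hot_state: Optional[Callable[[str, int], None]] = None,
) -> int:
    # One transaction: the turn row and its outbox row commit together,
    # or both roll back if either INSERT raises.
    with conn:
        cur = conn.execute(
            "INSERT INTO turns (session_hash, body) VALUES (?, ?)",
            (session_hash, body),
        )
        turn_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("turn.persisted", json.dumps({"turn_id": turn_id})),
        )
    # The hot state is written only after the commit and fails open:
    # an error here never undoes or blocks the durable write.
    if hot_state is not None:
        try:
            hot_state(session_hash, turn_id)
        except Exception:
            pass
    return turn_id
```

In this sketch a separate worker would later read the outbox rows and perform the asynchronous off-box delivery; that part is left out.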
First code slice of the layered-persistence plan: a stable, swappable hot-tier session-state seam mirroring ConversationArchiveSink. Adds app/conversations/session_state.py (SessionState, SessionStateStore Protocol, InMemorySessionStateStore with TTL, build_session_state_store_from_env), the SESSION_STATE_BACKEND / SESSION_STATE_TTL_SECONDS config, and wires ChatFlowService to write the state fail-open via an answer() wrapper (keyed by hash_session, never a raw session_id). The store is process-wide in app.state and injected by the /chat and /web routes only under PERSISTENCE_BACKEND=postgres. Default no-op: with no store injected the hot path is byte-for-byte unchanged. The state is non-authoritative (truth stays in the Postgres write-through). The reader, the Redis backend (Level 1), and migrating the transport escape state come next.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
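To make the session-state seam above more concrete, here is a rough sketch of a Protocol-based store with a TTL and a fail-open write. The class names echo the ones listed in the commit, but everything else is an assumption: the last_intent field, the put/get method names, the injectable clock, and the write_state_fail_open helper are illustrative and may not match the real module.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Protocol, Tuple


@dataclass(frozen=True)
class SessionState:
    last_intent: str  # hypothetical field, for illustration only


class SessionStateStore(Protocol):
    # Assumed interface: any backend (in-memory, Redis, ...) offering these
    # two methods can be swapped in without touching the caller.
    def put(self, session_hash: str, state: SessionState) -> None: ...
    def get(self, session_hash: str) -> Optional[SessionState]: ...


class InMemorySessionStateStore:
    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, SessionState]] = {}

    def put(self, session_hash: str, state: SessionState) -> None:
        # Store the expiry time alongside the state.
        self._items[session_hash] = (self._clock() + self._ttl, state)

    def get(self, session_hash: str) -> Optional[SessionState]:
        entry = self._items.get(session_hash)
        if entry is None:
            return None
        expires_at, state = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._items[session_hash]
            return None
        return state


def write_state_fail_open(
    store: Optional[SessionStateStore], session_hash: str, state: SessionState
) -> bool:
    # No store injected: a no-op, so the hot path behaves as before.
    if store is None:
        return False
    try:
        store.put(session_hash, state)
        return True
    except Exception:
        # Fail open: the state is non-authoritative, so errors are swallowed.
        return False
```

The key passed in is always the session hash; the sketch never sees a raw session_id.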
…e 3)
Adds the warehouse table (migration 012_conversation_summaries with UNIQUE per
conversation for idempotency) and the nightly summarization batch: a testable core
(app/conversations/summary.py) that redacts every turn BEFORE the model, asks a
cheap model for a structured {problem, solution, status} record, and upserts it
idempotently; plus the operational script scripts/summarize_conversations.py
(eligible = inactive >= N hours, >= min turns, not yet summarized; --dry-run never
calls the model; refuses to write without ENABLE_CONVERSATION_SUMMARY). customer_ref
prefers customer_id, falls back to session_hash; never a raw session_id/phone.
Dark by default (ENABLE_CONVERSATION_SUMMARY=false; the table is empty until the
batch runs). Covered by unit tests (incl. PAN redacted before the model) and a
gated real-Postgres integration test wired into phase0-gates. Pending: systemd
timer + runbook, RAG consumption (Phase 4, behind eval), cost metric.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
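The core described in this commit (redact before the model, parse a structured record, upsert idempotently) can be sketched roughly as follows. The function names are borrowed from the PR, but their signatures and bodies here are guesses. The PAN regex, the allowed status values, the table columns, the default domain value, and the create_table helper are assumptions, and sqlite3 stands in for Postgres.

```python
import json
import re
import sqlite3
from typing import Callable, Dict, Iterable, List, Optional

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or dashes.
PAN_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
ALLOWED_STATUS = {"resolved", "unresolved", "escalated"}  # hypothetical values
FIELDS = ("problem", "solution", "status")


def build_transcript(turns: Iterable[str]) -> str:
    # Redaction happens here, before any text reaches the model.
    return "\n".join(PAN_RE.sub("[PAN]", turn) for turn in turns)


def parse_summary_json(raw: str) -> Optional[Dict[str, str]]:
    # Tolerate chatter around the JSON object; reject anything malformed.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    if not all(isinstance(data.get(key), str) for key in FIELDS):
        return None
    if data["status"] not in ALLOWED_STATUS:
        return None
    return {key: data[key] for key in FIELDS}


def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE conversation_summaries ("
        "domain TEXT NOT NULL, conversation_key TEXT NOT NULL, "
        "problem TEXT NOT NULL, solution TEXT NOT NULL, status TEXT NOT NULL, "
        "UNIQUE (domain, conversation_key))"
    )


def run_summary_batch(
    conn: sqlite3.Connection,
    conversations: Dict[str, List[str]],
    model: Callable[[str], str],
    domain: str = "support",
) -> int:
    written = 0
    for key, turns in conversations.items():
        summary = parse_summary_json(model(build_transcript(turns)))
        if summary is None:
            continue  # skip unparseable output instead of writing garbage
        with conn:
            # The UNIQUE constraint makes a re-run update the row in place.
            conn.execute(
                "INSERT INTO conversation_summaries "
                "(domain, conversation_key, problem, solution, status) "
                "VALUES (?, ?, ?, ?, ?) "
                "ON CONFLICT (domain, conversation_key) DO UPDATE SET "
                "problem = excluded.problem, solution = excluded.solution, "
                "status = excluded.status",
                (domain, key, summary["problem"], summary["solution"],
                 summary["status"]),
            )
        written += 1
    return written
```

Passing the model as a plain callable keeps the core testable: a unit test can supply a fake that records what it was sent and check that no card number reached it.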
What was done
Slice 5 of the layered-persistence plan: the summaries warehouse plus the nightly summarization batch, dark by default.
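A sketch of how the "dark by default" gate and the eligibility rule for the batch might fit together. The flag names --inactivity-hours, --min-turns and --dry-run and the ENABLE_CONVERSATION_SUMMARY variable come from this PR. The default values, the exact parsing of the variable, and the helper names (summary_writes_enabled, is_eligible, plan_run) are assumptions made for illustration.

```python
import argparse
from datetime import datetime, timedelta, timezone
from typing import Mapping, Sequence


def summary_writes_enabled(env: Mapping[str, str]) -> bool:
    # Assumed parsing: only the literal "true" (case-insensitive) enables writes.
    return env.get("ENABLE_CONVERSATION_SUMMARY", "false").strip().lower() == "true"


def is_eligible(
    last_activity: datetime,
    turn_count: int,
    already_summarized: bool,
    now: datetime,
    inactivity_hours: float,
    min_turns: int,
) -> bool:
    # Eligible = not yet summarized, enough turns, and inactive long enough.
    if already_summarized or turn_count < min_turns:
        return False
    return now - last_activity >= timedelta(hours=inactivity_hours)


def plan_run(argv: Sequence[str], env: Mapping[str, str]) -> str:
    parser = argparse.ArgumentParser()
    # Default values below are hypothetical.
    parser.add_argument("--inactivity-hours", type=float, default=24.0)
    parser.add_argument("--min-turns", type=int, default=2)
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)
    if args.dry_run:
        return "dry-run"  # list candidates only; the model is never called
    if not summary_writes_enabled(env):
        return "refused"  # dark by default: no flag, no writes
    return "write"
```

For example, plan_run([], {}) yields "refused", while adding --dry-run yields "dry-run" regardless of the environment.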
Changes
- 012_conversation_summaries.sql: table with UNIQUE(domain, conversation_key) (idempotency) + CHECK on status.
- app/conversations/summary.py (testable core): build_transcript redacts PAN/PII before the model; prompt; robust parse_summary_json; idempotent run_summary_batch (upsert).
- scripts/summarize_conversations.py: operational batch. Eligible = inactive conversation (--inactivity-hours), ≥ --min-turns, not yet summarized. --dry-run does not call the model; refuses to write without ENABLE_CONVERSATION_SUMMARY=true.
- customer_ref = customer_id, otherwise session_hash (never a raw phone number/session_id).
- ENABLE_CONVERSATION_SUMMARY (default false). Tech-plan Phase 3 marked off.
Guarantees
Validation
- python -m pytest → 618 passed, 33 skipped (+9 unit)
- python -m compileall app scripts tests ✓
- phase0-gates.yml.
Pending (next slices)
Scheduling (systemd timer + runbook), RAG consumption (Phase 4, behind eval), cost metric.
🤖 Generated with Claude Code