Add conversation summarization warehouse + nightly batch (tiering Fase 3) by renanfulas · Pull Request #103 · renanfulas/supportFAQagent

renanfulas · 2026-06-29T18:45:58Z

What was done

Slice 5 of the layered-persistence plan: the summary warehouse plus the nightly summarization batch, dark by default.

Changes

Migration 012_conversation_summaries.sql: table with UNIQUE(domain, conversation_key) (idempotency) + a CHECK on status.
app/conversations/summary.py (testable core): build_transcript redacts PAN/PII before the model sees the text, the prompt, a robust parse_summary_json, and an idempotent run_summary_batch (upsert).
scripts/summarize_conversations.py: the operational batch. Eligible = inactive conversation (--inactivity-hours), ≥ --min-turns, not yet summarized. --dry-run does not call the model; the script refuses to write without ENABLE_CONVERSATION_SUMMARY=true.
customer_ref = customer_id, otherwise session_hash (never a raw phone number or session_id).
Config ENABLE_CONVERSATION_SUMMARY (default false). Fase 3 marked in the tech plan.
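The redact-before-model step and the customer_ref fallback described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in app/conversations/summary.py: the regex, the Luhn filter, the `[PAN]` placeholder, and the function signatures are all assumptions, and it covers card numbers only, not the wider PII handling the PR mentions.

```python
import re

# Assumption: a PAN candidate is 13-19 digits, optionally separated by
# single spaces or hyphens. The real redaction rules may differ.
_PAN_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def _luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_pan(text: str) -> str:
    """Replace Luhn-valid card-number candidates with a placeholder."""
    def _sub(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group(0))
        return "[PAN]" if _luhn_ok(digits) else match.group(0)

    return _PAN_CANDIDATE.sub(_sub, text)


def build_transcript(turns):
    """Join (role, text) turns, redacting each one before it leaves the process."""
    return "\n".join(f"{role}: {redact_pan(text)}" for role, text in turns)


def customer_ref(customer_id, session_hash):
    """Prefer customer_id; fall back to the session hash, never a raw identifier."""
    return customer_id if customer_id else session_hash
```

Redacting inside build_transcript means any caller that builds a prompt from the transcript cannot bypass the redaction by accident.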

Guarantees

Dark by default: the table stays empty until the batch runs with the flag on; nothing is added to the hot path.
Idempotent per conversation (re-running overwrites the same record).
PII/PAN is redacted before anything is sent to the model (tested).
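The per-conversation idempotency guarantee can be illustrated with an upsert against the UNIQUE(domain, conversation_key) constraint. The sketch below uses an in-memory SQLite database (3.24 or newer) as a stand-in for Postgres. The column list and the allowed status values are assumptions for illustration, not the actual schema of migration 012.

```python
import sqlite3

# Assumption: columns and status values are illustrative only.
DDL = """
CREATE TABLE conversation_summaries (
    domain TEXT NOT NULL,
    conversation_key TEXT NOT NULL,
    problem TEXT NOT NULL,
    solution TEXT NOT NULL,
    status TEXT NOT NULL
        CHECK (status IN ('resolved', 'unresolved', 'escalated')),
    UNIQUE (domain, conversation_key)
)
"""

UPSERT = """
INSERT INTO conversation_summaries
    (domain, conversation_key, problem, solution, status)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (domain, conversation_key) DO UPDATE SET
    problem = excluded.problem,
    solution = excluded.solution,
    status = excluded.status
"""


def upsert_summary(conn, domain, key, problem, solution, status):
    """Insert the summary, or overwrite the existing row for the same conversation."""
    with conn:  # commit on success, roll back on error
        conn.execute(UPSERT, (domain, key, problem, solution, status))


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
# Running the batch twice for the same conversation leaves exactly one row.
upsert_summary(conn, "support", "conv-1", "login fails", "reset password", "unresolved")
upsert_summary(conn, "support", "conv-1", "login fails", "reset password", "resolved")
rows = conn.execute(
    "SELECT conversation_key, status FROM conversation_summaries"
).fetchall()
```

The same `INSERT ... ON CONFLICT ... DO UPDATE` form exists in Postgres, so the unique constraint rather than application logic enforces one row per conversation.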

Validation

python -m pytest → 618 passed, 33 skipped (+9 unit)
python -m compileall app scripts tests ✓
Real-Postgres integration test (write + idempotency + skip-trivial) in the phase0-gates.yml gate.

Pending (next slices)

Scheduling (systemd timer + runbook), consumption in the RAG (Fase 4, behind eval), cost metric.

🤖 Generated with Claude Code

Adds the concrete operational design agreed in the 2026-06-29 architecture discussion: the per-turn flow (one synchronous Postgres transaction that persists the turn and enqueues the outbox; fail-open hot state; async off-box backup via the worker) and the handoff flow (the support_case + handoff.requested in the same sync transaction is the consistency gate; async external delivery). Clarifies that Redis only ever enters as a non-authoritative hot-state backend / read cache (levels 1-2), never as the durability anchor. Makes the plan self-contained. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
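The per-turn flow described in this commit (one synchronous transaction that both persists the turn and enqueues the outbox row) might look roughly like the sketch below. It uses in-memory SQLite in place of Postgres. The table layouts, the `turn.persisted` event name, and the `persist_turn` signature are hypothetical and only illustrate the pattern.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with hypothetical messages/outbox tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY,
            conversation_key TEXT NOT NULL,
            role TEXT NOT NULL,
            text TEXT NOT NULL
        );
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        """
    )
    return conn


def persist_turn(conn, conversation_key, role, text, event_type="turn.persisted"):
    """Write the turn and its outbox event in one transaction.

    If either insert fails, both are rolled back, so a worker draining the
    outbox never sees an event for a turn that was not stored (or the reverse).
    """
    with conn:  # one transaction: commit both inserts or neither
        cur = conn.execute(
            "INSERT INTO messages (conversation_key, role, text) VALUES (?, ?, ?)",
            (conversation_key, role, text),
        )
        message_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            (event_type, json.dumps({"message_id": message_id})),
        )
    return message_id
```

The asynchronous off-box backup would then be a separate worker reading the outbox table, which is outside this sketch.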

First code slice of the layered-persistence plan: a stable, swappable hot-tier session-state seam mirroring ConversationArchiveSink. Adds app/conversations/session_state.py (SessionState, SessionStateStore Protocol, InMemorySessionStateStore with TTL, build_session_state_store_from_env), the SESSION_STATE_BACKEND / SESSION_STATE_TTL_SECONDS config, and wires ChatFlowService to write the state fail-open via an answer() wrapper (keyed by hash_session, never a raw session_id). The store is process-wide in app.state and injected by the /chat and /web routes only under PERSISTENCE_BACKEND=postgres. Default no-op: with no store injected the hot path is byte-for-byte unchanged. The state is non-authoritative (truth stays in the Postgres write-through). The reader, the Redis backend (Nível 1), and migrating the transport escape state come next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
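A TTL-bound in-memory store with a fail-open write, as this commit describes, could be sketched as below. The class names come from the commit message, but the fields, method names (`put`, `get`), the injectable clock, and the `write_state_fail_open` helper are assumptions, not the actual contents of app/conversations/session_state.py.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional, Protocol


@dataclass
class SessionState:
    session_hash: str  # keyed by the hashed session, never a raw session_id
    data: dict = field(default_factory=dict)


class SessionStateStore(Protocol):
    def put(self, state: SessionState) -> None: ...
    def get(self, session_hash: str) -> Optional[SessionState]: ...


class InMemorySessionStateStore:
    """Non-authoritative hot state; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: dict = {}  # session_hash -> (expires_at, state)

    def put(self, state: SessionState) -> None:
        self._items[state.session_hash] = (self._clock() + self._ttl, state)

    def get(self, session_hash: str) -> Optional[SessionState]:
        entry = self._items.get(session_hash)
        if entry is None:
            return None
        expires_at, state = entry
        if self._clock() >= expires_at:
            del self._items[session_hash]  # lazily evict the expired entry
            return None
        return state


def write_state_fail_open(store: Optional[SessionStateStore], state: SessionState) -> bool:
    """Best-effort write: a missing or failing store never breaks the chat turn."""
    if store is None:
        return False  # no store injected: the hot path is unchanged
    try:
        store.put(state)
        return True
    except Exception:
        return False  # swallow the error; Postgres remains the source of truth
```

Injecting the clock keeps expiry testable without sleeping, and swallowing store errors is only safe because the state is explicitly non-authoritative.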

…e 3) Adds the warehouse table (migration 012_conversation_summaries with UNIQUE per conversation for idempotency) and the nightly summarization batch: a testable core (app/conversations/summary.py) that redacts every turn BEFORE the model, asks a cheap model for a structured {problem, solution, status} record, and upserts it idempotently; plus the operational script scripts/summarize_conversations.py (eligible = inactive >= N hours, >= min turns, not yet summarized; --dry-run never calls the model; refuses to write without ENABLE_CONVERSATION_SUMMARY). customer_ref prefers customer_id, falls back to session_hash; never a raw session_id/phone. Dark by default (ENABLE_CONVERSATION_SUMMARY=false; the table is empty until the batch runs). Covered by unit tests (incl. PAN redacted before the model) and a gated real-Postgres integration test wired into phase0-gates. Pending: systemd timer + runbook, RAG consumption (Fase 4, behind eval), cost metric. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
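The eligibility rule and the structured {problem, solution, status} parse mentioned in this commit can be sketched as two small pure functions. This is an illustration under assumptions: the allowed status values, the tolerance for text around the JSON object, and both signatures are hypothetical and may not match parse_summary_json or the script's real query.

```python
import json
from datetime import datetime, timedelta
from typing import Optional

# Assumption: the real CHECK constraint may allow a different set of values.
ALLOWED_STATUSES = {"resolved", "unresolved", "escalated"}


def parse_summary_json(raw: str) -> Optional[dict]:
    """Extract a {problem, solution, status} record from model output.

    Tolerates text or code fences around the JSON object and returns None
    instead of raising when the output is unusable.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    fields = [data.get(name) for name in ("problem", "solution", "status")]
    if not all(isinstance(v, str) and v.strip() for v in fields):
        return None
    problem, solution, status = (v.strip() for v in fields)
    status = status.lower()
    if status not in ALLOWED_STATUSES:
        return None
    return {"problem": problem, "solution": solution, "status": status}


def is_eligible(
    last_activity: datetime,
    turn_count: int,
    already_summarized: bool,
    now: datetime,
    inactivity_hours: float,
    min_turns: int,
) -> bool:
    """Inactive for at least N hours, enough turns, and not yet summarized."""
    inactive = now - last_activity >= timedelta(hours=inactivity_hours)
    return inactive and turn_count >= min_turns and not already_summarized
```

Returning None on bad output lets the batch skip a conversation and retry it on a later run rather than writing a partial record.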

renanfulas and others added 5 commits June 29, 2026 15:21

Fix integration seed: messages.redaction_version is required post-007

364f54f

Scope summary integration assertions to the seeded conversation_key

c75fb35

renanfulas merged commit 699fcad into main Jun 29, 2026
6 checks passed

renanfulas merged 5 commits into main from codex/conversation-summary-warehouse

