
# Architecture

## Boundaries

- `src/lib/memory/`
  - `types.ts`: thread, event, and context contracts
  - `memory-provider.ts`: backend-agnostic memory interface (sketched after this list)
  - `cortex-http-provider.ts`: CortexLTM API implementation (the UI never writes SQL)
- `src/lib/llm/`
  - `llm-provider.ts`: model streaming interface
  - `default-llm-provider.ts`: OpenAI/Groq streaming provider for demo/local mode; injects the soul-contract system prompt
- `src/lib/server/`
  - `providers.ts`: provider selection + singleton lifecycle
  - `user-id.ts`: stable user-ID resolver shim
  - `http.ts`: shared API error-payload helper
- `src/app/api/chat/`
  - `threads/route.ts`: list/create threads
  - `[threadId]/route.ts`: rename/delete a thread
  - `[threadId]/messages/route.ts`: message reads + the ordered write/stream endpoint
  - `[threadId]/messages/[messageId]/reaction/route.ts`: reaction writes for assistant messages
  - `[threadId]/promote/route.ts`: promote a thread to core memory
  - `[threadId]/summary/route.ts`: optional summary fetch
- `src/components/chat/`
  - `chat-shell.tsx`: page-level composition
  - `message-list.tsx`: scrolling transcript + typing indicator
  - `message-item.tsx`: bubble rendering, assistant animation, and reaction-tray interactions
  - `composer.tsx`: input/send UX
  - `typing-indicator.tsx`: in-flight visual
- `src/hooks/use-chat.ts`
  - client state machine: thread bootstrap, optimistic adds, stream consumption, error handling
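
The memory seam is the main extension point. As a rough illustration of the contract `memory-provider.ts` defines, here is one plausible shape; the method names and types below are assumptions, not the repo's actual interface:

```ts
// Illustrative sketch only -- the real contracts live in src/lib/memory/types.ts
// and src/lib/memory/memory-provider.ts; all names here are assumptions.

export interface ThreadSummary {
  id: string;
  title: string;
  updatedAt: string; // ISO timestamp
}

export interface MemoryEvent {
  id: string;
  threadId: string;
  role: "user" | "assistant";
  content: string;
  source: "chatui" | "chatui_llm";
  meta?: { reaction?: string | null };
}

export interface MemoryProvider {
  listThreads(userId: string): Promise<ThreadSummary[]>;
  createThread(userId: string, title?: string): Promise<ThreadSummary>;
  renameThread(threadId: string, title: string): Promise<void>;
  deleteThread(threadId: string): Promise<void>;
  listEvents(threadId: string): Promise<MemoryEvent[]>;
  appendEvent(event: Omit<MemoryEvent, "id">): Promise<MemoryEvent>;
}
```

The key property is that the contract is promise-based and SQL-free, so `cortex-http-provider.ts` can satisfy it over HTTP and a demo backend can satisfy it in memory.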

## Swap Points

- Memory backend: implement `MemoryProvider`, then switch the selection in `getMemoryProvider` (sketched below).
- Model provider: owned by the CortexLTM backend when `CHAT_DEMO_MODE=false`.
- UI composition: keep the message contract (`UIMessage`) stable and replace components independently.
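
A minimal sketch of the selection and singleton lifecycle that `getMemoryProvider` in `src/lib/server/providers.ts` implies; the `DemoMemoryProvider` name, import paths, `CORTEX_LTM_URL` env var, and default URL are assumptions:

```ts
// Sketch of provider selection + singleton lifecycle (names are assumptions).
import type { MemoryProvider } from "@/lib/memory/memory-provider";
import { CortexHttpProvider } from "@/lib/memory/cortex-http-provider";
import { DemoMemoryProvider } from "@/lib/memory/demo-memory-provider"; // hypothetical

let memoryProvider: MemoryProvider | undefined;

export function getMemoryProvider(): MemoryProvider {
  // Singleton lifecycle: construct once per server process, reuse across requests.
  if (!memoryProvider) {
    memoryProvider =
      process.env.CHAT_DEMO_MODE === "true"
        ? new DemoMemoryProvider()
        : new CortexHttpProvider(
            process.env.CORTEX_LTM_URL ?? "http://localhost:8000" // assumed default
          );
  }
  return memoryProvider;
}
```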

## Request Lifecycle (`POST /api/chat/[threadId]/messages`)

1. Validate the payload.
2. If `CHAT_DEMO_MODE=true`, stream local demo output.
3. Otherwise, proxy the chat request directly to CortexLTM `/v1/threads/{threadId}/chat` (see the handler sketch after this list).
4. CortexLTM performs the ordered writes, context build, and model call:
   - persist the user event (source: `chatui`)
   - build context (summary cues + semantic cues + short-term events)
   - generate the assistant response
   - persist the assistant event (source: `chatui_llm`)
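
A simplified sketch of how the messages route could implement steps 1–3; the env var names, the `streamDemoResponse` helper, and the exact upstream headers are assumptions, while the demo-mode branch and the direct proxy follow the lifecycle above:

```ts
// Sketch of src/app/api/chat/[threadId]/messages/route.ts (simplified).
import { NextRequest, NextResponse } from "next/server";

export async function POST(
  req: NextRequest,
  { params }: { params: { threadId: string } }
) {
  const body = await req.json();

  // 1. Validate the payload.
  if (typeof body?.content !== "string" || body.content.length === 0) {
    return NextResponse.json({ error: "content required" }, { status: 400 });
  }

  // 2. Demo mode: stream locally generated output instead of calling the backend.
  if (process.env.CHAT_DEMO_MODE === "true") {
    return streamDemoResponse(body.content);
  }

  // 3. Proxy to CortexLTM, which owns the ordered writes, context, and model call.
  const upstream = await fetch(
    `${process.env.CORTEX_LTM_URL}/v1/threads/${params.threadId}/chat`,
    {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    }
  );

  // Pass the upstream stream and status through unchanged (see Error Propagation).
  return new Response(upstream.body, {
    status: upstream.status,
    headers: {
      "content-type": upstream.headers.get("content-type") ?? "text/plain",
    },
  });
}

// Hypothetical demo streamer: emits a single canned chunk.
function streamDemoResponse(prompt: string): Response {
  const stream = new ReadableStream({
    start(controller) {
      controller.enqueue(new TextEncoder().encode(`(demo) You said: ${prompt}`));
      controller.close();
    },
  });
  return new Response(stream, { headers: { "content-type": "text/plain" } });
}
```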

## Reaction Lifecycle (`POST /api/chat/[threadId]/messages/[messageId]/reaction`)

1. Validate the reaction (`thumbs_up`, `heart`, `angry`, `sad`, `brain`) or a clear (`null`).
2. Proxy the write to the CortexLTM reaction endpoint for the target assistant event.
3. The UI applies an optimistic update to the message's `meta.reaction` and reconciles with the server response.
4. Selecting `brain` triggers `force_update_summary(thread_id)` in CortexLTM.
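
A sketch of the route-side half of this lifecycle (steps 1–2); the upstream reaction path and env var are assumptions, and step 4's summary refresh happens inside CortexLTM, not here:

```ts
// Sketch of the reaction route handler (simplified; upstream path is assumed).
import { NextRequest, NextResponse } from "next/server";

const REACTIONS = new Set(["thumbs_up", "heart", "angry", "sad", "brain"]);

export async function POST(
  req: NextRequest,
  { params }: { params: { threadId: string; messageId: string } }
) {
  const { reaction } = await req.json();

  // 1. Validate: one of the known reactions, or null to clear.
  if (reaction !== null && !REACTIONS.has(reaction)) {
    return NextResponse.json({ error: "unknown reaction" }, { status: 400 });
  }

  // 2. Proxy the write to CortexLTM; the backend owns the brain -> summary
  //    side effect (force_update_summary) itself.
  const upstream = await fetch(
    `${process.env.CORTEX_LTM_URL}/v1/threads/${params.threadId}` +
      `/events/${params.messageId}/reaction`, // assumed upstream path
    {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ reaction }),
    }
  );

  return NextResponse.json(await upstream.json(), { status: upstream.status });
}
```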

## Error Propagation

- CortexUI preserves the upstream CortexLTM HTTP status and error details for thread/message routes.
- This avoids masking backend failures as generic 503 responses, which makes operational debugging faster.
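
A minimal sketch of the kind of helper `src/lib/server/http.ts` could expose for this; the payload shape is an assumption, but the essential move is forwarding the upstream status instead of substituting a 503:

```ts
// Sketch of a shared error-forwarding helper (payload shape is assumed).
import { NextResponse } from "next/server";

export async function forwardUpstreamError(upstream: Response): Promise<NextResponse> {
  const raw = await upstream.text();
  let detail: unknown = raw;
  try {
    detail = JSON.parse(raw); // keep structured CortexLTM error details when present
  } catch {
    // not JSON; fall back to the raw text body
  }
  return NextResponse.json(
    { error: "upstream_error", status: upstream.status, detail },
    { status: upstream.status } // preserve the real HTTP status, not a generic 503
  );
}
```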

## Cue Policy (v1 parity)

- Summary cues: `recap`, `summarize`, `catch me up`, `where were we`, `continue`
- Semantic cues: `remember`, `what did i say`, `what was the plan`, `who am i`, `my name`
- Memory blocks are prepended as system messages before the short-term turns.
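
A plausible v1-parity implementation is simple case-insensitive substring matching over the user's message; the function and constant names here are illustrative:

```ts
// Sketch of v1 cue matching (names are illustrative).
const SUMMARY_CUES = ["recap", "summarize", "catch me up", "where were we", "continue"];
const SEMANTIC_CUES = ["remember", "what did i say", "what was the plan", "who am i", "my name"];

export function detectCues(userMessage: string) {
  const text = userMessage.toLowerCase();
  return {
    wantsSummary: SUMMARY_CUES.some((cue) => text.includes(cue)),
    wantsSemantic: SEMANTIC_CUES.some((cue) => text.includes(cue)),
  };
}
```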

## Markdown Rendering Resilience

- `src/components/chat/message-item.tsx` parses streamed assistant content defensively.
- Code-fence parsing (sketched below) handles:
  - optional list markers before opening fences
  - trailing text after closing fences
  - missing closing fences, by flushing the buffered code as a code block
- Goal: keep transcript rendering and copy behavior stable under imperfect streamed model output.
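
A condensed sketch of the defensive fence handling described above; the segment shape and function name are illustrative, but each branch maps to one of the listed behaviors:

````ts
// Sketch of defensive code-fence splitting for streamed markdown.
// Simplified relative to message-item.tsx; the Segment shape is illustrative.
type Segment = { kind: "text" | "code"; content: string; lang?: string };

export function splitFences(streamed: string): Segment[] {
  const segments: Segment[] = [];
  let buffer: string[] = [];
  let inCode = false;
  let lang: string | undefined;

  // Tolerate list markers ("- ", "* ", "1. ") before a fence, and ignore
  // any trailing text after a closing fence.
  const fence = /^\s*(?:[-*]|\d+\.)?\s*```(\w*)/;

  for (const line of streamed.split("\n")) {
    const match = line.match(fence);
    if (match) {
      if (!inCode) {
        segments.push({ kind: "text", content: buffer.join("\n") });
        buffer = [];
        inCode = true;
        lang = match[1] || undefined;
      } else {
        segments.push({ kind: "code", content: buffer.join("\n"), lang });
        buffer = [];
        inCode = false;
      }
      continue;
    }
    buffer.push(line);
  }

  // Missing closing fence: flush whatever is buffered as a code block so a
  // half-streamed response still renders (and copies) as code.
  segments.push(
    inCode
      ? { kind: "code", content: buffer.join("\n"), lang }
      : { kind: "text", content: buffer.join("\n") }
  );
  return segments;
}
````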