feat: phrase memory foundation + semantic search workspace shell (#7) by nikazzio · Pull Request #205 · nikazzio/glossa

nikazzio · 2026-06-02T18:38:04Z

What this PR introduces

Phrase Memory: a semantic memory system for translations. The app remembers phrases that have already been translated, indexes them with vector embeddings, and suggests them automatically during translation to keep terminology consistent across sessions and documents.

The PR collects 4 development plans, each implemented on a dedicated branch and integrated here.

Plan 1: DB foundations + Workspace

sqlite-vec integration via rusqlite for local vector search
DB schema: workspaces, phrase_memory, phrase_memory_presets, source_phrase_embeddings, historical_techniques, technique_tags
Workspace CRUD service + Zustand store + active workspace
4 built-in presets (Moderno, Medievale IT, Latino, Legale)
Workspace as the real boundary for phrase memory and the semantic corpus

Plan 2: Embedding search + Shell gating

TS services: embedding generation, phrase splitting (regex/LLM/none), cosine-based semantic search, save-on-lock
Tauri commands for embedding/search/save of phrase memory
Zustand store for matches and job status
Shell gating: workspace home → editor only when a real project is open
Default workspace created automatically at DB init (no wizard on first launch)
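
The cosine-based ranking mentioned in this plan can be illustrated with a small in-memory sketch. In the PR the search itself runs locally through sqlite-vec, so this is only a picture of the idea, not the actual implementation. `StoredPhrase`, `cosineSimilarity`, and `searchByCosine` are hypothetical names, and `threshold` / `maxResults` are assumed to behave like the preset fields of the same name.

```typescript
// Hypothetical shape of a stored phrase pair; the PR's actual row type may differ.
interface StoredPhrase {
  sourcePhrase: string;
  targetPhrase: string;
  embedding: number[];
}

interface ScoredPhrase extends StoredPhrase {
  score: number; // cosine similarity, in [-1, 1]
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Rank stored phrases against a query embedding, keeping only those at or
// above `threshold`, best match first, capped at `maxResults`.
function searchByCosine(
  query: number[],
  phrases: StoredPhrase[],
  threshold: number,
  maxResults: number,
): ScoredPhrase[] {
  return phrases
    .map((p) => ({ ...p, score: cosineSimilarity(query, p.embedding) }))
    .filter((p) => p.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, maxResults);
}
```

A brute-force scan like this is linear in the number of stored phrases; delegating the distance computation to the database, as the PR does, keeps the vectors out of the JavaScript heap.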

Plan 3: Memoria tab + Prompt injection

"Memoria" tab in the InsightsDrawer: match list per chunk, per-match enable toggle
ExtractTermDialog: automatic suggestion of the term to add to the glossary
Phrase memory injection into the translation prompt (Map-based, race-safe)
Prelaunch warning when chunks have matches that are all disabled
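
As a rough illustration of the prompt injection step, the sketch below builds a text block from the enabled matches of one chunk. It assumes the block is a plain list of source/target pairs; the wording of the header line, the `MemoryMatch` shape, and the name `buildMemoryInjectionSketch` are all hypothetical. The phrases are passed through `JSON.stringify` so that quotes and newlines inside them cannot break the surrounding prompt, which matches the escaping approach named in the commit messages of this PR.

```typescript
// Hypothetical match shape; field names are assumptions for illustration.
interface MemoryMatch {
  id: string;
  sourcePhrase: string;
  targetPhrase: string;
}

// Pure function: the same inputs always produce the same block, with no
// side effects. Returns "" when no match is enabled so callers can skip
// injection entirely.
function buildMemoryInjectionSketch(
  matches: MemoryMatch[],
  enabledMatchIds: Set<string>,
): string {
  const enabled = matches.filter((m) => enabledMatchIds.has(m.id));
  if (enabled.length === 0) return "";
  const lines = enabled.map(
    // JSON.stringify escapes quotes and newlines inside the phrases.
    (m) => `- ${JSON.stringify(m.sourcePhrase)} => ${JSON.stringify(m.targetPhrase)}`,
  );
  return ["Previously approved translations (reuse when applicable):"]
    .concat(lines)
    .join("\n");
}
```

Keeping the builder pure makes it easy to unit test and keeps the decision of where the block goes in the prompt with the caller.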

Plan 4: Preset Management UI + Pipeline Config

PhraseMemoryOverrides + 3 fields on PipelineConfig (usePhraseMemory, phraseMemoryPresetId, phraseMemoryOverrides)
3 columns added to pipelines via idempotent ALTER TABLE
updateCustomPreset + clonePreset on phraseMemoryPresetService
PresetForm component: create/edit custom presets (splitter, threshold, maxResults, minPhraseLength)
PhraseMemoryPresetManager component: lists built-in presets (read-only, clonable) and custom presets (edit/delete)
PhraseMemoryConfig component: toggle + preset dropdown + collapsible "Avanzate" (Advanced) section with per-pipeline overrides
"Phrase Memory" tab in SettingsModal for managing the workspace's presets globally
"Phrase Memory" section at the bottom of the pipeline's Settings tab

Architectural notes

Presets are assets of the active workspace, not global across workspaces
Pipeline overrides take precedence over the selected preset without modifying it
The order of blocks in the system prompt (static → blob → stage-instructions) is unchanged to preserve prefix caching
No new Zustand store: the preset selection travels inside PipelineConfig (already in pipelineStore)
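
The rule that overrides win over the preset without modifying it can be sketched as a small merge function. The four setting names come from the PresetForm fields listed in this description; the `PresetSettings` and `OverridesSketch` types and the function name `resolveSettings` are assumptions made for the example, not the PR's actual `PhraseMemoryOverrides` definition.

```typescript
type Splitter = "regex" | "llm" | "none";

// Assumed settings shape, based on the PresetForm fields named above.
interface PresetSettings {
  splitter: Splitter;
  threshold: number;
  maxResults: number;
  minPhraseLength: number;
}

// Hypothetical override shape: any subset of the preset settings.
type OverridesSketch = Partial<PresetSettings>;

// Returns a new object; `preset` is never mutated. Override keys that are
// missing or explicitly undefined fall back to the preset value.
function resolveSettings(
  preset: PresetSettings,
  overrides?: OverridesSketch | null,
): PresetSettings {
  const o = overrides ?? {};
  return {
    splitter: o.splitter !== undefined ? o.splitter : preset.splitter,
    threshold: o.threshold !== undefined ? o.threshold : preset.threshold,
    maxResults: o.maxResults !== undefined ? o.maxResults : preset.maxResults,
    minPhraseLength:
      o.minPhraseLength !== undefined ? o.minPhraseLength : preset.minPhraseLength,
  };
}
```

A plain object spread would be shorter, but it would let an explicitly `undefined` override erase a preset value, which is why each field is checked individually here.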

Test plan

Out of scope for this PR

Multi-workspace UX: switcher, rename/delete, advanced list
Discovery / ingest / OCR / library-centric workflows
Phrase memory export/import in the workspace backup

Copilot

Pull request overview

This PR lays the groundwork for the “phrase memory” feature by introducing a new SQLite schema (workspaces + presets + embedding-related tables), wiring a first-run Workspace Wizard in the React app, and adding a Rust-side sqlite-vec auto-extension registration with a vec_ping Tauri command.

Changes:

Adds Phrase Memory–related TypeScript domain types and services (workspaces + phrase memory presets), plus initial Zustand workspace store.
Extends DB initialization to create the new tables, add projects.workspace_id, seed built-in presets, and store active_workspace_id in app_settings.
Adds a first-run WorkspaceWizard gate in App.tsx and introduces Rust sqlite-vec integration + vec_ping command.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 25 comments.

File	Description
src/types.ts	Adds Phrase Memory types (workspaces, presets, config enums).
src/stores/workspaceStore.ts	New Zustand store for workspace list/active workspace + loading state.
src/services/workspaceService.ts	New workspace CRUD + active-workspace setting helpers.
src/services/workspaceService.test.ts	Unit tests for `workspaceService`.
src/services/phraseMemoryPresetService.ts	Built-in preset seeding + preset CRUD/listing.
src/services/phraseMemoryPresetService.test.ts	Unit tests for preset service.
src/services/dbService.ts	Adds Phrase Memory schema creation, `projects.workspace_id` migration, `active_workspace_id` seeding, and preset seeding.
src/components/workspace/WorkspaceWizard.tsx	First-run wizard UI to create a workspace and select embedding model.
src/App.tsx	Adds workspace guard/wizard gating.
src-tauri/src/vector/mod.rs	Registers sqlite-vec auto-extension and adds `vec_ping` command.
src-tauri/src/lib.rs	Wires vector module init + exposes `vec_ping` command.
src-tauri/Cargo.toml	Adds `rusqlite` (bundled) and `sqlite-vec` dependencies.

…tabase (#7)

…olons, specific catch, workspaceStore try/finally (#7)

- vec_upsert_source_phrase: workspace_id → project_id (schema mismatch)
- vec_save_locked_phrases: add project_id, source_language, target_language; fix DELETE to use project_id; fix saved counter via rows-changed
- vec_search_phrase_memory: CTE removes the duplicated vec_distance_cosine call; propagates row errors instead of filter_map(ok)
- split_phrases_llm: adds an HTTP status check, as in get_embeddings
- phraseMemoryService: pre-filters phrases by minPhraseLength before embedding; adds projectId, sourceLanguage, targetLanguage to SaveLockedPhrasesOptions
- DocumentView: passes projectId and languages to saveLockedPhrases
- workspaceStore: auto-selects the first workspace if active_id is not found
- App: handles promise rejection from loadWorkspaces
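
The pre-filter by minPhraseLength described in this commit can be pictured as follows: split first, then discard short phrases before any embedding request is made. This is a minimal sketch, assuming a punctuation-based regex splitter and a length measured in characters; `splitPhrasesRegex` and `prefilterForEmbedding` are hypothetical names, and the punctuation set is chosen only for illustration.

```typescript
// Split text into candidate phrases at sentence-ending punctuation.
// The punctuation set (. ! ? ;) is an assumption for illustration.
function splitPhrasesRegex(text: string): string[] {
  const parts: string[] = text.match(/[^.!?;]+[.!?;]*/g) ?? [];
  return parts.map((p) => p.trim()).filter((p) => p.length > 0);
}

// Drop phrases shorter than minPhraseLength (measured here in characters)
// so that no embedding request is spent on them.
function prefilterForEmbedding(phrases: string[], minPhraseLength: number): string[] {
  return phrases.filter((p) => p.length >= minPhraseLength);
}

// Example: the short "Amen." is filtered out before embedding.
const candidatePhrases = prefilterForEmbedding(
  splitPhrasesRegex("Incipit liber primus. Amen. Explicit feliciter!"),
  6,
);
```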

…ing (#206)

- phraseMemoryStore: new PhraseMemoryMatch/ChunkPhraseMatches types, Map-based state, toggleMatchEnabled, setEnabledMatchIds, distance→score conversion
- uiStore: adds 'memory' to ChunkDrawerTab
- usePhraseMemoryMatches: hook for matches + per-chunk selection
- buildMemoryInjection: pure function for the stage-instructions block
- checkAllChunksHaveEnabledMatches: pipeline pre-launch check
- MemoryTab: match list with checkboxes, Apply (clipboard), Rework
- ExtractTermDialog: LLM term-suggestion dialog + glossary insertion
- InsightsDrawer: Memoria tab, match badge in the IndexTab
- usePipeline: rerunChunkWithMemory (temporary stage-prompt injection), pre-launch warning when all matches are disabled
- glossaryService: addGlossaryEntry wrapper
- llmService: extractTermFromPhrase stub (TODO: Plan 4 Tauri command)
- i18n: memory.*, document.insightsTabMemory, glossary.*, common.optional keys

Co-authored-by: nikazzio <nikazzio@users.noreply.github.com>
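
Two of the store pieces named in this commit, the distance→score conversion and the Map-based toggle, can be sketched in isolation. The formula below assumes the stored distance is a cosine distance (1 minus cosine similarity); the PR's actual conversion is not shown on this page. `EnabledByChunk` is a hypothetical simplification of the store state, and the function bodies are illustrative rather than the store's real code.

```typescript
// Convert a cosine distance (assumed to be 1 - cosine similarity) into a
// score clamped to [0, 1], where higher means a closer match.
function distanceToScore(distance: number): number {
  return Math.min(1, Math.max(0, 1 - distance));
}

// Hypothetical per-chunk state: the match IDs currently enabled for injection.
type EnabledByChunk = Map<string, Set<string>>;

// Toggle one match for one chunk. A new Map and a new Set are returned so
// that reference-equality checks (such as store selectors) see the change
// and the previous state stays untouched.
function toggleMatchEnabled(
  state: EnabledByChunk,
  chunkId: string,
  matchId: string,
): EnabledByChunk {
  const next = new Map(state);
  const ids = new Set<string>(state.get(chunkId) ?? []);
  if (ids.has(matchId)) {
    ids.delete(matchId);
  } else {
    ids.add(matchId);
  }
  next.set(chunkId, ids);
  return next;
}
```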

…#228)

* fix(phrase-memory): address Plan 3 PR review feedback

- ExtractTermDialog: granular Zustand selectors instead of full config + fix generateId prefix 'gle'
- usePipeline: restore original prompts by stage ID after phrase memory injection (Map-based, race-safe)
- glossaryService: direct SQL INSERT/ON CONFLICT instead of upsertGlossaryEntries
- phraseMemoryInjection: JSON.stringify source/target phrases for correct escaping
- phraseMemoryStore: new Set(ids) instead of raw array in enabledMatchIds
- InsightsDrawer: remove unused matchesByChunk selector, use i18n key for match badge
- i18n: add memory.matchBadge key (en + it)

* feat(phrase-memory): Plan 4: Preset Management UI + Pipeline Config

Types:
- Add PhraseMemoryOverrides to types.ts
- Extend PipelineConfig with usePhraseMemory, phraseMemoryPresetId, phraseMemoryOverrides

DB:
- ALTER TABLE pipelines: add use_phrase_memory, phrase_memory_preset_id, phrase_memory_overrides columns

Services:
- phraseMemoryPresetService: add updateCustomPreset + clonePreset
- pipelineService: DbPipeline + rowToPipelineConfig + savePipelineConfig + saveFullState + duplicatePipeline now persist phrase memory fields
- pipelineService.test: phrase memory persistence tests

Components:
- PresetForm: form to create/edit custom presets (splitter, threshold, maxResults, minPhraseLength)
- PhraseMemoryPresetManager: list of built-in/custom presets with clone/edit/delete
- PhraseMemoryConfig: pipeline section with toggle, preset dropdown, collapsible advanced settings

Integration:
- SettingsModal: "Phrase Memory" tab with PhraseMemoryPresetManager
- SettingsTabPanel: PhraseMemoryConfig at the bottom of the Settings tab
- PipelineConfig: passes phrase memory props to SettingsTabPanel

* fix(phrase-memory): address PR #228 review comments

- PhraseMemoryConfig: sync phraseMemoryPresetId to presets[0] when toggle enabled with null presetId
- SettingsModal: move hardcoded strings to i18n (phraseMemoryTab, phraseMemoryPresetsTitle, phraseMemoryPresetsHint)
- en.json/it.json: add matchBadge_one/matchBadge_other plural keys
- en.json/it.json: add settings.phraseMemory* i18n keys

---------

Co-authored-by: nikazzio <nikazzio@users.noreply.github.com>
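
The idea of restoring original prompts by stage ID after a temporary injection can be sketched as below. This is one possible reading of "Map-based, race-safe": the originals are snapshotted into a Map keyed by stage ID, and the restore step is applied to whatever stage list exists when the run finishes, so stages added in the meantime are preserved. `PromptStage`, `withInjectedPrompts`, and the getter/setter parameters are hypothetical and stand in for the real store access.

```typescript
// Hypothetical stage shape; the real pipeline stage type has more fields.
interface PromptStage {
  id: string;
  prompt: string;
}

// Temporarily append `memoryBlock` to every stage prompt, run `run`, then
// restore the original prompts by stage ID, even if `run` throws.
async function withInjectedPrompts<T>(
  getStages: () => PromptStage[],
  setStages: (stages: PromptStage[]) => void,
  memoryBlock: string,
  run: () => Promise<T>,
): Promise<T> {
  // Snapshot the original prompt of every stage, keyed by stage ID.
  const originals = new Map<string, string>();
  getStages().forEach((s) => originals.set(s.id, s.prompt));

  setStages(
    getStages().map((s) => ({ ...s, prompt: `${s.prompt}\n\n${memoryBlock}` })),
  );
  try {
    return await run();
  } finally {
    // Restore against the current list: stages that were not in the
    // snapshot (added during the run) are left as they are.
    setStages(
      getStages().map((s) => {
        const original = originals.get(s.id);
        return original === undefined ? s : { ...s, prompt: original };
      }),
    );
  }
}
```

Restoring by ID rather than by array position is what keeps the restore correct when the stage list is reordered or extended while the run is in flight.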

Copilot

Pull request overview

Copilot reviewed 58 out of 59 changed files in this pull request and generated 8 comments.

+        let rows = conn
+            .execute(
+                "INSERT OR IGNORE INTO phrase_memory \
+                 (id, workspace_id, source_phrase, target_phrase, \
+                  source_language, target_language, embedding, created_at) \
+                 VALUES (lower(hex(randomblob(16))), ?1, ?2, ?3, ?4, ?5, ?6, datetime('now'))",
+                rusqlite::params![
+                    workspace_id,
+                    pair.source_phrase,
+                    pair.target_phrase,
+                    source_language,
+                    target_language,
+                    floats_to_blob(&pair.source_embedding)
+                ],
+            )


+  try {
+    await conn.execute(
+      `ALTER TABLE projects ADD COLUMN workspace_id TEXT REFERENCES workspaces(id)`
+    );
+  } catch (err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    if (!msg.includes('duplicate column') && !msg.includes('already exists')) throw err;
+  }
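
The same tolerate-if-already-there pattern could be wrapped in a reusable helper, which may be convenient when several columns are added this way. The sketch below assumes only that the connection exposes an async `execute(sql)` method, as in the excerpt; `SqlExecutor` and `addColumnIfMissing` are hypothetical names, and the table and column definition must be trusted constants because they are interpolated into the SQL string.

```typescript
// Minimal connection shape assumed for this sketch.
interface SqlExecutor {
  execute(sql: string): Promise<unknown>;
}

// Adds a column and reports whether it was actually added. A "duplicate
// column" / "already exists" failure is treated as success-without-change;
// every other error is rethrown.
// `table` and `columnDef` must be trusted constants, never user input.
async function addColumnIfMissing(
  conn: SqlExecutor,
  table: string,
  columnDef: string,
): Promise<boolean> {
  try {
    await conn.execute(`ALTER TABLE ${table} ADD COLUMN ${columnDef}`);
    return true;
  } catch (err) {
    const msg = err instanceof Error ? err.message : String(err);
    if (msg.includes("duplicate column") || msg.includes("already exists")) {
      return false;
    }
    throw err;
  }
}
```

Matching on error message text is fragile across drivers; an alternative is to inspect the table's column list (for example via `PRAGMA table_info`) before issuing the ALTER TABLE.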


+export function checkAllChunksHaveEnabledMatches(
+  matchesByChunk: Map<string, ChunkPhraseMatches>,
+): string[] {
+  const blocked: string[] = [];
+  for (const [chunkId, data] of matchesByChunk) {
+    if (data.matches.length > 0 && data.enabledMatchIds.size === 0) {
+      blocked.push(chunkId);
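
The excerpt above is cut off by the review view. A self-contained sketch of the same check, assuming it simply returns the collected chunk IDs, might look like this. `ChunkPhraseMatchesSketch` is a minimal stand-in inferred from the two fields the excerpt reads, and `findBlockedChunks` is a hypothetical name used to avoid implying this is the PR's exact code.

```typescript
// Assumed minimal shape, inferred from the fields the excerpt reads.
interface ChunkPhraseMatchesSketch {
  matches: { id: string }[];
  enabledMatchIds: Set<string>;
}

// Returns the IDs of chunks that have at least one match but none enabled.
// Chunks with no matches at all are not considered blocked.
function findBlockedChunks(
  matchesByChunk: Map<string, ChunkPhraseMatchesSketch>,
): string[] {
  const blocked: string[] = [];
  matchesByChunk.forEach((data, chunkId) => {
    if (data.matches.length > 0 && data.enabledMatchIds.size === 0) {
      blocked.push(chunkId);
    }
  });
  return blocked;
}
```

A caller could show the pre-launch warning whenever the returned array is non-empty and list the affected chunks.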


…pread, embedding prompt

- backupService: add use_phrase_memory/preset_id/overrides to pipelines whitelist
- en.json / it.json: remove duplicate phraseMemory* keys in settings section
- PhraseMemoryConfig: fix spread on nullable overrides (overrides ?? {})
- embedding.rs split_phrases_llm: align prompt with json_object response_format ({"phrases":[...]})

…ection, workspace scoping

Audit PR #205 fixes (issues 1-5):
- fix: runSingleChunk/executePipelineForChunk/runJudgeForChunk read config from store at invocation time, so the memory patch in rerunChunkWithMemory now reaches the prompt
- fix: handleLockToggle guards use_phrase_memory flag; resolves splitter/minPhraseLength from active preset + overrides instead of hardcoded values
- fix: phrase_memory_presets workspace-scoped via ALTER TABLE migration; listPresets/createCustomPreset/deleteCustomPreset/updateCustomPreset/clonePreset all require workspaceId; PhraseMemoryPresetManager + PhraseMemoryConfig updated
- fix: split_phrases_llm parser simplified: removes unreachable as_array/sentences branches since response_format:json_object guarantees an object

Phrase memory flow wiring (issue 3):
- feat: searchPhraseMemoryBatch: one fetchEmbeddings API call for all chunks, then N local SQLite vec_search_phrase_memory queries; no N×API overhead
- feat: saveAllCompletedPhrases: bulk variant of saveLockedPhrases with progress cb
- feat: usePhraseMemoryAutoSearch: background hook triggered on project open and when use_phrase_memory toggled; populates matchesByChunk store automatically
- feat: useSaveToMemory: explicit bulk "save to memory" action with progress; filters translationLocked/completed chunks; resolves preset+overrides config
- feat: executePipelineForChunk accepts memoryBlock param; getChunkMemoryBlock reads enabled matches from store; runPipeline/runSingleChunk/runDryRun inject automatically
- feat: MemoryTab shows "Salva in memoria" button (cold start + footer); progress inline
- feat: handleLockToggle triggers runSearch after saveLockedPhrases for incremental refresh
- i18n: saveToMemoryButton/savedToMemory/saveToMemoryFailed keys (it + en)
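
The batching described for searchPhraseMemoryBatch (one embeddings request for all chunks, followed by one local query per chunk) can be sketched with the two operations injected as functions. The names `searchBatchSketch`, `ChunkInput`, and `LocalMatch`, and the signatures of `fetchEmbeddings` and `searchLocal` as written here, are assumptions for the example rather than the PR's actual signatures.

```typescript
// Hypothetical shapes for illustration.
interface ChunkInput {
  id: string;
  text: string;
}

interface LocalMatch {
  phraseId: string;
  distance: number;
}

// One remote embedding call for all chunks, then one local search per chunk.
async function searchBatchSketch(
  chunks: ChunkInput[],
  fetchEmbeddings: (texts: string[]) => Promise<number[][]>,
  searchLocal: (embedding: number[]) => Promise<LocalMatch[]>,
): Promise<Map<string, LocalMatch[]>> {
  const results = new Map<string, LocalMatch[]>();
  if (chunks.length === 0) return results; // nothing to embed, skip the API call

  // Single remote request covering every chunk text.
  const embeddings = await fetchEmbeddings(chunks.map((c) => c.text));
  if (embeddings.length !== chunks.length) {
    throw new Error("embedding count does not match chunk count");
  }

  // N local lookups; these stay on the machine and add no API overhead.
  for (let i = 0; i < chunks.length; i++) {
    results.set(chunks[i].id, await searchLocal(embeddings[i]));
  }
  return results;
}
```

The local lookups are awaited sequentially here for simplicity; whether running them concurrently helps depends on how the database connection serializes queries.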

…e-memory pipeline fixes

Dashboard redesign:
- Replace flat area cards with unified tab panel (Traduzioni/Biblioteca/Trascrizioni)
- Active tab shares background with content below, for visual continuity
- 2px accent bottom indicator on active tab; disabled tabs at 40% opacity
- WorkspaceSettingsModal: new modal with 3 tabs (Generale, Phrase Memory, Backup) mirroring SettingsModal pattern (EditorialModalShell + AnimatePresence + useFocusTrap)
- SettingsModal: rename "Impostazioni" tab → "Traduzioni", add disabled Library/Transcriptions tabs, remove Backup section (moved to WorkspaceSettingsModal)
- Apply UI polish: text-wrap balance/pretty, tabular-nums on metrics, concentric border-radius (28→20→16px), transition-colors duration-150 throughout

Phrase Memory pipeline:
- Fix search pipeline lifecycle and workspace scoping
- Fix memory injection into prompt context
- Fix embedding search + workspace shell gating
- Add usePhraseMemoryAutoSearch and useSaveToMemory hooks with tests
- Add phraseMemoryService and workspaceService tests

i18n: add workspace.settings.{eyebrow,generalTab,memoryTab,backupTab} (en + it)

docs: update ARCHITECTURE.md with workspace/phrase-memory store and component map

…sivo

nikazzio added 8 commits June 2, 2026 20:23

feat: sqlite-vec spike + WAL mode (#7)

6d4026e

feat: add workspace, phrase_memory, historical_techniques schema (#7)

96c172b

feat: Workspace + PhraseMemoryPreset types (#7)

d11df27

feat: workspace service CRUD + active workspace (#7)

f809750

feat: workspace Zustand store (#7)

4f164dd

feat: phrase memory preset service + 4 built-in presets (#7)

cca4f43

feat: workspace creation wizard UI (#7)

6d0be30

feat: workspace guard, wizard on first launch (#7)

6b7521a

nikazzio requested a review from Copilot June 2, 2026 18:39

Copilot started reviewing on behalf of nikazzio June 2, 2026 18:40

docs: add branching strategy to phrase memory plans 2-3-4

2ea86c7

Copilot AI reviewed Jun 2, 2026

nikazzio added 4 commits June 2, 2026 20:54

fix: export getDb from dbService (#7)

a060a56

fix: remove workspace guard/wizard, create default workspace in initDa…

8e9eaec

…tabase (#7)

fix: address PR review — use exported helpers, $1 placeholders, semic…

3efa7e1

…olons, specific catch, workspaceStore try/finally (#7)

test(dbService): add migration coverage for phrase memory schema

0c4d51d

feat(phrase-memory): Piano 2 — embedding search + workspace shell gat…

7333287

…ing (#206)

nikazzio changed the title ~~feat: phrase memory — piano 1: DB foundation + workspace (#7)~~ feat: phrase memory foundation + semantic search workspace shell (#7) Jun 2, 2026

nikazzio and others added 2 commits June 3, 2026 09:56

nikazzio requested a review from Copilot June 3, 2026 08:48

Copilot started reviewing on behalf of nikazzio June 3, 2026 08:48

Copilot AI reviewed Jun 3, 2026

nikazzio added 4 commits June 3, 2026 11:09

docs: navigation redesign spec — workspace-as-libreria + editor immer…

be0551d

…sivo

nikazzio wants to merge 20 commits into main from feat/phrase-memory

2 participants
