59 changes: 59 additions & 0 deletions .cursor/rules/notebook-schema-types.mdc
@@ -0,0 +1,59 @@
---
description: Notebook schema types — which type family to use, and where the migration-ladder boundary lives.
globs:
alwaysApply: true
---

# Notebook schema types: layer boundary

The on-disk shape of `.codex` and `.source` notebooks is owned by
[src/projectManager/utils/schema/](mdc:src/projectManager/utils/schema/index.ts) and
brought up to `CURRENT_SCHEMA_VERSION` by `bringNotebookToCurrent()`. There
are **two type families**, and the migration ladder is the only function
that turns one into the other.

## Pre-ladder — `SchemaNotebook` and friends

Use the structural types from [src/projectManager/utils/schema/index.ts](mdc:src/projectManager/utils/schema/index.ts):
`SchemaNotebook`, `SchemaCell`, `SchemaCellMetadata`, `SchemaEdit`.

They describe **anything that might be on disk** (v0/v1/legacy
shapes), so they're permissive: optional fields, `unknown` values,
`[key: string]: unknown` index signatures. Use them **only** when:

- Writing a migration step in [src/projectManager/utils/schema/migrations/](mdc:src/projectManager/utils/schema/migrations/).
- Parsing raw JSON immediately before calling `bringNotebookToCurrent`.
- Synthesizing "old-shaped" fixtures in tests.

The merge resolver is the canonical example:

```typescript
const ourNotebook: SchemaNotebook = JSON.parse(ourContent);
await bringNotebookToCurrent(ourNotebook, { author });
// After the ladder returns, narrow to canonical types:
const ourCells = (ourNotebook.cells ?? []) as unknown as CustomNotebookCellData[];
```
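
As a rough illustration of why the pre-ladder types are permissive and why the narrowing cast goes through `unknown`, here is a minimal, self-contained sketch. The `Sketch*` interfaces and both helper functions are hypothetical stand-ins, not the real definitions in `schema/index.ts` or `types/index.d.ts`, and the field names are assumptions.

```typescript
// Hypothetical pre-ladder shapes: every field optional, values `unknown`,
// plus an index signature so unrecognized legacy keys still type-check.
interface SketchSchemaCell {
    value?: unknown;
    metadata?: { id?: unknown; [key: string]: unknown };
    [key: string]: unknown;
}

interface SketchSchemaNotebook {
    cells?: SketchSchemaCell[];
    metadata?: { schemaVersion?: unknown; [key: string]: unknown };
    [key: string]: unknown;
}

// Hypothetical canonical shape: stricter, with required fields.
interface SketchCanonicalCell {
    value: string;
    metadata: { id: string };
}

// Parse raw JSON into the permissive family; only the outermost shape is checked.
function parsePreLadder(raw: string): SketchSchemaNotebook {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
        throw new Error("Notebook JSON must be an object");
    }
    return parsed as SketchSchemaNotebook;
}

// Narrow to the canonical family. This is only sound once a migration ladder
// has already normalized the notebook; the cast itself validates nothing.
function narrowCells(notebook: SketchSchemaNotebook): SketchCanonicalCell[] {
    return (notebook.cells ?? []) as unknown as SketchCanonicalCell[];
}
```

A direct `as SketchCanonicalCell[]` would be rejected by the compiler because the two shapes do not overlap enough, which is why the cast is routed through `unknown`.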

## Post-ladder / runtime — canonical types

Everywhere else — editor at runtime, webviews, serializer downstream of
save, `CodexCellDocument`, message handlers, indexers, exporters — use
the canonical types from [types/index.d.ts](mdc:types/index.d.ts):
`CodexNotebookAsJSONData`, `CustomNotebookCellData`,
`CustomNotebookMetadata`, `EditHistory`.

If you find yourself reaching for `SchemaNotebook` outside the migration
boundary, you almost certainly want a canonical type instead. Webviews
never see un-normalized data; the extension host is the boundary.

Don't re-export the schema types from `types/index.d.ts` — keeping them
co-located in `schema/` keeps the boundary obvious.

## Adding a new schema version

See [src/projectManager/utils/schema/README.md](mdc:src/projectManager/utils/schema/README.md)
for the full recipe. Short version: drop `v<N>_to_v<N+1>.ts` into
`schema/migrations/`, register it in the `migrations` map in
`schema/index.ts`, bump `CURRENT_SCHEMA_VERSION`, and extend
[src/test/suite/schemaLadder.test.ts](mdc:src/test/suite/schemaLadder.test.ts).
The ladder runs automatically at activation, save, merge, and post-sync.
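
The recipe above can be sketched as a small version-keyed ladder. This is a hedged illustration only: `SketchNotebook`, `sketchMigrations`, `SKETCH_CURRENT_VERSION`, and `sketchBringToCurrent` are hypothetical names, the return shape is an assumption, and the real `bringNotebookToCurrent` may differ in how it uses `author`, validates shapes, or handles newer-than-client files. The single step shown mirrors the legacy `cellValue` to `value`/`editMap` transform that this change moves out of the merge resolver.

```typescript
type SketchNotebook = {
    cells?: Array<{ metadata?: { edits?: Array<Record<string, unknown>> } }>;
    metadata?: { schemaVersion?: number; [key: string]: unknown };
};

type SketchMigration = (notebook: SketchNotebook, ctx: { author: string }) => void;

// Assumption for this sketch: version 1 is "current".
const SKETCH_CURRENT_VERSION = 1;

// Key N holds the step that takes a notebook from version N to N + 1.
const sketchMigrations: Partial<Record<number, SketchMigration>> = {
    0: (notebook) => {
        for (const cell of notebook.cells ?? []) {
            for (const edit of cell.metadata?.edits ?? []) {
                // Old-format edit: has cellValue but no editMap.
                if (edit.cellValue !== undefined && !edit.editMap) {
                    edit.value = edit.cellValue;
                    edit.editMap = ["value"];
                    delete edit.cellValue;
                }
            }
        }
    },
};

// Mutates the notebook in place, one step at a time, until it reaches current.
function sketchBringToCurrent(
    notebook: SketchNotebook,
    ctx: { author: string }
): { migrated: boolean; aheadOfClient: boolean } {
    const metadata = notebook.metadata ?? {};
    notebook.metadata = metadata;
    // A missing version is treated as v0 in this sketch.
    let version = typeof metadata.schemaVersion === "number" ? metadata.schemaVersion : 0;

    if (version > SKETCH_CURRENT_VERSION) {
        // Written by a newer client: leave untouched (assumed behavior).
        return { migrated: false, aheadOfClient: true };
    }

    let migrated = false;
    while (version < SKETCH_CURRENT_VERSION) {
        const step = sketchMigrations[version];
        if (!step) throw new Error(`No migration registered for v${version}`);
        step(notebook, ctx);
        version += 1;
        metadata.schemaVersion = version;
        migrated = true;
    }
    return { migrated, aheadOfClient: false };
}
```

Under this layout, adding a version means registering one more keyed step and bumping the constant; notebooks already at the current version fall through the loop without changes.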
2 changes: 2 additions & 0 deletions src/extension.ts
@@ -17,6 +17,7 @@ import {
migration_verseRangeLabelsAndPositions,
migration_cellIdsToUuid,
migration_recoverTempFilesAndMergeDuplicates,
migration_normalizeAllNotebooksToCurrentSchema,
} from "./projectManager/utils/migrationUtils";
import { createIndexWithContext } from "./activationHelpers/contextAware/contentIndexes/indexes";
import { StatusBarItem } from "vscode";
@@ -899,6 +900,7 @@ export async function activate(context: vscode.ExtensionContext) {
await migration_addGlobalReferences(context);
await migration_cellIdsToUuid(context);
await migration_recoverTempFilesAndMergeDuplicates(context);
await migration_normalizeAllNotebooksToCurrentSchema(context);
}

// Remove leftover files from features that have been removed
40 changes: 40 additions & 0 deletions src/projectManager/syncManager.ts
@@ -12,6 +12,7 @@ import { checkRemoteUpdatingRequired } from "../utils/remoteUpdatingManager";
import { markPendingUpdateRequired, clearPendingUpdate, readLocalProjectSettings } from "../utils/localProjectSettings";
import { isDatabaseReady } from "../utils/sqliteDatabaseFactory";
import { isOnline } from "../utils/connectivityChecker";
import { bringNotebookToCurrentForFile } from "./utils/schema/file";

const DEBUG_SYNC_MANAGER = false;

@@ -1226,6 +1227,45 @@ export class SyncManager {
this.currentSyncStage = "Finishing up...";
this.notifySyncStatusListeners();

// Schema normalization: bring any .codex/.source files that arrived in this
// sync up to CURRENT_SCHEMA_VERSION before the rest of the post-sync helpers
// (index rebuild, webview refresh) read them. This handles the clean
// fast-forward case where files came down without going through
// resolveCodexCustomMerge — the merge resolver path already calls the
// ladder, so this only does work when there were no conflicts.
try {
if (workspaceFolders && workspaceFolders.length > 0) {
const wsRoot = workspaceFolders[0].uri;
const touched = new Set<string>([
...syncResult.changedFiles,
...syncResult.newFiles,
]);
const notebookPaths = Array.from(touched).filter(
(p) => p.endsWith(".codex") || p.endsWith(".source")
);
if (notebookPaths.length > 0) {
let author = "anonymous";
try {
const authApi = await getAuthApi();
const userInfo = await authApi?.getUserInfo();
if (userInfo?.username) author = userInfo.username;
} catch (_) { /* ignore */ }

let migrated = 0;
for (const relPath of notebookPaths) {
const uri = vscode.Uri.joinPath(wsRoot, relPath);
const result = await bringNotebookToCurrentForFile(uri, { author });
if (result.migrated) migrated++;
}
if (migrated > 0) {
debug(`Schema-normalized ${migrated}/${notebookPaths.length} synced notebook(s)`);
}
}
}
} catch (error) {
console.error("[SyncManager] Error during post-sync schema normalization:", error);
}

// Check if comments.json was affected by the sync - if so, run targeted repair
const commentsWasChanged = syncResult.changedFiles.includes('.project/comments.json') ||
syncResult.newFiles.includes('.project/comments.json') ||
121 changes: 33 additions & 88 deletions src/projectManager/utils/merge/resolvers.ts
@@ -11,12 +11,12 @@ import { normalizeProjectSwapInfo } from "../../../utils/projectSwapManager";
import { ProjectSwapInfo, ProjectSwapEntry, ProjectSwapUserEntry, RemoteUpdatingEntry } from "../../../../types";
import { NotebookCommentThread, NotebookComment, CustomNotebookCellData, CustomNotebookMetadata } from "../../../../types";
import { CommentsMigrator } from "../../../utils/commentsMigrationUtils";
import { CodexCell } from "@/utils/codexNotebookUtils";
import { CodexCellTypes, EditType } from "../../../../types/enums";
import { EditHistory, ValidationEntry, FileEditHistory, ProjectEditHistory, ProjectUserVersionEntry } from "../../../../types/index.d";
import { EditMapUtils, deduplicateFileMetadataEdits } from "../../../utils/editMapUtils";
import { normalizeAttachmentUrl } from "@/utils/pathUtils";
import { formatJsonForNotebookFile } from "../../../utils/notebookFileFormattingUtils";
import { bringNotebookToCurrent, CURRENT_SCHEMA_VERSION, SchemaNotebook } from "../schema";
import { ORPHANED_PROJECT_FILES } from "../../../utils/fileUtils";
import {
buildCellPositionContextMap,
@@ -632,31 +632,6 @@ export async function resolveConflictFile(
}
}

/**
* Helper function to check if content contains old format edits that need migration
*/
function needsEditHistoryMigration(content: string): boolean {
try {
const notebook = JSON.parse(content);
const cells: CodexCell[] = notebook.cells || [];

for (const cell of cells) {
if (cell.metadata?.edits && cell.metadata.edits.length > 0) {
for (const edit of cell.metadata.edits) {
// Check if this is an old format edit (has cellValue but no editMap)
if ((edit as any).cellValue !== undefined && !edit.editMap) {
return true;
}
}
}
}
return false;
} catch (error) {
debugLog("Error checking for migration need:", error);
return false;
}
}

/**
* Helper function to resolve metadata conflicts using edit history
* This function determines the latest edit for each metadata field and applies it
@@ -983,44 +958,6 @@ function applyEditToCell(cell: CustomNotebookCellData, edit: EditHistory): void
}
}

/**
* Helper function to migrate old format edits to new format in-place
*/
function migrateEditHistoryInContent(content: string): string {
try {
const notebook = JSON.parse(content);
const cells: CodexCell[] = notebook.cells || [];
let hasChanges = false;

for (const cell of cells) {
if (cell.metadata?.edits && cell.metadata.edits.length > 0) {
for (const edit of cell.metadata.edits as any) {
// Check if this is an old format edit (has cellValue but no editMap)
if (edit.cellValue !== undefined && !edit.editMap) {
// Migrate old format to new format
edit.value = edit.cellValue; // Move cellValue to value
edit.editMap = ["value"]; // Set editMap to point to value
delete edit.cellValue; // Remove old property
hasChanges = true;

debugLog(`Migrated edit in cell ${cell.metadata.id}: converted cellValue to value with editMap`);
}
}
}
}

if (hasChanges) {
debugLog("Edit history migration completed for content");
return JSON.stringify(notebook, null, 2);
}

return content;
} catch (error) {
debugLog("Error migrating edit history in content:", error);
return content;
}
}

function mergeTwoCellsUsingResolverLogic(
ourCell: CustomNotebookCellData,
theirCell: CustomNotebookCellData
@@ -1134,7 +1071,6 @@ export async function resolveCodexCustomMerge(
debugLog({ ourContent: ourContent.slice(0, 1000), theirContent: theirContent.slice(0, 1000) });
debugLog("Starting resolveCodexCustomMerge");

// Check if content needs migration and migrate if necessary
if (!ourContent) {
debugLog("No our content, returning their content");
return theirContent;
@@ -1144,32 +1080,36 @@ export async function resolveCodexCustomMerge(
return ourContent;
}

// Migrate content if needed
let migratedOurContent = ourContent;
let migratedTheirContent = theirContent;

const ourNeedsMigration = needsEditHistoryMigration(ourContent);
const theirNeedsMigration = needsEditHistoryMigration(theirContent);

if (ourNeedsMigration) {
debugLog("Migrating our content edit history format");
migratedOurContent = migrateEditHistoryInContent(ourContent);
}

if (theirNeedsMigration) {
debugLog("Migrating their content edit history format");
migratedTheirContent = migrateEditHistoryInContent(theirContent);
}

debugLog("Parsing notebook content");
const ourNotebook = JSON.parse(migratedOurContent);
const theirNotebook = JSON.parse(migratedTheirContent);
const ourCells: CustomNotebookCellData[] = ourNotebook.cells;
const theirCells: CustomNotebookCellData[] = theirNotebook.cells;
// Pre-ladder: shape is "anything that might be on disk" (could be v0/v1/legacy).
// We type as SchemaNotebook here so the migration ladder sees the structural
// superset; once `bringNotebookToCurrent` returns we know the notebook is at
// CURRENT_SCHEMA_VERSION and we narrow to the canonical types for the merge.
const ourNotebook: SchemaNotebook = JSON.parse(ourContent);
const theirNotebook: SchemaNotebook = JSON.parse(theirContent);

// Bring both sides up to CURRENT_SCHEMA_VERSION before merging so the merge
// logic only ever sees one shape. Files already at the current version
// short-circuit to a no-op. The schema ladder folds in the legacy
// cellValue → value/editMap transform that used to live as a one-shot
// helper here; future schema bumps will append further steps.
const mergeAuthorForLadder = process.env.CODEX_MERGE_USER || await getCurrentUserName();
await bringNotebookToCurrent(ourNotebook, { author: mergeAuthorForLadder });
await bringNotebookToCurrent(theirNotebook, { author: mergeAuthorForLadder });

// Post-ladder: the notebooks now match CURRENT_SCHEMA_VERSION, so the rest of
// the merge can read them through the canonical types. The casts are honest —
// the ladder runtime-validates the shape; TypeScript just doesn't see that.
// We route through `unknown` because the canonical types are stricter than
// the structural SchemaNotebook (e.g. CustomNotebookMetadata requires `id`).
const ourCells = (ourNotebook.cells ?? []) as unknown as CustomNotebookCellData[];
const theirCells = (theirNotebook.cells ?? []) as unknown as CustomNotebookCellData[];

// Extract and merge file-level metadata
const ourMetadata: CustomNotebookMetadata = ourNotebook.metadata || {};
const theirMetadata: CustomNotebookMetadata = theirNotebook.metadata || {};
const ourMetadata: CustomNotebookMetadata =
(ourNotebook.metadata as unknown as CustomNotebookMetadata) || ({} as CustomNotebookMetadata);
const theirMetadata: CustomNotebookMetadata =
(theirNotebook.metadata as unknown as CustomNotebookMetadata) || ({} as CustomNotebookMetadata);

// Initialize edits arrays if they don't exist
if (!ourMetadata.edits) {
@@ -1282,6 +1222,11 @@ export async function resolveCodexCustomMerge(
}
}

// Stamp the merged notebook at the current schema version. Both sides were
// brought to current at the top of this function, so the merged output is
// guaranteed to be at current too.
mergedMetadata.schemaVersion = CURRENT_SCHEMA_VERSION;

// Return the full notebook structure with merged cells and metadata
// (formatted consistently for `.codex`/`.source` file writes)
return formatJsonForNotebookFile(
69 changes: 69 additions & 0 deletions src/projectManager/utils/migrationUtils.ts
@@ -21,6 +21,7 @@ import bibleData from "../../../webviews/codex-webviews/src/assets/bible-books-l
import { resolveCodexCustomMerge, mergeDuplicateCellsUsingResolverLogic } from "./merge/resolvers";
import { atomicWriteUriText } from "../../utils/notebookSafeSaveUtils";
import { normalizeNotebookFileText, formatJsonForNotebookFile } from "../../utils/notebookFileFormattingUtils";
import { bringNotebookToCurrentForFile } from "./schema/file";

// FIXME: move notebook format migration here

@@ -4087,3 +4088,71 @@ export const migration_recoverTempFilesAndMergeDuplicates = async (context?: vsc
console.error("Error running temp files recovery and duplicate merge migration:", error);
}
};

/**
* Activation-time pass: scan every `.codex` and `.source` notebook in the workspace
* and bring it up to `CURRENT_SCHEMA_VERSION` via the shared schema migration ladder.
*
* There is no completion flag — the per-file `metadata.schemaVersion` field IS the
* truth, so the activation pass is just a fast read-only scan on a settled project
* (no writes when every file is already current). On first run after upgrade, it
* does the work; subsequent runs are nearly free.
*/
export const migration_normalizeAllNotebooksToCurrentSchema = async (
_context?: vscode.ExtensionContext
) => {
try {
const workspaceFolders = vscode.workspace.workspaceFolders;
if (!workspaceFolders || workspaceFolders.length === 0) {
return;
}

const files = await vscode.workspace.findFiles("**/*.{codex,source}");
if (files.length === 0) {
return;
}

let author = "anonymous";
try {
const authApi = await getAuthApi();
const userInfo = await authApi?.getUserInfo();
if (userInfo?.username) {
author = userInfo.username;
}
} catch (_) { /* ignore */ }

let migratedFiles = 0;
let scannedFiles = 0;
let aheadOfClientFiles = 0;

await vscode.window.withProgress(
{
location: vscode.ProgressLocation.Notification,
title: "Checking notebook schema versions...",
cancellable: false,
},
async (progress) => {
for (let i = 0; i < files.length; i++) {
const file = files[i];
progress.report({
message: `Processing file ${i + 1}/${files.length}`,
increment: 100 / files.length,
});

const result = await bringNotebookToCurrentForFile(file, { author });
scannedFiles++;
if (result.migrated) migratedFiles++;
if (result.aheadOfClient) aheadOfClientFiles++;
}
}
);

if (migratedFiles > 0 || aheadOfClientFiles > 0) {
debug(
`Schema normalization scan complete: ${migratedFiles}/${scannedFiles} migrated, ${aheadOfClientFiles} ahead of client.`
);
}
} catch (error) {
console.error("Error running schema normalization migration:", error);
}
};
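
The "no completion flag" design described in the doc comment above can be illustrated with a small sketch in which the per-file `schemaVersion` alone decides whether anything is written. Everything here is hypothetical: `SketchFileIO`, `sketchNormalizeFile`, `sketchNormalizeAll`, and `SKETCH_TARGET_VERSION` are invented for illustration, the I/O is a synchronous stand-in for the VS Code file APIs, and the real `bringNotebookToCurrentForFile` is not shown in this diff, so its behavior for ahead-of-client files is assumed.

```typescript
interface SketchFileResult {
    migrated: boolean;
    aheadOfClient: boolean;
}

// Synchronous stand-in for file access so the sketch runs anywhere.
interface SketchFileIO {
    read(path: string): string;
    write(path: string, text: string): void;
}

// Assumption for this sketch: version 1 is "current".
const SKETCH_TARGET_VERSION = 1;

function sketchNormalizeFile(path: string, io: SketchFileIO): SketchFileResult {
    const notebook = JSON.parse(io.read(path)) as { metadata?: { schemaVersion?: number } };
    const version = notebook.metadata?.schemaVersion ?? 0;

    if (version > SKETCH_TARGET_VERSION) {
        // Assumed: files from a newer client are reported but never rewritten.
        return { migrated: false, aheadOfClient: true };
    }
    if (version === SKETCH_TARGET_VERSION) {
        // Already current: the scan stays read-only.
        return { migrated: false, aheadOfClient: false };
    }

    // Placeholder for the real ladder: only stamp the version here.
    notebook.metadata = { ...notebook.metadata, schemaVersion: SKETCH_TARGET_VERSION };
    io.write(path, JSON.stringify(notebook, null, 2));
    return { migrated: true, aheadOfClient: false };
}

function sketchNormalizeAll(
    paths: string[],
    io: SketchFileIO
): { scanned: number; migrated: number; aheadOfClient: number } {
    const summary = { scanned: 0, migrated: 0, aheadOfClient: 0 };
    for (const path of paths) {
        const result = sketchNormalizeFile(path, io);
        summary.scanned++;
        if (result.migrated) summary.migrated++;
        if (result.aheadOfClient) summary.aheadOfClient++;
    }
    return summary;
}
```

Running the scan a second time over the same files performs no writes, which is the property that lets the activation pass skip a separate "migration done" flag.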