Workspace support: package.json#workspaces, pnpm-workspace.yaml, and [workspace] in Cargo.toml — proposal #271

@Dialectician

Hi — first off, thank you for 2343f8e and the broader 10-language manifest-parsing work. It's a huge improvement for cross-package IMPORTS accuracy.

I'd like to propose extending that architecture to handle workspace-level structure in monorepos: declarations where a root manifest enumerates child packages via globs. Today those declarations are effectively ignored — each child manifest is parsed independently and workspace membership isn't a first-class concept in the pkgmap.

Specifically:

  • npm workspaces: package.json with "workspaces": ["packages/*"] (array form) or "workspaces": { "packages": [...] } (object form)
  • pnpm workspaces: pnpm-workspace.yaml with a top-level packages: list
  • Cargo workspaces: Cargo.toml with [workspace] members = [...]
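
To make "enumerates child packages via globs" concrete, here is a minimal sketch of a membership check. It assumes member directories are stored relative to the workspace root with `/` separators, and it uses POSIX `fnmatch` with `FNM_PATHNAME` so that `*` stays within one path segment. `ws_is_member` is a hypothetical name, not an existing symbol, and the real glob dialects (npm, pnpm, Cargo) differ in details such as `**` and negation that this does not cover.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if member_dir (relative to the workspace
 * root, '/'-separated) matches any of the n workspace patterns.
 * FNM_PATHNAME keeps '*' from crossing a '/' boundary, so "packages/*"
 * matches "packages/core" but not "packages/core/src". */
static bool ws_is_member(const char *const *patterns, size_t n,
                         const char *member_dir) {
    assert(member_dir != NULL);
    for (size_t i = 0; i < n; i++) {
        if (patterns[i] != NULL &&
            fnmatch(patterns[i], member_dir, FNM_PATHNAME) == 0) {
            return true;
        }
    }
    return false;
}
```

On platforms without `fnmatch`, a small hand-written single-segment matcher would be needed instead.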

Why this matters

In a monorepo, workspace structure affects import resolution ordering and correctness:

  1. The parallel extract sorts files by size descending, so a root package.json may race with its members. Today it is harmless for a member to be parsed before the root, because pkgmap lookups are symmetric. Any future disambiguation logic (e.g., "prefer same-workspace match on name collision"), however, needs workspace identity to be known before resolution starts.
  2. pnpm's workspace manifest is not a package manifest at all — it's a sibling YAML file that isn't currently touched by any parser.
  3. Cargo workspaces declare members = [...], but member crates' Cargo.toml files are parsed independently with no linkage back to the root.
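
As an illustration of the "prefer same-workspace match on name collision" idea from point 1, here is a rough sketch. Everything in it is hypothetical: `ws_candidate_t`, `ws_pick_candidate`, and the integer workspace ids are placeholders for whatever the pkgmap would actually store, and falling back to the first candidate is an assumption about how collisions resolve today.

```c
#include <assert.h>
#include <stddef.h>

#define WS_NONE (-1) /* hypothetical id for "not in any workspace" */

/* Hypothetical candidate record: one package that matched the import name. */
typedef struct {
    const char *pkg_dir; /* directory of the candidate package */
    int workspace_id;    /* workspace the package belongs to, or WS_NONE */
} ws_candidate_t;

/* Returns the index of the preferred candidate, or -1 if there are none.
 * A candidate in the importer's workspace wins; otherwise the first
 * candidate is kept, which stands in for the current behavior. */
static int ws_pick_candidate(const ws_candidate_t *cands, size_t n,
                             int importer_ws) {
    assert(cands != NULL || n == 0);
    if (n == 0) {
        return -1;
    }
    if (importer_ws != WS_NONE) {
        for (size_t i = 0; i < n; i++) {
            if (cands[i].workspace_id == importer_ws) {
                return (int)i;
            }
        }
    }
    return 0;
}
```

The point is only that `importer_ws` must already be known when resolution runs, which is why workspace detection has to complete before the per-file results are consumed.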

Proposed approach (summary)

Extend cbm_pkgmap_try_parse with a sibling function cbm_workspace_try_detect that runs in a lightweight pre-scan phase. The pre-scan identifies workspace roots (basename match + tiny-scope field check) and populates a new cbm_workspaces_t global alongside g_pkgmap. The existing per-file parse loop is unchanged; workspace info becomes a fallback resolver in cbm_pipeline_resolve_module after the three prefix resolvers.
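
The "fallback resolver" ordering could look roughly like the sketch below: resolvers are tried in order and the first hit wins, so a workspace resolver appended at the end never overrides the existing prefix resolvers. The `resolve_fn` type, `resolve_chain`, and the two `demo_*` resolvers are illustrative assumptions; they do not reflect the actual signature of cbm_pipeline_resolve_module.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical resolver signature: returns a package directory or NULL. */
typedef const char *(*resolve_fn)(const char *module);

/* Tries each resolver in order; the first non-NULL result wins. */
static const char *resolve_chain(const resolve_fn *chain, size_t n,
                                 const char *module) {
    assert(module != NULL);
    for (size_t i = 0; i < n; i++) {
        const char *hit = (chain[i] != NULL) ? chain[i](module) : NULL;
        if (hit != NULL) {
            return hit;
        }
    }
    return NULL;
}

/* Stand-in for one of the existing prefix resolvers. */
static const char *demo_prefix_resolver(const char *module) {
    return strcmp(module, "@app/core") == 0 ? "packages/core" : NULL;
}

/* Stand-in for the proposed workspace fallback. */
static const char *demo_workspace_resolver(const char *module) {
    return strcmp(module, "@app/ui") == 0 ? "packages/ui" : NULL;
}
```

With the chain `{demo_prefix_resolver, demo_workspace_resolver}`, `@app/core` is answered by the prefix resolver and only `@app/ui` reaches the workspace fallback.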

Design invariants:

  • No new pipeline pass — extends the existing parse/merge/resolve architecture.
  • Pre-scan selects candidates by basename only; the follow-up field check adds no file IO beyond what would have happened anyway.
  • Graceful degradation: corrupted or unknown workspace manifests fall through to per-package behavior.
  • Pure C, follows CONTRIBUTING.md.
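
A minimal sketch of the basename step of the pre-scan, under the assumption that paths use `/` separators. `ws_kind_t` and `ws_candidate_kind` are hypothetical names. A basename hit only marks a candidate: package.json and Cargo.toml still need the field check for "workspaces" / [workspace] before they count as workspace roots, and an unrecognized file simply yields `WS_KIND_NONE` so the per-package path is unaffected.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of workspace manifest candidates. */
typedef enum {
    WS_KIND_NONE = 0, /* not a workspace manifest candidate */
    WS_KIND_NPM,      /* package.json (needs a "workspaces" field check) */
    WS_KIND_PNPM,     /* pnpm-workspace.yaml */
    WS_KIND_CARGO     /* Cargo.toml (needs a [workspace] table check) */
} ws_kind_t;

/* Returns a pointer to the last path segment, without allocating. */
static const char *ws_basename(const char *path) {
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Basename-only classification: no file is opened here. */
static ws_kind_t ws_candidate_kind(const char *path) {
    assert(path != NULL);
    const char *base = ws_basename(path);
    if (strcmp(base, "package.json") == 0) return WS_KIND_NPM;
    if (strcmp(base, "pnpm-workspace.yaml") == 0) return WS_KIND_PNPM;
    if (strcmp(base, "Cargo.toml") == 0) return WS_KIND_CARGO;
    return WS_KIND_NONE;
}
```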

I have a detailed design doc ready to share if helpful, including data structure additions, insertion points (lines in pass_pkgmap.c), test cases modeled on your existing test_framework.h pattern, and a phased rollout (3 PRs — Foundation / Formats / Resolver).

Coordination with PR #243

I've read PR #243. It doesn't touch pass_pkgmap.c directly, so there's no file-level collision, but it introduces cbm_path_alias_map_t / cbm_tsconfig_collection_t loaded at pipeline start and modifies pipeline.c, pipeline_incremental.c, and pass_parallel.c — the same files a g_workspaces global would touch for init/cleanup and the incremental-reindex path.

To avoid churn for you, I'd prefer to either (a) wait for #243 to merge and rebase my work on top, or (b) coordinate code-path boundaries with the #243 author directly if you'd rather land them closer together. Happy with either.

Questions

  1. Does this direction align with your vision post-2343f8e? I want to extend your architecture (pre-scan phase in cbm_pkgmap_build, sibling to cbm_pkgmap_try_parse), not replace it.
  2. Scope preference: phased or monolithic? I'd propose three PRs (npm first, then pnpm+Cargo, then an optional resolver if a concrete bug motivates it). Is that welcome, or do you prefer seeing the full picture in one PR for review?
  3. Naming/style conventions for workspace-related symbols? I've been assuming cbm_workspace_* / cbm_workspaces_t to mirror your existing cbm_pkg_* / cbm_pkgmap_t pattern. Let me know if you'd prefer otherwise.

What you'll see next

If helpful for faster iteration, once you've signaled direction I can open a draft PR for Phase 1 (~250 LoC + tests, npm only) so there's something concrete to react to. Happy to wait on your signal here first.

I have a full design doc — insertion points, data structure, test cases — ready to share on request.

Thanks for the great work on 2343f8e and for maintaining this project.
