Hi — first off, thank you for 2343f8e and the broader 10-language manifest-parsing work. It's a huge improvement for cross-package IMPORTS accuracy.
I'd like to propose extending that architecture to handle workspace-level structure in monorepos: declarations where a root manifest enumerates child packages via globs. Today those declarations are effectively ignored — each child manifest is parsed independently and workspace membership isn't a first-class concept in the pkgmap.
Specifically:
- npm workspaces: package.json with "workspaces": ["packages/*"] (array form) or "workspaces": { "packages": [...] } (object form)
- pnpm workspaces: pnpm-workspace.yaml with a top-level packages: list
- Cargo workspaces: Cargo.toml with [workspace] members = [...]
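To make "enumerates child packages via globs" concrete, here is a minimal sketch of how a member check could look. The name ws_glob_match_dir is hypothetical and not part of the existing codebase, and the matcher is deliberately limited: it assumes repo-relative directory paths and handles only literal segments and a whole-segment "*". It does not handle "**" or pnpm's "!" exclusions, which a real implementation would need to decide on.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not an existing cbm_* function).
 * Returns true if the repo-relative directory `dir` matches the workspace
 * glob `pattern`. Only literal segments and a whole-segment "*" are
 * supported; "*" matches exactly one non-empty path segment. */
bool ws_glob_match_dir(const char *pattern, const char *dir)
{
    while (*pattern && *dir) {
        size_t plen = strcspn(pattern, "/");   /* length of current pattern segment */
        size_t dlen = strcspn(dir, "/");       /* length of current directory segment */
        bool star = (plen == 1 && pattern[0] == '*');

        if (star) {
            if (dlen == 0)
                return false;                  /* "*" must consume a real segment */
        } else if (plen != dlen || strncmp(pattern, dir, plen) != 0) {
            return false;                      /* literal segment mismatch */
        }

        pattern += plen;
        dir += dlen;
        if (*pattern == '/') pattern++;        /* step over the separator */
        if (*dir == '/') dir++;
    }
    /* Both strings must be fully consumed for the depths to agree. */
    return *pattern == '\0' && *dir == '\0';
}
```

With this sketch, "packages/*" matches "packages/core" but not "packages/core/src" or "packages" itself, which is the one-level behavior the array-form example above implies.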
Why this matters
In a monorepo, workspace structure affects import resolution ordering and correctness:
- The parallel extract sorts files by size descending, so a root package.json may race with its members. Today a member parsing before the root is OK because pkgmap lookups are symmetric, but any future disambiguation logic (e.g., "prefer same-workspace match on name collision") needs workspace identity to be known first.
- pnpm's workspace manifest is not a package manifest at all — it's a sibling YAML file that isn't currently touched by any parser.
- Cargo workspaces declare members = [...], but member crates' Cargo.toml files are parsed independently with no linkage back to the root.
Proposed approach (summary)
Add a new function, cbm_workspace_try_detect, as a sibling of cbm_pkgmap_try_parse; it runs in a lightweight pre-scan phase. The pre-scan identifies workspace roots (basename match + tiny-scope field check) and populates a new cbm_workspaces_t global alongside g_pkgmap. The existing per-file parse loop is unchanged; workspace info becomes a fallback resolver in cbm_pipeline_resolve_module, consulted only after the three prefix resolvers.
Design invariants:
- No new pipeline pass — extends the existing parse/merge/resolve architecture.
- Pre-scan is basename-only (no file IO beyond what would have happened anyway).
- Graceful degradation: corrupted or unknown workspace manifests fall through to per-package behavior.
- Pure C, follows CONTRIBUTING.md.
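As a rough illustration of the basename-match step of the pre-scan, the sketch below classifies a path by its final component only, without opening the file. The names ws_kind_t and ws_kind_from_path are hypothetical placeholders, not existing symbols, and the real cbm_workspace_try_detect would still need the small field check (a "workspaces" key or a [workspace] table) before treating a package.json or Cargo.toml as a workspace root.

```c
#include <string.h>

/* Hypothetical classification of candidate workspace manifests. */
typedef enum {
    WS_KIND_NONE = 0,
    WS_KIND_NPM,    /* package.json: candidate only, needs a "workspaces" field check */
    WS_KIND_PNPM,   /* pnpm-workspace.yaml: always a workspace manifest */
    WS_KIND_CARGO   /* Cargo.toml: candidate only, needs a [workspace] table check */
} ws_kind_t;

/* Classify a '/'-separated path by basename alone; performs no file IO. */
ws_kind_t ws_kind_from_path(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;

    if (strcmp(base, "package.json") == 0)        return WS_KIND_NPM;
    if (strcmp(base, "pnpm-workspace.yaml") == 0) return WS_KIND_PNPM;
    if (strcmp(base, "Cargo.toml") == 0)          return WS_KIND_CARGO;
    return WS_KIND_NONE;
}
```

Anything that returns WS_KIND_NONE is skipped by the pre-scan, and a candidate whose field check fails or whose manifest is corrupted would simply fall through to today's per-package behavior.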
I have a detailed design doc ready to share if helpful, including data structure additions, insertion points (lines in pass_pkgmap.c), test cases modeled on your existing test_framework.h pattern, and a phased rollout (3 PRs — Foundation / Formats / Resolver).
Coordination with PR #243
I've read PR #243. It doesn't touch pass_pkgmap.c directly, so there's no file-level collision, but it introduces cbm_path_alias_map_t / cbm_tsconfig_collection_t loaded at pipeline start and modifies pipeline.c, pipeline_incremental.c, and pass_parallel.c — the same files a g_workspaces global would touch for init/cleanup and the incremental-reindex path.
To avoid churn for you, I'd prefer to either (a) wait for #243 to merge and rebase my work on top, or (b) coordinate code-path boundaries with the #243 author directly if you'd rather land them closer together. Happy with either.
Questions
- Does this direction align with your vision post-2343f8e? I want to extend your architecture (pre-scan phase in cbm_pkgmap_build, sibling to cbm_pkgmap_try_parse), not replace it.
- Scope preference: phased or monolithic? I'd propose three PRs (npm first, then pnpm+Cargo, then an optional resolver if a concrete bug motivates it). Is that welcome, or do you prefer seeing the full picture in one PR for review?
- Naming/style conventions for workspace-related symbols? I've been assuming cbm_workspace_* / cbm_workspaces_t to mirror your existing cbm_pkg_* / cbm_pkgmap_t pattern. Let me know if you'd prefer otherwise.
What you'll see next
If helpful for faster iteration, once you've signaled direction I can open a draft PR for Phase 1 (~250 LoC + tests, npm only) so there's something concrete to react to. Happy to wait on your signal here first.
I have a full design doc — insertion points, data structure, test cases — ready to share on request.
Thanks for the great work on 2343f8e and for maintaining this project.