feat: support global config fallback and configurable embedding batch size#25
Conversation
- loadConfig: fall back to ~/.claude/arch (or SOCRATICODE_GLOBAL_CONFIG_DIR env var) when project root has no .socraticodecontextartifacts.json - BATCH_SIZE: configurable via EMBEDDING_BATCH_SIZE env var, default 32 (validates positive integer, throws on invalid input)
📝 WalkthroughWalkthroughloadConfig now falls back to a global config file (env: SOCRATICODE_GLOBAL_CONFIG_DIR, default ~/.claude/arch) when a project-level Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/services/context-artifacts.ts`:
- Around line 56-61: The global config fallback currently sets
globalConfigDir/globalConfigPath and leaves artifact.path values as-is, causing
relative artifact.path to later be resolved against the project root; update the
loader in src/services/context-artifacts.ts so that when actualPath equals
globalConfigPath (or when loading from globalConfigDir) any relative
artifact.path values are resolved immediately against globalConfigDir (use
path.isAbsolute to detect relative paths and path.resolve(globalConfigDir,
artifact.path) to normalize), so downstream resolution logic that assumes
project-root resolution (the existing artifact resolution code) receives
already-correct absolute paths.
In `@src/services/embeddings.ts`:
- Around line 12-16: The EMBEDDING_BATCH_SIZE parsing currently uses
Number.parseInt which allows "64abc" and "1.5" and treats empty string as unset;
update the validation so that when raw is undefined/null you still return the
default 32, but otherwise require raw to be a non-empty string of only digits
(e.g. /^\d+$/) before parsing; then parse with Number.parseInt(raw, 10), ensure
parsed > 0, and throw a clear Error referencing EMBEDDING_BATCH_SIZE/raw/parsed
if the value is empty, contains non-digits, is a float-like string, or is
non-positive.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 102f1e2f-780e-4909-b038-199a461d1c3a
📒 Files selected for processing (2)
src/services/context-artifacts.tssrc/services/embeddings.ts
| if (!raw) return 32; | ||
| const parsed = Number.parseInt(raw, 10); | ||
| if (!Number.isFinite(parsed) || parsed <= 0) { | ||
| throw new Error( | ||
| `Invalid EMBEDDING_BATCH_SIZE: "${raw}". Must be a positive integer.`, |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
`In JavaScript, what are the exact results of:
- Number.parseInt("1.5", 10)
- Number.parseInt("64abc", 10)
- Number.parseInt("", 10)
And does parseInt enforce full-string integer validation?`
💡 Result:
Number.parseInt("1.5", 10) returns 1 Number.parseInt("64abc", 10) returns 64 Number.parseInt("", 10) returns NaN parseInt does not enforce full-string integer validation. It parses the longest prefix of valid integer digits in the specified radix and ignores subsequent invalid characters.
Citations:
- 1: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/parseInt
- 2: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt
- 3: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt.
- 4: https://javascript.plainenglish.io/javascript-parseint-unveiling-unexpected-behaviors-and-rules-329cacdd0c02
- 5: https://www.techonthenet.com/js/number_parseint.php
🏁 Script executed:
cat -n src/services/embeddings.ts | head -25Repository: giancarloerra/SocratiCode
Length of output: 1084
Strict integer validation is bypassable for EMBEDDING_BATCH_SIZE.
Current parsing accepts non-integer/partial inputs and treats empty string as unset:
Number.parseInt("64abc", 10)returns64(ignores trailing chars)Number.parseInt("1.5", 10)returns1(truncates decimal)- Empty string falls back to default 32 instead of throwing
This breaks the stated "positive integer" contract.
Proposed fix
const BATCH_SIZE: number = (() => {
const raw = process.env.EMBEDDING_BATCH_SIZE;
- if (!raw) return 32;
- const parsed = Number.parseInt(raw, 10);
- if (!Number.isFinite(parsed) || parsed <= 0) {
+ if (raw === undefined) return 32;
+ const normalized = raw.trim();
+ if (!/^[1-9]\d*$/.test(normalized)) {
throw new Error(
`Invalid EMBEDDING_BATCH_SIZE: "${raw}". Must be a positive integer.`,
);
}
+ const parsed = Number(normalized);
return parsed;
})();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/services/embeddings.ts` around lines 12 - 16, The EMBEDDING_BATCH_SIZE
parsing currently uses Number.parseInt which allows "64abc" and "1.5" and treats
empty string as unset; update the validation so that when raw is undefined/null
you still return the default 32, but otherwise require raw to be a non-empty
string of only digits (e.g. /^\d+$/) before parsing; then parse with
Number.parseInt(raw, 10), ensure parsed > 0, and throw a clear Error referencing
EMBEDDING_BATCH_SIZE/raw/parsed if the value is empty, contains non-digits, is a
float-like string, or is non-positive.
|
@sb-giithub thanks for the PR, both features are useful additions. A couple of things need addressing before this can be merged (flagged by Code Rabbit): 1. Relative artifact paths break with global config fallback When You'll need to either resolve relative paths against the global config directory at load time, or propagate the config source directory so downstream code resolves correctly. 2.
const num = Number(raw);
if (!Number.isInteger(num) || num <= 0) {
throw new Error(`Invalid EMBEDDING_BATCH_SIZE: "${raw}". Must be a positive integer.`);
}
return num; |
…ch size validation 1. Global config fallback: relative artifact paths are now resolved against the global config directory (not the project root) when loading from fallback. Absolute paths are left unchanged. Project-level config behavior is unaffected. 2. EMBEDDING_BATCH_SIZE: replaced Number.parseInt with Number() + Number.isInteger() so that inputs like "64abc", "1.5", and "" are correctly rejected instead of silently accepted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Both issues fixed in 49b5b35: 1. Relative artifact paths with global config fallback Added a 2. Stricter Replaced
Added tests covering both fixes, including the existing "returns null when no config" test which now isolates itself from real global configs via env var override. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tests/unit/embedding-batch-size.test.ts (1)
12-21: Avoid duplicating parsing logic in test code.
parseBatchSizemirrors production logic, so test and implementation can drift together. Prefer sharing one parser implementation (e.g., internal util used by bothsrc/services/embeddings.tsand this test).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/embedding-batch-size.test.ts` around lines 12 - 21, The test defines a local parseBatchSize function that duplicates production parsing logic; remove this duplication by importing and using the canonical parser from the production code (the parser used by src/services/embeddings.ts) instead of the local parseBatchSize; update the test to import the shared function (or an exported utility) and assert behavior through that single implementation so tests and production stay in sync, ensuring any test-specific defaults or error messages match the exported function's behavior.tests/unit/context-artifacts.test.ts (1)
45-57: Consider extracting env override/restore into a helper.The
SOCRATICODE_GLOBAL_CONFIG_DIRsave/restore block is duplicated in several tests; a small helper would make this suite easier to maintain.♻️ Optional refactor sketch
+async function withGlobalConfigDir<T>(dir: string, run: () => Promise<T>): Promise<T> { + const originalEnv = process.env.SOCRATICODE_GLOBAL_CONFIG_DIR; + process.env.SOCRATICODE_GLOBAL_CONFIG_DIR = dir; + try { + return await run(); + } finally { + if (originalEnv === undefined) { + delete process.env.SOCRATICODE_GLOBAL_CONFIG_DIR; + } else { + process.env.SOCRATICODE_GLOBAL_CONFIG_DIR = originalEnv; + } + } +}Also applies to: 175-188, 205-221, 257-269
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/context-artifacts.test.ts` around lines 45 - 57, Extract the repeated save/override/restore logic for process.env.SOCRATICODE_GLOBAL_CONFIG_DIR used around tests that call loadConfig into a small test helper (e.g., withTemporaryEnv or overrideEnv) that accepts the env key and temporary value and a callback/async function to run; replace the duplicated try/finally blocks in tests (the blocks around loadConfig at lines shown) with calls to this helper so it sets the env var to path.join(tempDir, "nonexistent-global"), runs the assertion (expect(config).toBeNull()) and then reliably restores the original environment whether the callback throws or resolves.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/unit/context-artifacts.test.ts`:
- Around line 45-57: Extract the repeated save/override/restore logic for
process.env.SOCRATICODE_GLOBAL_CONFIG_DIR used around tests that call loadConfig
into a small test helper (e.g., withTemporaryEnv or overrideEnv) that accepts
the env key and temporary value and a callback/async function to run; replace
the duplicated try/finally blocks in tests (the blocks around loadConfig at
lines shown) with calls to this helper so it sets the env var to
path.join(tempDir, "nonexistent-global"), runs the assertion
(expect(config).toBeNull()) and then reliably restores the original environment
whether the callback throws or resolves.
In `@tests/unit/embedding-batch-size.test.ts`:
- Around line 12-21: The test defines a local parseBatchSize function that
duplicates production parsing logic; remove this duplication by importing and
using the canonical parser from the production code (the parser used by
src/services/embeddings.ts) instead of the local parseBatchSize; update the test
to import the shared function (or an exported utility) and assert behavior
through that single implementation so tests and production stay in sync,
ensuring any test-specific defaults or error messages match the exported
function's behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1176d5b4-209a-44f9-892d-64eb5ec20c49
📒 Files selected for processing (4)
src/services/context-artifacts.tssrc/services/embeddings.tstests/unit/context-artifacts.test.tstests/unit/embedding-batch-size.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/services/context-artifacts.ts
|
@sb-giithub thanks, merged! |
Summary
Two small, independent improvements that make SocratiCode easier to share knowledge across projects and to tune for different hardware:
Global fallback for
.socraticodecontextartifacts.json— when a project root has no config file, fall back to a global location. Configurable viaSOCRATICODE_GLOBAL_CONFIG_DIRenv var; defaults to~/.claude/arch/. Lets users maintain a shared knowledge base (architecture docs, SDK conventions, etc.) once and reuse it across all projects without duplicating config files.EMBEDDING_BATCH_SIZEenv var — the hardcodedBATCH_SIZE = 32inembeddings.tsis now configurable. Default unchanged (32). Users with high-throughput embedding providers (TEI on GPU, OpenAI, etc.) can raise it (e.g. 128) for better throughput; users on constrained hardware can lower it.Both changes are fully backwards-compatible — without setting either env var, behavior is identical to current main.
Changes
src/services/context-artifacts.ts:loadConfig()falls back to global config dir when project-level file is absentsrc/services/embeddings.ts:BATCH_SIZEreads fromEMBEDDING_BATCH_SIZEenv var with validation (positive integer, throws on invalid)Test plan
SOCRATICODE_GLOBAL_CONFIG_DIRenv var override worksEMBEDDING_BATCH_SIZE=64loads correctlyEMBEDDING_BATCH_SIZE=invalidthrows clear errornpm run buildpassesSummary by CodeRabbit
New Features
Tests