
feat(memory): add Strategy Optimizer with A/B experiment framework#567

Open
nikolasdehor wants to merge 3 commits into SynkraAI:main from nikolasdehor:feat/strategy-optimizer

Conversation

@nikolasdehor
Contributor

@nikolasdehor nikolasdehor commented Mar 7, 2026

Summary

Story 9.7 of Epic 9 (Persistent Memory Layer). Integrates with Decision Memory (#564) and Reflection Engine (#565).

The Strategy Optimizer implements an A/B testing framework for optimizing agent strategies backed by statistical evidence.

Features

  • A/B experiments: define strategy variants and compare them with real data
  • Weighted assignment: weighted random selection distributes tasks across variants
  • Result collection: success/failure, duration, custom metadata
  • Statistical significance: two-proportion z-test with configurable confidence
  • Auto-conclusion: automatically promotes the winner once confidence is reached
  • Best-strategy registry: persistent taskType→strategy map
  • Concurrency control: caps the number of simultaneous experiments
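The weighted-random assignment described above can be sketched as follows; the function name and variant shape here are illustrative, not the module's actual API:

```javascript
// Minimal sketch of weighted random variant selection.
// `variants` is assumed to be an array of { name, weight } objects.
function pickWeightedVariant(variants) {
  const total = variants.reduce((sum, v) => sum + v.weight, 0);
  let r = Math.random() * total; // uniform in [0, total)
  for (const v of variants) {
    r -= v.weight;
    if (r < 0) return v.name;
  }
  // Floating-point fallback: return the last variant.
  return variants[variants.length - 1].name;
}

const variant = pickWeightedVariant([
  { name: 'tdd', weight: 3 },
  { name: 'hack-first', weight: 1 },
]); // 'tdd' roughly 75% of the time
```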

Usage Example

const optimizer = new StrategyOptimizer({ projectRoot: '.' });
await optimizer.load();

const exp = optimizer.createExperiment({
  name: 'TDD vs Hack-First',
  taskType: 'implementation',
  variants: ['tdd', 'hack-first'],
});

// During execution:
const variant = optimizer.assignVariant(exp.id);
// ... execute with variant strategy ...
optimizer.recordResult({ experimentId: exp.id, variant, success: true });

// After enough data:
const best = optimizer.getBestStrategy('implementation'); // → 'tdd'

Files

File Description
.aios-core/core/memory/strategy-optimizer.js Module (~400 lines)
tests/core/memory/strategy-optimizer.test.js 41 unit tests

Test plan

  • npx jest tests/core/memory/strategy-optimizer.test.js — 41/41 passing
  • Integrate with the Reflection Engine to feed experiments automatically (future work, separate issue)
  • aiox strategy CLI command for managing experiments (future work, separate issue)

Summary by CodeRabbit

  • New Features

    • Strategy Optimizer: end-to-end A/B experiment management (create experiments, assign variants, record results, auto-conclude and promote winning strategies).
  • Bug Fixes

    • Improved agent validation to reject invalid or non-string agent identifiers.
  • Tests

    • Added comprehensive tests for Strategy Optimizer and skill dispatching to validate lifecycle, persistence, assignment, result recording, auto-conclusion, parsing, and payload construction.

Copilot AI review requested due to automatic review settings March 7, 2026 02:24
@vercel

vercel bot commented Mar 7, 2026

@nikolasdehor is attempting to deploy a commit to Pedro Valério Lopez's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai bot commented Mar 7, 2026

Walkthrough

Adds a new StrategyOptimizer module for A/B experiment orchestration (persistence, variant assignment, result recording, statistical conclusion and automatic promotion), extensive tests for the optimizer and SkillDispatcher, a small input-validation fix in SkillDispatcher, and an updated install manifest metadata.

Changes

Cohort / File(s) Summary
StrategyOptimizer Implementation
/.aios-core/core/memory/strategy-optimizer.js
New module implementing experiment lifecycle (create, cancel), weighted variant assignment, result recording, persistence to .aiox/strategy-experiments.json, confidence evaluation and automatic promotion. Exports StrategyOptimizer, ExperimentStatus, Events, and CONFIG; emits lifecycle events.
StrategyOptimizer Tests
tests/core/memory/strategy-optimizer.test.js
New, extensive unit tests covering constructor/config, persistence (load/save, schema handling), experiment lifecycle, assignment, result recording, auto-conclusion and promotion, best-strategy APIs, cancellation, listing and stats.
SkillDispatcher Tests
tests/core/orchestration/skill-dispatcher.test.js
Added tests exercising skill mapping, alias resolution, persona lookup, dispatch payload construction, output parsing (markdown/JSON/plain), logging formatting, agent listing/validation, and skip/result formatting.
SkillDispatcher Small Fix
/.aiox-core/core/orchestration/skill-dispatcher.js
Added input validation in isValidAgent to return false for falsy or non-string agentId before existing checks.
Install Manifest Update
/.aiox-core/install-manifest.yaml
Updated generated_at timestamp and refreshed file-level hashes/sizes across many assets (metadata only).

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant SO as StrategyOptimizer
    participant Exp as Experiment
    participant Stats as ConfidenceCalc
    participant Store as FileStore
    participant Events as EventEmitter

    User->>SO: createExperiment(config)
    SO->>Exp: initialize experiment & variants
    SO->>Events: emit EXPERIMENT_CREATED
    SO->>Store: persist state

    loop experiment lifecycle
        User->>SO: assignVariant(expId)
        SO->>Exp: weighted selection
        SO-->>User: return variant

        User->>SO: recordResult(expId, variantId, success)
        SO->>Exp: update counters
        SO->>Events: emit RESULT_RECORDED

        SO->>Stats: _computeConfidence(pairwise)
        Stats-->>SO: confidence scores
        alt threshold met
            SO->>Exp: mark CONCLUDED
            SO->>SO: setBestStrategy(winner)
            SO->>Events: emit CONCLUDED & PROMOTED
            SO->>Store: persist updated state
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main addition: a Strategy Optimizer module implementing an A/B experiment framework for memory optimization.



Copilot AI left a comment

Pull request overview

Adds a new persistent-memory component (“Strategy Optimizer”) intended to run A/B experiments on agent strategies, record outcomes, and promote statistically “best” strategies, plus introduces a new unit test suite for the orchestration SkillDispatcher.

Changes:

  • Added StrategyOptimizer module with experiment lifecycle (create/assign/record/conclude), persistence, and event emission.
  • Added comprehensive Jest unit tests for StrategyOptimizer.
  • Added a new Jest unit test suite for SkillDispatcher.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.

File Description
tests/core/orchestration/skill-dispatcher.test.js Adds unit tests for SkillDispatcher behavior (agent mapping, payload building, output parsing, logging).
tests/core/memory/strategy-optimizer.test.js Adds unit tests covering StrategyOptimizer persistence, experiment creation, assignment, result recording, auto-conclusion, and stats.
.aios-core/core/memory/strategy-optimizer.js Implements StrategyOptimizer with persistence to .aiox/strategy-experiments.json, weighted assignment, and auto-promotion logic.


* invocations and handles dispatch payloads and result parsing.
*/

const SkillDispatcher = require('../../../.aios-core/core/orchestration/skill-dispatcher');
Copilot AI Mar 7, 2026

The test imports SkillDispatcher from ".aios-core/...", but the implementation in this repo lives under ".aiox-core/core/orchestration/skill-dispatcher.js". As written, this test will fail with a module-not-found error; update the require path to the correct directory.

Suggested change
const SkillDispatcher = require('../../../.aios-core/core/orchestration/skill-dispatcher');
const SkillDispatcher = require('../../../.aiox-core/core/orchestration/skill-dispatcher');

Comment on lines +43 to +46
test('has aliases in skill mapping', () => {
expect(dispatcher.skillMapping['ux-expert']).toBe('AIOS:agents:ux-design-expert');
expect(dispatcher.skillMapping['github-devops']).toBe('AIOS:agents:devops');
});
Copilot AI Mar 7, 2026

The expectations here use the "AIOS:agents:" skill prefix, but the current SkillDispatcher implementation returns "AIOX:agents:" (see .aiox-core/core/orchestration/skill-dispatcher.js). Update the expected strings (and any related assertions) to match the real prefix to avoid test failures.

Comment on lines +410 to +412
// Simple sigmoid-like mapping: z=1.65→90%, z=1.96→95%, z=2.58→99%
const confidence = 1 / (1 + Math.exp(-1.7 * (z - 1)));
return Math.min(1.0, Math.max(0, confidence));
Copilot AI Mar 7, 2026

PR description mentions a Z-test of proportions with a configurable confidence level, but the implementation maps the z-score through a custom sigmoid (1/(1+exp(-1.7*(z-1)))) rather than computing a p-value / confidence from the normal CDF. This makes confidenceThreshold non-standard (e.g., 0.95 won’t correspond to 95% significance). Consider implementing a proper two-proportion z-test (p-value) or updating the docs to match the actual method.
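For reference, a standard two-proportion z-test along the lines this comment suggests could look like the sketch below. This is illustrative only, not the PR's code; `normalCdf` uses the Abramowitz-Stegun erf approximation:

```javascript
// Standard normal CDF via an erf approximation (Abramowitz-Stegun 7.1.26).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const erf =
    1 -
    (((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t) *
      Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Two-proportion z-test with pooled variance; returns the two-sided
// confidence that the success rates of A and B actually differ.
function twoProportionConfidence(successA, totalA, successB, totalB) {
  const pA = successA / totalA;
  const pB = successB / totalB;
  const pPool = (successA + successB) / (totalA + totalB);
  const se = Math.sqrt(pPool * (1 - pPool) * (1 / totalA + 1 / totalB));
  if (se === 0) return 0; // all successes or all failures: no evidence either way
  const z = Math.abs(pA - pB) / se;
  return 2 * normalCdf(z) - 1;
}
```

With this form, a `confidenceThreshold` of 0.95 does correspond to the conventional 95% significance level.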

const primary = [
'architect', 'data-engineer', 'dev', 'qa',
'pm', 'po', 'sm', 'analyst',
'ux-design-expert', 'devops', 'aios-master',
Copilot AI Mar 7, 2026

The expected primary agent list includes aios-master, but the current SkillDispatcher implementation uses aiox-master (and PRIMARY_AGENTS is keyed on that). Update the expected agent ID here to match the implementation so this test doesn’t fail.

Suggested change
'ux-design-expert', 'devops', 'aios-master',
'ux-design-expert', 'devops', 'aiox-master',

Comment on lines +242 to +246
if (result.success) variant.successCount++;
else variant.failureCount++;

if (result.durationMs) variant.totalDurationMs += result.durationMs;

Copilot AI Mar 7, 2026

if (result.durationMs) variant.totalDurationMs += result.durationMs; will skip accumulating when durationMs is 0. Use a numeric/nullish check (e.g., typeof durationMs === 'number') to ensure accumulation logic matches the stored value.
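A minimal illustration of the difference between `||` and `??` for a legitimate 0 ms sample:

```javascript
// `||` treats 0 as falsy and drops it; `??` only replaces null/undefined.
const result = { durationMs: 0 };
const stored = result.durationMs ?? null;  // 0 (preserved)
const broken = result.durationMs || null;  // null (dropped)
const isNumeric = typeof result.durationMs === 'number' &&
  Number.isFinite(result.durationMs);      // true
```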

Comment on lines +167 to +170
variants: experiment.variants.map((name, i) => ({
name,
weight: weights[i] || 1,
results: [],
Copilot AI Mar 7, 2026

weight: weights[i] || 1 will silently turn an explicit weight of 0 into 1. Use nullish coalescing (weights[i] ?? 1) and validate experiment.weights entries are finite numbers (and non-negative) to avoid breaking weighted assignment.


const CONFIG = {
experimentsPath: '.aiox/strategy-experiments.json',
maxExperiments: 50,
Copilot AI Mar 7, 2026

CONFIG.maxExperiments is defined but never enforced, so this.experiments (and the persisted JSON) can grow indefinitely. Either enforce this cap/prune old experiments or remove the unused config to avoid misleading configuration and unbounded disk growth over time.

Suggested change
maxExperiments: 50,
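If the cap is kept rather than removed, enforcement could look like the sketch below. Status values and field names are assumed from the PR description, not taken from the actual module:

```javascript
// Prune the oldest finished experiments so the registry stays at or below
// `maxExperiments`; running experiments are never dropped.
function pruneExperiments(experiments, maxExperiments) {
  if (experiments.length < maxExperiments) return experiments;
  const running = experiments.filter((e) => e.status === 'running');
  const finished = experiments
    .filter((e) => e.status !== 'running')
    .sort((a, b) => new Date(b.concludedAt) - new Date(a.concludedAt)); // newest first
  const keep = Math.max(0, maxExperiments - running.length);
  return [...running, ...finished.slice(0, keep)];
}
```

Calling this before each createExperiment() would bound both the in-memory list and the persisted JSON.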

Comment on lines +236 to +240
success: result.success,
durationMs: result.durationMs || null,
metadata: result.metadata || null,
recordedAt: new Date().toISOString(),
});
Copilot AI Mar 7, 2026

durationMs: result.durationMs || null will convert a valid 0 duration into null. Use nullish coalescing (??) so 0 is preserved, and consider validating that durationMs is a finite number before storing it.

Comment on lines +87 to +91
async load() {
const filePath = this._getFilePath();
try {
if (fs.existsSync(filePath)) {
const raw = fs.readFileSync(filePath, 'utf8');
Copilot AI Mar 7, 2026

When fs.existsSync(filePath) is false, load() leaves any existing in-memory experiments/bestStrategies intact. Consider explicitly resetting to an empty state when the file is missing so load() behaves predictably if called on a reused instance.

*/
cancelExperiment(experimentId) {
const exp = this.experiments.find((e) => e.id === experimentId);
if (!exp) throw new Error(`Unknown experiment: ${experimentId}`);
Copilot AI Mar 7, 2026

cancelExperiment() is described as cancelling a running experiment, but it will currently cancel experiments in any status (including concluded) and overwrite concludedAt. Add a status check and throw (or no-op) when the experiment is not running to prevent accidental state corruption.

Suggested change
if (!exp) throw new Error(`Unknown experiment: ${experimentId}`);
if (!exp) throw new Error(`Unknown experiment: ${experimentId}`);
if (exp.status !== ExperimentStatus.RUNNING) {
throw new Error(
`Cannot cancel experiment ${experimentId} because it is not running (status: ${exp.status})`
);
}

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (1)
tests/core/memory/strategy-optimizer.test.js (1)

10-10: Prefer the repo's absolute import for StrategyOptimizer.

This deep relative path is easy to break during moves and does not follow the shared JS import convention.

As per coding guidelines, **/*.{js,jsx,ts,tsx}: Use absolute imports instead of relative imports in all code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/memory/strategy-optimizer.test.js` at line 10, Replace the deep
relative require for StrategyOptimizer with the repository's canonical absolute
import; locate the require call that imports StrategyOptimizer (currently
"require('../../../.aios-core/core/memory/strategy-optimizer')") and change it
to the project's absolute module path (e.g.,
require('core/memory/strategy-optimizer') or the repo alias used elsewhere) so
tests use the shared absolute import convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.aios-core/core/memory/strategy-optimizer.js:
- Around line 285-291: The cancelExperiment method currently allows cancelling
any experiment state; change it to only permit cancellation when the experiment
is in ExperimentStatus.RUNNING: inside cancelExperiment (on
this.experiments.find result) check exp.status === ExperimentStatus.RUNNING and
throw a descriptive Error if not, so concluded or already-cancelled experiments
cannot be rewritten; only after that set exp.status =
ExperimentStatus.CANCELLED, exp.concludedAt = new Date().toISOString(), and emit
Events.EXPERIMENT_CANCELLED.
- Around line 221-245: The recordResult logic must enforce that result.success
is strictly a boolean and must preserve 0 values for durationMs; update
recordResult to validate with typeof result.success !== 'boolean' (throw if not
boolean), push the result using presence checks (use 'durationMs' in result or
Object.prototype.hasOwnProperty.call(result, "durationMs") to store durationMs
or null, and similarly for metadata), update counters with explicit checks (if
result.success === true increment successCount; else if result.success === false
increment failureCount), and add to variant.totalDurationMs only when durationMs
is present (i.e., not undefined/null) so that a 0 ms sample is included in
totals/averages; apply the same strict checks in the other occurrence of this
logic (the block around the second occurrence noted in the review).
- Around line 89-107: The current load block around filePath swallows all errors
and resets this.experiments and this.bestStrategies even for transient FS/read
errors; change the error handling so only parse/schema-version failures
clear/reset the in-memory registry while filesystem/read errors are surfaced
(rethrow or log and return) to avoid accidental history wipe on later save();
specifically, keep the fs.existsSync + fs.readFileSync + JSON.parse flow, handle
SyntaxError (JSON.parse) and schemaVersion mismatch by resetting
this.experiments and this.bestStrategies and setting this._loaded, but for other
exceptions (fs errors, permission errors) do not reset state—rethrow or
propagate the error to the caller (or log and return) so save() won’t overwrite
valid history.
- Around line 145-170: Before constructing the experiment entry, validate the
variants and weights: ensure each variant in experiment.variants is a non-empty
string and that all names are unique (to avoid recordResult ambiguity), ensure
weights (either experiment.weights or generated weights) has the same length as
variants and that every weight is a finite number > 0 (no 0, negative, NaN, or
Infinity). If any check fails, throw a clear Error (e.g., "Invalid variant
names" or "Invalid weights: must be positive finite numbers matching variants
length") so the experiment is rejected before the entry is created in the
experiment creation code that builds id/name/taskType/variants and before using
recordResult.

In `@tests/core/orchestration/skill-dispatcher.test.js`:
- Around line 44-45: The test expectations use the wrong skill prefix; update
the assertions that reference dispatcher.skillMapping (e.g., the lines asserting
'ux-expert' and 'github-devops' and the other failing expectations at 54-65 and
127) to use the correct 'AIOX:' prefix instead of 'AIOS:' so they match the
source-of-truth mapping in skill-dispatcher.js; search for any other occurrences
of "AIOS:agents:" in the test file and replace them with "AIOX:agents:" to keep
all expectations consistent with dispatcher.skillMapping.
- Line 8: Replace the broken relative require for SkillDispatcher in
tests/core/orchestration/skill-dispatcher.test.js with an absolute import that
matches the repo convention and actual package location (e.g.,
require('core/orchestration/skill-dispatcher') or the correct package alias used
by the project), so update the import that currently points to
../../../.aios-core/... to the absolute module path for SkillDispatcher and
ensure the module name matches the exported identifier SkillDispatcher.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f3b6f55b-49ca-40fa-b514-afd9e644aacf

📥 Commits

Reviewing files that changed from the base of the PR and between fcfb757 and 8f8252b.

📒 Files selected for processing (3)
  • .aios-core/core/memory/strategy-optimizer.js
  • tests/core/memory/strategy-optimizer.test.js
  • tests/core/orchestration/skill-dispatcher.test.js

Comment on lines +89 to +107
try {
if (fs.existsSync(filePath)) {
const raw = fs.readFileSync(filePath, 'utf8');
const data = JSON.parse(raw);

if (data.schemaVersion !== this.config.schemaVersion) {
this.experiments = [];
this.bestStrategies = new Map();
this._loaded = true;
return;
}

this.experiments = Array.isArray(data.experiments) ? data.experiments : [];
this.bestStrategies = new Map(Object.entries(data.bestStrategies || {}));
}
} catch {
this.experiments = [];
this.bestStrategies = new Map();
}

⚠️ Potential issue | 🟠 Major

Don't collapse every load failure into an empty registry.

A transient read or permission error currently looks the same as “no experiments”, and a later successful save() can wipe valid history. Reset on parse/schema problems, but surface filesystem failures instead.

As per coding guidelines, **/*.js: Verify error handling is comprehensive.


Comment on lines +145 to +170
if (!experiment.name || !experiment.taskType || !experiment.variants) {
throw new Error('Required fields: name, taskType, variants');
}
if (!Array.isArray(experiment.variants) || experiment.variants.length < 2) {
throw new Error('variants must be an array with at least 2 entries');
}

// Check concurrent limit
const running = this.experiments.filter((e) => e.status === ExperimentStatus.RUNNING).length;
if (running >= this.config.maxConcurrentExperiments) {
throw new Error(
`Max concurrent experiments (${this.config.maxConcurrentExperiments}) reached`,
);
}

const weights = experiment.weights || experiment.variants.map(() => 1);

const entry = {
id: `exp_${Date.now()}_${Math.random().toString(36).slice(2, 6)}`,
name: experiment.name,
taskType: experiment.taskType,
description: experiment.description || '',
variants: experiment.variants.map((name, i) => ({
name,
weight: weights[i] || 1,
results: [],

⚠️ Potential issue | 🟠 Major

Validate variant names and weights before creating the experiment.

Duplicate variant names make recordResult() ambiguous, and malformed weights ([1], 0, negative, NaN) silently bias assignment instead of failing fast.

Suggested guardrails
   if (!Array.isArray(experiment.variants) || experiment.variants.length < 2) {
     throw new Error('variants must be an array with at least 2 entries');
   }
+  if (experiment.variants.some((name) => typeof name !== 'string' || !name.trim())) {
+    throw new Error('variants must contain non-empty names');
+  }
+  if (new Set(experiment.variants).size !== experiment.variants.length) {
+    throw new Error('variants must be unique');
+  }
 
   const weights = experiment.weights || experiment.variants.map(() => 1);
+  if (
+    !Array.isArray(weights) ||
+    weights.length !== experiment.variants.length ||
+    weights.some((weight) => !Number.isFinite(weight) || weight <= 0)
+  ) {
+    throw new Error('weights must contain one positive number per variant');
+  }
@@
-        weight: weights[i] || 1,
+        weight: weights[i],

Comment on lines +221 to +245
recordResult(result) {
if (!result.experimentId || !result.variant || result.success === undefined) {
throw new Error('Required fields: experimentId, variant, success');
}

const exp = this.experiments.find((e) => e.id === result.experimentId);
if (!exp) throw new Error(`Unknown experiment: ${result.experimentId}`);
if (exp.status !== ExperimentStatus.RUNNING) {
throw new Error(`Experiment ${result.experimentId} is not running`);
}

const variant = exp.variants.find((v) => v.name === result.variant);
if (!variant) throw new Error(`Unknown variant: ${result.variant}`);

variant.results.push({
success: result.success,
durationMs: result.durationMs || null,
metadata: result.metadata || null,
recordedAt: new Date().toISOString(),
});

if (result.success) variant.successCount++;
else variant.failureCount++;

if (result.durationMs) variant.totalDurationMs += result.durationMs;

⚠️ Potential issue | 🟠 Major

Require a boolean success and preserve 0 ms durations.

success === undefined accepts invalid values like 'false' or null, and then JS truthiness decides which counter moves. The same truthy checks drop legitimate 0 ms samples from the stored results and average-duration math.

Tighten the result contract
-    if (!result.experimentId || !result.variant || result.success === undefined) {
+    if (!result.experimentId || !result.variant || typeof result.success !== 'boolean') {
       throw new Error('Required fields: experimentId, variant, success');
     }
+    if (result.durationMs != null && (!Number.isFinite(result.durationMs) || result.durationMs < 0)) {
+      throw new Error('durationMs must be a non-negative number');
+    }
@@
-      durationMs: result.durationMs || null,
+      durationMs: result.durationMs ?? null,
@@
-    if (result.durationMs) variant.totalDurationMs += result.durationMs;
+    if (result.durationMs != null) variant.totalDurationMs += result.durationMs;
@@
-        v.results.filter((r) => r.durationMs).length > 0
-          ? v.totalDurationMs / v.results.filter((r) => r.durationMs).length
+        v.results.filter((r) => r.durationMs != null).length > 0
+          ? v.totalDurationMs / v.results.filter((r) => r.durationMs != null).length

Also applies to: 350-353


Comment on lines +285 to +291
cancelExperiment(experimentId) {
const exp = this.experiments.find((e) => e.id === experimentId);
if (!exp) throw new Error(`Unknown experiment: ${experimentId}`);

exp.status = ExperimentStatus.CANCELLED;
exp.concludedAt = new Date().toISOString();
this.emit(Events.EXPERIMENT_CANCELLED, { id: exp.id, name: exp.name });

⚠️ Potential issue | 🟠 Major

Reject cancellation after an experiment has finished.

This currently lets a concluded or already-cancelled experiment be rewritten as cancelled, which corrupts its terminal state and emitted history. cancelExperiment() should only accept running experiments.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/core/orchestration/skill-dispatcher.test.js (1)

418-436: Good test coverage for getAvailableAgents, but verify expected count.

The test at line 435 asserts exactly 11 primary agents. This is a fragile assertion that will break if primary agents are added or removed. Consider either:

  • Documenting why 11 is the expected count, or
  • Asserting agents.length > 0 with specific spot-checks instead of exact count.

That said, the current assertion does match the PRIMARY_AGENTS set in the source (11 entries), so this is a minor observation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/skill-dispatcher.test.js` around lines 418 - 436,
The assertion that agents.length === 11 is brittle; update the test for
getAvailableAgents to avoid depending on an exact count by either documenting
why 11 is expected (reference PRIMARY_AGENTS in source) or replacing the
exact-length check with a non-fragile assertion such as
expect(agents.length).toBeGreaterThan(0) combined with the existing spot-checks
(expect(...).toContain for 'architect','dev','qa','devops','aios-master') so the
test verifies presence of key primary agents without hard-coding the full set
size.
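The non-fragile form the reviewer suggests would look roughly like this (a tiny expect shim stands in for Jest so the sketch runs standalone, and the agent array is a hypothetical stand-in for dispatcher.getAvailableAgents()):

```javascript
// Tiny expect shim so this sketch runs outside Jest; in the real test
// file, Jest provides expect() globally.
function expect(actual) {
  return {
    toBeGreaterThan: (n) => { if (!(actual > n)) throw new Error(`${actual} <= ${n}`); },
    toContain: (v) => { if (!actual.includes(v)) throw new Error(`missing ${v}`); },
  };
}

// Stand-in for dispatcher.getAvailableAgents().
const agents = ['architect', 'dev', 'qa', 'devops', 'aiox-master'];

// Non-fragile assertions: presence of key agents, not an exact count.
expect(agents.length).toBeGreaterThan(0);
expect(agents).toContain('architect');
expect(agents).toContain('aiox-master');
```

This way adding or removing a primary agent only breaks the test when one of the spot-checked names disappears.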

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 94f67aa2-0f19-4b27-88b3-89cf650c5a69

📥 Commits

Reviewing files that changed from the base of the PR and between 8f8252b and 4ce28c4.

📒 Files selected for processing (3)
  • .aiox-core/core/orchestration/skill-dispatcher.js
  • .aiox-core/install-manifest.yaml
  • tests/core/orchestration/skill-dispatcher.test.js

Comment on lines +32 to +41
test('has all primary agents in skill mapping', () => {
  const primary = [
    'architect', 'data-engineer', 'dev', 'qa',
    'pm', 'po', 'sm', 'analyst',
    'ux-design-expert', 'devops', 'aios-master',
  ];
  for (const agent of primary) {
    expect(dispatcher.skillMapping[agent]).toBeDefined();
  }
});

⚠️ Potential issue | 🔴 Critical

Fix brand name typo: `aios-master` → `aiox-master`.

The primary agents array on line 36 uses 'aios-master', but the source's PRIMARY_AGENTS set uses 'aiox-master'. This typo causes the assertion on line 39 to fail, as shown in the pipeline logs.

The same typo appears on line 426 in getAvailableAgents tests.

🐛 Proposed fix
       const primary = [
         'architect', 'data-engineer', 'dev', 'qa',
         'pm', 'po', 'sm', 'analyst',
-        'ux-design-expert', 'devops', 'aios-master',
+        'ux-design-expert', 'devops', 'aiox-master',
       ];

Also fix line 426:

-      expect(agents).toContain('aios-master');
+      expect(agents).toContain('aiox-master');
🧰 Tools
🪛 GitHub Actions: CI

[error] 39-39: AssertionError: expect(received).toBeDefined() | Received: undefined

🪛 GitHub Actions: PR Automation

[error] 39-39: expect(dispatcher.skillMapping[agent]).toBeDefined() // primary agents in skill mapping

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/skill-dispatcher.test.js` around lines 32 - 41,
Replace the misspelled brand identifier 'aios-master' with the correct
'aiox-master' in the test's primary agents array inside the "has all primary
agents in skill mapping" test and the other occurrence in the getAvailableAgents
tests; update the string literal(s) so they match the source PRIMARY_AGENTS
value and ensure the assertions
(expect(dispatcher.skillMapping[agent]).toBeDefined()) and
getAvailableAgents-related checks reference 'aiox-master'.

Comment on lines +85 to +89
test('returns default persona for unknown agents', () => {
  expect(dispatcher.getAgentPersona('unknown')).toEqual({
    name: 'unknown', title: 'AIOS Agent',
  });
});

⚠️ Potential issue | 🔴 Critical

Fix default persona title: `AIOS Agent` → `AIOX Agent`.

The test expects title: 'AIOS Agent' but the source code at line 130 of skill-dispatcher.js returns title: 'AIOX Agent' for unknown agents. This mismatch causes the pipeline failure.

🐛 Proposed fix
     test('returns default persona for unknown agents', () => {
       expect(dispatcher.getAgentPersona('unknown')).toEqual({
-        name: 'unknown', title: 'AIOS Agent',
+        name: 'unknown', title: 'AIOX Agent',
       });
     });
🧰 Tools
🪛 GitHub Actions: CI

[error] 86-86: Expected object with title "AIOS Agent", but received { "name": "unknown", "title": "AIOX Agent" }

🪛 GitHub Actions: PR Automation

[error] 86-86: expect(dispatcher.getAgentPersona('unknown')).toEqual({ name: 'unknown', title: 'AIOX Agent' })

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/skill-dispatcher.test.js` around lines 85 - 89, The
test expectation for dispatcher.getAgentPersona is wrong: update the test in
tests/core/orchestration/skill-dispatcher.test.js so the default persona title
matches the implementation (change expected title from 'AIOS Agent' to 'AIOX
Agent') to align with skill-dispatcher.js's returned persona for unknown agents.

Story 9.7 of Epic 9 (Persistent Memory Layer).

The Strategy Optimizer enables data-driven strategy selection through
controlled A/B experiments:
- Create experiments with multiple strategy variants
- Weighted random variant assignment for tasks
- Record success/failure results with duration metrics
- Statistical significance testing (proportion z-test)
- Auto-conclude experiments and promote winners
- Maintain registry of best strategies per task type
- Persist experiments in .aiox/strategy-experiments.json

41 unit tests covering experiments, results, auto-conclusion,
variant assignment, persistence, and statistics.
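The proportion z-test named in the commit message can be sketched as follows; the function name and cutoff are illustrative, not the module's actual API:

```javascript
// Two-proportion z-test: is variant A's success rate significantly
// different from variant B's? For a two-sided test, |z| > 1.96
// corresponds to roughly 95% confidence.
function proportionZTest(successA, totalA, successB, totalB) {
  const pA = successA / totalA;
  const pB = successB / totalB;
  // Pooled proportion under the null hypothesis that both rates are equal.
  const pooled = (successA + successB) / (totalA + totalB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  return (pA - pB) / se;
}

// Example: 80/100 successes for variant A vs 60/100 for variant B.
const z = proportionZTest(80, 100, 60, 100); // ≈ 3.09, past the 1.96 cutoff
```

An optimizer can auto-conclude an experiment once |z| exceeds the threshold implied by its configured confidence level.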
- isValidAgent: guard against null/undefined (was crashing)
- techStack assertion: toBeDefined -> toEqual with exact shape
- timestamp assertion: verify it parses as a valid ISO date
- add null/undefined test cases for isValidAgent
- update AIOS: prefix to AIOX: across tests
@nikolasdehor force-pushed the feat/strategy-optimizer branch from 4ce28c4 to 81eddd5 on March 11, 2026 at 02:25
@codecov

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
tests/core/memory/strategy-optimizer.test.js (2)

8-11: Use absolute imports instead of relative imports.

The import uses a relative path (`../../../.aios-core/core/memory/strategy-optimizer`), which violates the coding guidelines.

Consider configuring module resolution or using absolute imports as per project standards.

♻️ Suggested fix
 const fs = require('fs');
 const path = require('path');
-const StrategyOptimizer = require('../../../.aios-core/core/memory/strategy-optimizer');
+const StrategyOptimizer = require('@aios-core/core/memory/strategy-optimizer');
 const { ExperimentStatus, Events, CONFIG } = StrategyOptimizer;

As per coding guidelines: "Use absolute imports instead of relative imports in all code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/memory/strategy-optimizer.test.js` around lines 8 - 11, The test
currently requires StrategyOptimizer via a relative path; replace the relative
require of '../../../.aios-core/core/memory/strategy-optimizer' with the
project's canonical absolute module import (the module name used elsewhere for
the core memory StrategyOptimizer) so the line uses an absolute import for
StrategyOptimizer (and its exports ExperimentStatus, Events, CONFIG); if module
resolution isn't configured, update the project's module resolution (e.g.,
Node/webpack/tsconfig paths) so the absolute import resolves correctly and run
tests to confirm imports succeed.
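One way to make such an absolute import resolve is a Jest moduleNameMapper entry; the `@aios-core` alias below is an assumed convention, not one the project necessarily defines:

```javascript
// jest.config.js — maps the assumed '@aios-core/*' alias onto the
// repository's .aios-core directory so tests can require it absolutely.
module.exports = {
  moduleNameMapper: {
    '^@aios-core/(.*)$': '<rootDir>/.aios-core/$1',
  },
};
```

With this mapping, the test's require becomes `require('@aios-core/core/memory/strategy-optimizer')` regardless of how deeply the test file is nested.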

220-231: Consider using the module's API to set concluded state.

The test directly mutates internal state (exp.status and exp.winner) to simulate a concluded experiment. While this works for testing the edge case, it couples the test to the internal implementation.

If the production code provides a method to manually conclude an experiment (or if cancelExperiment could be adapted), using that would be more robust. Otherwise, this approach is acceptable for testing this specific behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/memory/strategy-optimizer.test.js` around lines 220 - 231, The
test mutates internal fields exp.status and exp.winner directly; change it to
use the module API to mark the experiment concluded and set its winner (for
example call a provided method such as opt.concludeExperiment(exp.id, { winner:
'winner' }) or opt.updateExperimentStatus/metadata) instead of assigning
exp.status = ExperimentStatus.CONCLUDED and exp.winner = 'winner', then assert
opt.assignVariant(exp.id) returns 'winner'.
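For context, the assignVariant() behavior this test exercises can be sketched like so; the concluded-experiment short-circuit mirrors what the test asserts, while the weighted draw is an assumed illustration:

```javascript
// Illustrative status enum; real values may differ.
const ExperimentStatus = { RUNNING: 'running', CONCLUDED: 'concluded' };

// Concluded experiments always return their promoted winner; running
// experiments pick a variant by weighted random draw (weight 1 each
// when no explicit weights are set).
function assignVariant(exp, random = Math.random) {
  if (exp.status === ExperimentStatus.CONCLUDED) return exp.winner;
  const weights = exp.variants.map((v) => exp.weights?.[v] ?? 1);
  const total = weights.reduce((a, b) => a + b, 0);
  let roll = random() * total;
  for (let i = 0; i < exp.variants.length; i++) {
    roll -= weights[i];
    if (roll <= 0) return exp.variants[i];
  }
  // Guard against floating-point drift past the last bucket.
  return exp.variants[exp.variants.length - 1];
}
```

Injecting the random source, as the optional second parameter does here, is what makes the weighted path deterministic to test.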

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fa4f47b2-c0d3-46da-a6e6-9bab129164a3

📥 Commits

Reviewing files that changed from the base of the PR and between 4ce28c4 and 81eddd5.

📒 Files selected for processing (5)
  • .aios-core/core/memory/strategy-optimizer.js
  • .aiox-core/core/orchestration/skill-dispatcher.js
  • .aiox-core/install-manifest.yaml
  • tests/core/memory/strategy-optimizer.test.js
  • tests/core/orchestration/skill-dispatcher.test.js
✅ Files skipped from review due to trivial changes (1)
  • tests/core/orchestration/skill-dispatcher.test.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • .aiox-core/core/orchestration/skill-dispatcher.js
  • .aios-core/core/memory/strategy-optimizer.js
